Editor’s note: Aric is a speaker for ODSC West 2020 this October. Check out his talk, “The Bayesians are Coming! The Bayesians are Coming, to Time Series,” there!
Forecasting has applications across all industries. Whether predicting future sales for a product line, energy usage for a power company, or the volatility of a portfolio of assets to hedge against risk, forecasting provides needed answers to decision makers.
Popular approaches to time series forecasting include exponential smoothing models (ESMs) and the combinations and variations of autoregressive and moving average models known as ARIMA models. Both families are typically estimated with frequentist rather than Bayesian statistical methodology. Theoretical arguments aside, practitioners benefit from knowing both frequentist and Bayesian time series modeling approaches. The more techniques a practitioner has, the better their chance of providing the best solution to the decision maker using the forecasts, which is the true end goal.
Let’s briefly compare these two branches of statistical time series modeling through an example where we try to forecast the percentage change in quarterly United States personal consumption expenditure (PCE) – essentially, how household spending on consumer goods and services changes over time. The chart on the left displays the quarterly US PCE series from Q1 1970 through Q3 2016. The last seven observations – Q1 2015 through Q3 2016 – were removed and used as a hold-out sample.
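Before fitting anything, the series needs to be split into an estimation sample and the seven-quarter hold-out. Here is a minimal sketch of that split, assuming the quarterly US consumption change series that ships with the fpp2 package (uschange) as a stand-in for the PCE data; the object names train and test are illustrative and reused in the code below.

library(fpp2)

pce   <- uschange[, "Consumption"]        # quarterly % change in US consumption, 1970 Q1 - 2016 Q3
train <- window(pce, end = c(2014, 4))    # estimation sample
test  <- window(pce, start = c(2015, 1))  # hold-out: the last seven quarters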
At its heart, time series analysis tries to break a series of data down into two primary components – signal and noise. We extract the signal from the data and project it into the future, while using the noise to estimate the variation around that signal. Specifically, in time series we rely on the assumption that observations at a certain point in time depend on previous observations. But how much emphasis do you put on more recent observations compared to ones further in the past?
Exponential smoothing models as well as ARIMA class models try to answer this question. The simple (or single) ESM applies a weighting scheme to the observations that decreases exponentially the further back in time we go:

Ŷ_{T+1} = θY_T + θ(1 − θ)Y_{T−1} + θ(1 − θ)²Y_{T−2} + θ(1 − θ)³Y_{T−3} + ⋯

where θ is bounded between 0 and 1. The larger the value of θ, the more the most recent observation is emphasized, as seen in the chart on the right. The exponentially decreasing weights above simplify to the following equation:

Ŷ_{T+1} = θY_T + (1 − θ)Ŷ_T
These models essentially optimize themselves to forecast one time period into the future.
The following R code computes the simple ESM (forecast package in R needed):
mod_esm <- ses(train, initial = "optimal", h = 7)
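To see what the model actually produces, the forecast object returned by ses() can be plotted and scored against the hold-out sample; a quick sketch, assuming the train/test split above:

autoplot(mod_esm)        # series plus the 7-step-ahead forecast
accuracy(mod_esm, test)  # error measures on the seven held-out quarters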
The forecasts from a simple ESM are rather boring: a horizontal line at the one-step-ahead forecast, as we can see below.

Autoregressive (AR) models take a different approach, predicting the current value of the series from a linear combination of its previous values. The following R code fits an AR model with three lags (forecast package in R needed):
mod_ar <- Arima(train, order = c(3,0,0))
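Unlike ses(), the Arima() call above only fits the model, so the seven-quarter forecast discussed next would come from a separate forecast() call; a short sketch under the same assumptions:

fc_ar <- forecast(mod_ar, h = 7)  # 7-step-ahead forecast from the AR(3) fit
autoplot(fc_ar)
accuracy(fc_ar, test)             # error measures on the hold-out sample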
The forecast by the AR(3) model suggests a slight downward projection for future values of PCE as compared to the flat ESM forecast.

We are going to fit a Bayesian AR(3) model, since the AR(3) structure seemed to fit our data the best according to model selection techniques. In Bayesian time series analysis, we estimate the final forecasts through Markov chain Monte Carlo (MCMC) techniques. The general idea, without getting into too much math, is that we simulate stepping through a series of probabilistic events connected to one another. If we draw enough samples from this series of events, we eventually get a distribution that resembles the distribution of forecasted future values. In other words, for each future time point we actually have a distribution of possible values! We take the mean of these distributions at each future time point to build our forecast.
The following R code fits the Bayesian AR model with three lags (bsts package in R needed):
ss <- AddAutoAr(list(), train, lags = 3, sdy = 1.5)              # AR(3) state component
mod_bar <- bsts(train, state.specification = ss, niter = 20000)  # 20,000 MCMC draws
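To turn the 20,000 MCMC draws into a single forecast path, the posterior predictive distribution at each future quarter can be summarized by its mean, as described above. A sketch of how that might look, with the 10% burn-in being an assumption rather than something prescribed here:

burn <- SuggestBurn(0.1, mod_bar)                       # discard the first 10% of draws as burn-in
pred_bar <- predict(mod_bar, horizon = 7, burn = burn)  # posterior predictive draws for 7 quarters
pred_bar$mean                                           # mean of the forecast distribution at each quarter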
The Bayesian AR(3) model seems to have the closest fit yet of the models we have used to forecast consumption!
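One way to back up that impression is to score all three forecasts against the seven held-out quarters, for example with mean absolute error; a sketch reusing the objects created in the snippets above:

mae <- function(actual, fc) mean(abs(as.numeric(actual) - as.numeric(fc)))
c(ESM = mae(test, mod_esm$mean),
  AR3 = mae(test, fc_ar$mean),
  BayesAR3 = mae(test, pred_bar$mean))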


A Teaching Associate Professor in the Institute for Advanced Analytics, Dr. Aric LaBarr is passionate about helping people solve challenges with their data. There, at the nation’s first Master of Science in Analytics degree program, he helps design an innovative curriculum that prepares a modern workforce to wisely communicate and handle a data-driven future. He teaches courses in predictive modeling, forecasting, simulation, financial analytics, and risk management.
Previously, he was Director and Senior Scientist at Elder Research, where he mentored and led a team of data scientists and software engineers. As director of the Raleigh, NC office he worked closely with clients and partners to solve problems in the fields of banking, consumer product goods, healthcare, and government.
Dr. LaBarr holds a B.S. in economics, as well as a B.S., M.S., and Ph.D. in statistics — all from NC State University.
