12.1 Forecasting Methods
Naive forecasts
The naive forecast simply assumes that the current period is the best predictor of the future, hence it can be written as:
\[\begin{equation} \hat{Y}_{t+1}=Y_t \tag{12.1} \end{equation}\]
where \(Y_t\) = the value in the last period, and \(\hat{Y}_{t+1}\) = the forecast for the following period.
Therefore, the error can simply be written as \(e_{t+1}=Y_{t+1}-\hat{Y}_{t+1}\), i.e. the actual value compared with the forecasted value.
One can make several variations to account for trend, growth rate, or seasonality, as sketched in the R code after this list:
- \(\hat{Y}_{t+1}=Y_t+(Y_t-Y_{t-1})\), to account for trending (non-stationary) data.
- \(\hat{Y}_{t+1}=Y_t * \frac{Y_t}{Y_{t-1}}\), to account for the growth rate; notice that it only assesses the growth rate relative to the prior period.
- \(\hat{Y}_{t+1}=Y_{t-3}+\frac{Y_t-Y_{t-4}}{4}\), to account for quarterly trending data. The period can naturally be changed by adjusting the formula, e.g. to 12 for monthly data, but notice that this just replicates previous periods, hence also previous seasons.
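A minimal R sketch of these variants, applied to a hypothetical quarterly series `y` (the values are made up for illustration):

```r
# Hypothetical quarterly series, for illustration only
y <- ts(c(102, 110, 98, 120, 108, 117, 104, 128), frequency = 4)
n <- length(y)

# Simple naive forecast: the last observation, equation (12.1)
yhat_naive <- y[n]

# Naive with trend: last observation plus the last change
yhat_trend <- y[n] + (y[n] - y[n - 1])

# Naive with growth rate: last observation times the last growth factor
yhat_growth <- y[n] * (y[n] / y[n - 1])

# Seasonal naive with trend adjustment for quarterly data
yhat_seasonal <- y[n - 3] + (y[n] - y[n - 4]) / 4

c(naive = yhat_naive, trend = yhat_trend,
  growth = yhat_growth, seasonal = yhat_seasonal)
```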
12.1.1 Using Averages
We have the following:
- Simple averages, which merely take the average of all observations.
- Moving averages, which only account for a given time frame. This can be extended by,
- Double moving averages, often used when one needs the centre value of a window with an even number of periods, where no single middle observation exists; one can then extend the MA with a double MA.
12.1.1.1 Simple Averages
One may assume that it is sufficient to apply the average of all observations to predict the next period, hence we can say:
\[\begin{equation} \hat{Y}_{t+1}=\frac{1}{t}\sum^t_{i=1}Y_i \tag{12.2} \end{equation}\]
This is appropriate if the data has shown historical stability, i.e. without seasonality, trend, etc.
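In R this is simply the mean of all observations so far (the series is made up for illustration):

```r
# Hypothetical series, for illustration only
y <- c(102, 110, 98, 120, 108, 117, 104, 128)

# Forecast for the next period: the average of everything seen so far,
# equation (12.2)
yhat <- mean(y)
yhat
```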
12.1.1.2 Moving Average (MA)
One may instead apply a moving average accounting for the last k periods; this could be extended further by adding weights. For practical purposes only the k-period MA is shown:
\[\begin{equation} \hat{Y}_{t+1}=\frac{Y_t+Y_{t-1}+...+Y_{t-k+1}}{k} \tag{12.3} \end{equation}\]
This may be applied to remove a seasonal effect, with k = 4 for quarterly or k = 12 for monthly data.
See an example in section 4.4.1.1.2, where a loop is used to forecast. See an in-sample fit in section 4.4.1.1.1 (but notice that the data should be stationary; the data used in that example is not, so it should have been differenced or detrended).
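A minimal R sketch of the k-period moving average forecast on a made-up series; `stats::filter()` gives the whole trailing-MA series:

```r
# Hypothetical series, for illustration only
y <- c(102, 110, 98, 120, 108, 117, 104, 128, 115, 123, 109, 131)
k <- 4  # window length: 4 for quarterly, 12 for monthly data

# Forecast for the next period: average of the last k observations,
# equation (12.3)
yhat <- mean(tail(y, k))
yhat

# The full series of trailing moving averages
# (sides = 1 uses only current and past values in each window)
ma_series <- stats::filter(y, rep(1 / k, k), sides = 1)
ma_series
```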
12.1.1.3 Double Moving Average
This is simply applying the moving average twice, hence it is an extension of equation (12.3).
As mentioned, it is often used when one wants a centred value over an even number of periods, e.g. 12 months, where no single middle observation exists; hence the double MA can be applied.
\[\begin{equation} M_t=\hat{Y}_{t+1}=\frac{Y_t+Y_{t-1}+...+Y_{t-k+1}}{k} \tag{12.4} \end{equation}\]
\[\begin{equation} M'_t=\frac{M_t+M_{t-1}+...+M_{t-k+1}}{k} \tag{12.5} \end{equation}\]
\[\begin{equation} a_t=M_t+\left(M_t-M'_t\right)=2M_t-M'_t \end{equation}\]
\[\begin{equation} b_t=\frac{2}{k-1}\left(M_t-M'_t\right) \end{equation}\]
Hence the forecast p periods ahead is:
\[\begin{equation} \hat{Y}_{t+p}=a_t+b_t*p \tag{12.6} \end{equation}\]
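A minimal R sketch of the double moving average computations above, on a made-up trending series:

```r
# Hypothetical trending series, for illustration only
y <- c(10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27)
k <- 4  # window length

# First moving average M_t, equation (12.4)
M <- stats::filter(y, rep(1 / k, k), sides = 1)
# Second moving average M'_t of the first, equation (12.5)
M2 <- stats::filter(M, rep(1 / k, k), sides = 1)

t_end <- length(y)
a <- 2 * M[t_end] - M2[t_end]              # level: a_t = 2M_t - M'_t
b <- 2 / (k - 1) * (M[t_end] - M2[t_end])  # slope: b_t
p <- 1:3                                   # forecast horizons
yhat <- a + b * p                          # equation (12.6)
yhat
```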
12.1.2 Linear regressions
- Linear regression with a trend: that is an ordinary linear regression where the trend is added as a counter variable, which accounts for the trend, given it is linear.
- Linear regression with seasonal dummies and a trend: that is making dummy variables for each period, using `seasonaldummy()`, so that a variable accounts for each period. Then one can add a trend variable, e.g. a linear counter `seq(1, n, 1)`, when the trend is linear. See section 6.5.5.
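A minimal sketch of such a regression, using `seasonaldummy()` from the forecast package; the simulated quarterly series is an assumption for illustration:

```r
library(forecast)

# Simulated quarterly series with a linear trend and a seasonal pattern,
# for illustration only
set.seed(1)
y <- ts(10 + 0.5 * (1:24) + rep(c(2, -1, -2, 1), 6) + rnorm(24),
        frequency = 4)

trend <- seq(1, length(y), 1)  # linear counter capturing the trend
season <- seasonaldummy(y)     # one dummy column per season (minus one)

fit <- lm(y ~ trend + season)
summary(fit)
```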
12.1.3 Non-linear regressions
- Non-linear regression with a trend
- Causal regression
12.1.4 Smoothing methods
12.1.4.1 Exponential smoothing
This is an exponentially weighted moving average of all historical values, meaning that the most recent value is assigned the most weight. We merely add different weights to past periods; there is no specific way to adjust for trend and seasonality, which is a limitation of exponential smoothing. It can be written as:
\[\begin{equation} \hat{Y}_{t+1}=\alpha Y_t+\left(1-\alpha\right)\hat{Y}_t \end{equation}\] thus: \[\begin{equation} =\hat{Y}_t+\alpha(Y_t-\hat{Y}_t) \end{equation}\]
where \(\alpha\) = the smoothing constant, which lies between 0 and 1. The higher the alpha, the larger the weight given to the most recent observation.
Then how does one choose the smoothing parameter \(\alpha\)?
- For stable, smooth predictions, choose a low alpha.
- For predictions that react quickly to recent changes, choose a high alpha.
- Test different alpha values and compare the models using the performance measures in section 12.2; see the sketch below.
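A minimal sketch in R, using `HoltWinters()` with trend and seasonality switched off; the series and the fixed alpha value are assumptions for illustration:

```r
# Hypothetical series, for illustration only
y <- ts(c(102, 110, 98, 120, 108, 117, 104, 128, 115, 123))

# Simple exponential smoothing: trend (beta) and seasonality (gamma) off
fit_fixed <- HoltWinters(y, alpha = 0.3, beta = FALSE, gamma = FALSE)

# Leave alpha unspecified and R estimates it by minimising
# the squared one-step-ahead errors
fit_est <- HoltWinters(y, beta = FALSE, gamma = FALSE)
fit_est$alpha

# One-step-ahead forecast
predict(fit_fixed, n.ahead = 1)
```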
12.1.4.2 Holt’s exponential smoothing
An exponential smoothing method with adjustment for trend; we therefore introduce a new tuning parameter, so we have \(\alpha\) and \(\beta\):
- \(\alpha\) = weight given to the most recent observations
- \(\beta\) = adjustment for the trend. R's `HoltWinters()` estimates this automatically unless it is set manually or switched off with beta = FALSE.
Hence the smoothing now consists of two elements:
- The level estimate
\[\begin{equation} L_t=\alpha Y_t+\left(1-\alpha\right)\left(L_{t-1}+T_{t-1}\right) \tag{12.7} \end{equation}\]
- The trend estimate
\[\begin{equation} T_t=\beta\left(L_t-L_{t-1}\right)+\left(1-\beta\right)T_{t-1} \tag{12.8} \end{equation}\]
Thus, the forecast p periods into the future is:
\[\begin{equation} \hat{Y}_{t+p}=L_t+pT_t \tag{12.9} \end{equation}\]
To apply: use `HoltWinters()` and select the parameters that minimise the performance measures.
When one assigns large weights, the model becomes more sensitive to changes in the observed data.
One can either set the initial value to 0 or to the average of the first few observations.
If \(\alpha = \beta\), we have Brown's double exponential smoothing model.
If \(\beta = 0\), we are back to simple exponential smoothing.
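A minimal sketch of Holt's method in R on a made-up trending series:

```r
# Hypothetical trending series, for illustration only
y <- ts(c(10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27))

# Holt's method: trend smoothing on, seasonality (gamma) off
fit <- HoltWinters(y, gamma = FALSE)
fit$alpha  # weight on the most recent observation
fit$beta   # trend adjustment parameter

# Forecast p = 4 periods ahead, equation (12.9)
predict(fit, n.ahead = 4)
```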
12.1.4.3 Winters’ exponential smoothing
An exponential smoothing method with adjustment for trend and seasonality; we introduce one further tuning parameter, so we have \(\alpha\), \(\beta\) (as in Holt's) and \(\gamma\).
Hence the smoothing now consists of three elements:
- The level estimate
\[\begin{equation} L_t=\alpha\frac{Y_t}{S_{t-s}}+\left(1-\alpha\right)\left(L_{t-1}+T_{t-1}\right) \tag{12.10} \end{equation}\]
- The trend estimate
\[\begin{equation} T_t=\beta\left(L_t-L_{t-1}\right)+\left(1-\beta\right)T_{t-1} \tag{12.11} \end{equation}\]
- The seasonality estimate
\[\begin{equation} S_t=\gamma\frac{Y_t}{L_t}+\left(1-\gamma\right)S_{t-s} \tag{12.12} \end{equation}\]
Thus, the forecast p periods into the future is:
\[\begin{equation} \hat{Y}_{t+p}=\left(L_t+pT_t\right)S_{t-s+p} \tag{12.13} \end{equation}\]
To apply: use `HoltWinters()` and select the parameters that minimise the performance measures.
When one assigns large weights, the model becomes more sensitive to changes in the observed data.
One can either set the initial value to 0 or to the average of the first few observations.
If \(\beta = \gamma = 0\), the model reduces to simple exponential smoothing.
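A minimal sketch of Winters' method in R on a simulated quarterly series; the multiplicative seasonal form matches equations (12.10) to (12.13):

```r
# Simulated quarterly series with trend and seasonality, for illustration
set.seed(1)
y <- ts(10 + 0.5 * (1:24) + rep(c(2, -1, -2, 1), 6) + rnorm(24),
        frequency = 4)

# Winters' smoothing with a multiplicative season,
# matching equations (12.10)-(12.13)
fit <- HoltWinters(y, seasonal = "multiplicative")
fit$alpha; fit$beta; fit$gamma

# Forecast one full season (p = 4) ahead, equation (12.13)
predict(fit, n.ahead = 4)
```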
12.1.4.4 Moving Averages, see section 12.1.1.2
12.1.5 ARIMA
The model name decomposes as follows (see the R sketch after this list):
- AR: autoregressive, giving coefficients to lagged values of the series.
- MA: moving average, giving coefficients to lagged residuals (past forecast errors).
- ARMA: combination of AR and MA in a stationary setting.
- ARIMA: combination of AR and MA in a nonstationary setting, where the I (integrated) part differences the series to make it stationary.
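A minimal sketch fitting an ARIMA model with `arima()` on a simulated random walk; the (1,1,1) order is chosen for illustration, not identified from the data:

```r
# Simulated nonstationary series: a random walk with drift
set.seed(1)
y <- ts(cumsum(rnorm(100, mean = 0.5)))

# ARIMA(1,1,1): one AR lag, one difference (the "I"), one MA lag
fit <- arima(y, order = c(1, 1, 1))
fit

# Forecast 5 periods ahead
predict(fit, n.ahead = 5)$pred
```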
12.1.6 Dynamic forecasting
That is when variables are assumed to affect each other, hence a combination of autoregression (using lags of the y series) and one or more other variables in a regression setting.
We have ADL and VAR.
12.1.6.1 ADL
This is an autoregressive model (AR) extended with lags of another variable, yielding the ADL(p, q) model with p lags of y and q lags of x:
\[\begin{equation} Y_t=\beta_0+\sum_{i=1}^{p}\beta_iY_{t-i}+\sum_{j=1}^{q}\delta_jX_{t-j}+\varepsilon_t \tag{12.14} \end{equation}\]
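A minimal sketch of an ADL(1,1) model using the dynlm package; the package choice and the simulated data are assumptions for illustration:

```r
library(dynlm)

# Simulated series where lagged x helps explain y, for illustration only
set.seed(1)
x <- ts(rnorm(100))
y <- ts(stats::filter(0.5 * x + rnorm(100), 0.4, method = "recursive"))

# ADL(1,1): one lag of y and one lag of x, as in equation (12.14)
fit <- dynlm(y ~ L(y, 1) + L(x, 1))
summary(fit)
```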
12.1.6.2 VAR
One is able to construct several linear models and see how the variables influence each other. Therefore, it is a good approach for assessing exogeneity and endogeneity.
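A minimal sketch using the vars package on two simulated series; the lag order and the data-generating process are assumptions for illustration:

```r
library(vars)

# Two simulated stationary series where y1 feeds into y2, for illustration
set.seed(1)
n <- 100
y1 <- y2 <- numeric(n)
for (t in 2:n) {
  y1[t] <- 0.5 * y1[t - 1] + rnorm(1)
  y2[t] <- 0.3 * y2[t - 1] + 0.4 * y1[t - 1] + rnorm(1)
}

# VAR(2): each variable regressed on two lags of both variables
fit <- VAR(cbind(y1, y2), p = 2, type = "const")
summary(fit)

# Granger-causality test: does y1 help predict y2?
causality(fit, cause = "y1")
```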
12.1.7 Decomposition
This is decomposing the data into its components and then assembling \(\hat{y}\) as a combination of e.g. trend and seasonality. Notice that one must be aware of whether y is the product (multiplicative) or the sum (additive) of the components.
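As a minimal sketch, R's built-in `decompose()` splits a series into trend, seasonal and random components; the built-in AirPassengers data is commonly treated as multiplicative:

```r
# AirPassengers: built-in monthly series with trend and seasonality.
# type = "multiplicative" assumes y is the PRODUCT of its components.
dec <- decompose(AirPassengers, type = "multiplicative")
plot(dec)

# Reassemble fitted values as the product of trend and seasonal components
yhat <- dec$trend * dec$seasonal
```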