7.1 ARIMA
Basically, ARIMA is a framework for including AR and/or MA terms in the regression model, where:
- AR: the time series is predicted from its own history. An AR(1) model can be written as: \(Y_t=\beta_0+\beta_1 Y_{t-1}+\epsilon_t\)
- MA: the model predicts based on current and past shocks (errors) to the series. An MA(q) model: \(Y_t=\mu+\epsilon_t-\omega_1\epsilon_{t-1}-\dots-\omega_q\epsilon_{t-q}\)
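To make the two building blocks concrete, here is a minimal simulation sketch in Python (the coefficient values are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
eps = rng.normal(size=n)  # white-noise shocks, epsilon_t

# AR(1): Y_t = beta_0 + beta_1 * Y_{t-1} + eps_t
beta0, beta1 = 0.5, 0.7
ar = np.zeros(n)
for t in range(1, n):
    ar[t] = beta0 + beta1 * ar[t - 1] + eps[t]

# MA(1): Y_t = mu + eps_t - omega_1 * eps_{t-1}
mu, omega1 = 0.5, 0.4
ma = np.zeros(n)
ma[0] = mu + eps[0]
for t in range(1, n):
    ma[t] = mu + eps[t] - omega1 * eps[t - 1]
```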
It is important to notice that including both components is not automatically an advantage; one should select the appropriate terms depending on how the correlogram looks.
ARIMA is particularly good for short-term forecasts, as it utilizes previous observations to forecast future values.
It is able to represent both stationary and nonstationary data.
Often one encounters difficulties making a correct model specification, meaning meeting the assumptions. Getting this right is crucial for producing reliable forecasts.
Key concepts
ARMA vs. ARIMA: ARMA is an approach for stationary data. If the data is nonstationary, one must first make it stationary by differencing, which leads to ARIMA. Hence:
1. ARMA: when we have stationarity
2. ARIMA: when we have non-stationarity
Random walk: this is merely an ARIMA(0,1,0), hence only differencing, with no AR and no MA terms.
Drift: a constant that can be added to the equation, making the model tend upwards or downwards. Imagine an ARIMA(0,1,0) with a drift: then you only have the most recent \(y_{t-1}\) plus the constant, which will push the forecast upwards if positive and vice versa.
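As a small sketch of why differencing helps (NumPy only, simulated data): first-differencing a random walk leaves nothing but the white-noise shocks, i.e., a stationary series:

```python
import numpy as np

rng = np.random.default_rng(0)
eps = rng.normal(size=500)

y = np.cumsum(eps)   # random walk: y_t = y_{t-1} + eps_t (nonstationary)
dy = np.diff(y)      # first difference: recovers eps_t (stationary)

print(np.allclose(dy, eps[1:]))  # True: differencing removed the unit root
```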
The notation for ARIMA is:
\[\begin{equation} \text{ARIMA}(p,d,q) \tag{7.1} \end{equation}\]
Where:
p = the order of AR
d = the order of differencing (integration) (NOTE: if this is 0, then the model is reduced to an ARMA model)
- if we only have differencing, then we call this a random walk
- this is the I in ARIMA, as it is also called the integration order
q = the order of MA; hence if q = 2, then the MA terms for \(t-1\) and \(t-2\) (the errors \(\epsilon_{t-1}\) and \(\epsilon_{t-2}\)) are included in the regression model.
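A minimal sketch of what fitting such a model can look like in Python with statsmodels (the simulated series and the order (1, 1, 1) are placeholder assumptions, not recommendations):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(1)
y = np.cumsum(rng.normal(size=300))  # placeholder nonstationary series

# order=(p, d, q): one AR lag, one difference, one MA lag
fit = ARIMA(y, order=(1, 1, 1)).fit()
print(fit.summary())
print(fit.forecast(steps=5))  # short-term forecast, as the text suggests
```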
7.1.1 Elaborating on AR models
AR = Autoregressive
AR models are appropriate for stationary data
This makes sense: if the data is non-stationary (i.e., the mean and/or variance change over time), the relationship between \(Y_t\) and its own lags is not stable, and the estimated coefficients cannot be trusted
This can be generalized with AR(p), where p = the order of AR, meaning how many prior periods of \(Y_t\) are included. Then how do we select an appropriate number of lags?
- The autocorrelation function (ACF): for an AR(p) process, the ACF should decline to zero exponentially fast
- The partial autocorrelation function (PACF): for an AR(p) process, the PACF cuts off (drops to zero) after lag p
See the full equation on page 360.
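For reference, the general AR(p) form (standard notation, consistent with the AR(1) equation above) is:
\[Y_t=\beta_0+\beta_1 Y_{t-1}+\beta_2 Y_{t-2}+\dots+\beta_p Y_{t-p}+\epsilon_t\]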
Process:
- Select the order p
- Calculate coefficients for each lagged period
- Forecast using the coefficients
- Continuously evaluate whether the coefficients are still applicable
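A hedged sketch of that process with statsmodels (the simulated series and the chosen order are illustrative assumptions):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Simulate an AR(2) process so the "true" order is known
rng = np.random.default_rng(2)
eps = rng.normal(size=500)
y = np.zeros(500)
for t in range(2, 500):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + eps[t]

# 1. Select the order p (here p=2, e.g. from where the PACF cuts off)
# 2. Calculate the coefficients for each lagged period
fit = ARIMA(y, order=(2, 0, 0)).fit()
print(fit.params)             # estimated constant and AR coefficients

# 3. Forecast using the coefficients
print(fit.forecast(steps=3))

# 4. Re-evaluate as new data arrives, e.g. by refitting on an extended sample
```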
Assumptions
- Data is stationary. If not, one must deal with that (e.g., by differencing)
7.1.2 Elaborating on MA models
MA = Moving average
IMPORTANT NOTE: this has nothing to do with a regular moving average (i.e., smoothing over past observed values)
This approach uses residuals (the differences between actual values and fitted/forecasted values), each multiplied by a coefficient, to forecast the coming period. As with AR, we are able to include any number of previous periods. Thus, we are able to describe MA with:
\[\begin{equation} Y_t=\mu + \epsilon_t - \omega_1 \epsilon_{t-1} - \omega_2 \epsilon_{t-2} - \dots - \omega_q \epsilon_{t-q} \tag{7.2} \end{equation}\]
where:
- \(Y_t\) = the forecast for time period t.
- \(\mu\) = a constant that is applied in the calculation (the mean of the series)
- \(\epsilon_t\) = the error term as in any other regression
- \(\omega\) = the coefficients that we are to calculate for each period
- NOTICE: these can be interpreted as weights put on each period, but they do NOT need to sum to 1; they can be above or below
- \(\epsilon_{t-1},\dots,\epsilon_{t-q}\) = the error (residual) for each lagged period
Process:
- Select the order q
- Calculate coefficients for each lagged period
- Forecast using the coefficients
- Continuously evaluate whether the coefficients are still applicable
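A minimal sketch of that process for an MA(1) model with statsmodels (simulated placeholder data; note that statsmodels writes the MA part with plus signs, so its reported coefficient is the negative of \(\omega_1\) in equation (7.2)):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(3)
eps = rng.normal(size=500)
mu, omega1 = 2.0, 0.5
# MA(1): Y_t = mu + eps_t - omega1 * eps_{t-1}
y = mu + eps - omega1 * np.concatenate(([0.0], eps[:-1]))

fit = ARIMA(y, order=(0, 0, 1)).fit()  # q = 1, no AR, no differencing
print(fit.params)  # constant near mu; ma.L1 near -omega1 (sign convention)

# An MA(q) forecast only uses the last q residuals, so beyond
# q steps ahead it reverts to the estimated mean mu.
print(fit.forecast(steps=4))
```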
Then how do we select an appropriate number of lags to be included?
- The autocorrelation function (ACF): for an MA(q) process, the ACF cuts off (drops to zero) after lag q
- The partial autocorrelation function (PACF): for an MA(q) process, the PACF declines to zero gradually
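A minimal correlogram sketch with statsmodels (the MA(1) series is a simulated placeholder):

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

rng = np.random.default_rng(4)
eps = rng.normal(size=500)
y = eps - 0.6 * np.concatenate(([0.0], eps[:-1]))  # placeholder MA(1) series

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(y, lags=20, ax=axes[0])    # MA(1): ACF cuts off after lag 1
plot_pacf(y, lags=20, ax=axes[1])   # MA(1): PACF decays gradually
plt.show()
```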
7.1.3 Elaborating on Integration models
This corresponds to the I in ARIMA.
If we have a model that only contains I and neither AR nor MA, then we have a random walk, meaning that we are left with:
\[\begin{equation} y_t=y_{t-1}+\epsilon_t \tag{7.3} \end{equation}\]
We see that \(y_t\) is defined by the previous observation \(y_{t-1}\) plus some randomness, captured by the error term \(\epsilon_t\) for period t.
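A one-line simulation of equation (7.3) (NumPy, simulated shocks):

```python
import numpy as np

rng = np.random.default_rng(5)
eps = rng.normal(size=300)
y = np.cumsum(eps)  # y_t = y_{t-1} + eps_t, starting from y_0 = eps_0
```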
Naturally we are able to have a drift and a trend in the random walk, which would generate the following:
RW with a drift:
\[\begin{equation} y_t=\beta_0+y_{t-1}+\epsilon_t \tag{7.4} \end{equation}\]
Hence we see that the drift is added as a constant \(\beta_0\), which can be compared to the intercept in a normal regression. If \(y_{t-1}\) is 0 and the error is 0, you are still left with the constant, which will always be there.
And the RW with a drift and a trend:
\[\begin{equation} y_t=\beta_0+\beta_1 t+y_{t-1}+\epsilon_t \tag{7.5} \end{equation}\]
Hence we add a coefficient \(\beta_1\) that is multiplied by the time period t.
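Both variants can be simulated the same way (a sketch; the coefficient values \(\beta_0=0.2\) and \(\beta_1=0.01\) are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 300
eps = rng.normal(size=n)
t = np.arange(1, n + 1)
beta0, beta1 = 0.2, 0.01

# RW with drift (7.4): y_t = beta0 + y_{t-1} + eps_t
rw_drift = np.cumsum(beta0 + eps)

# RW with drift and trend (7.5): y_t = beta0 + beta1*t + y_{t-1} + eps_t
rw_drift_trend = np.cumsum(beta0 + beta1 * t + eps)
```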