5.2 Multiple Linear Regression
Multiple linear regression (MLR) is simply regression with more than one independent variable.
The diagnostic tools are the same as before: significance testing, \(R^2\), the Durbin-Watson statistic, and residual diagnostics.
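As a concrete illustration, here is a minimal sketch of fitting an MLR model in Python with statsmodels; the data are simulated and the coefficient values are purely illustrative. The summary output reports the coefficient t-tests, \(R^2\), and the Durbin-Watson statistic mentioned above.

```python
# Minimal sketch: fitting an MLR model with statsmodels on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 200
X = rng.normal(size=(n, 2))                        # two independent variables
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)

X = sm.add_constant(X)                             # add the intercept column
results = sm.OLS(y, X).fit()

print(results.summary())                           # t-tests, R^2, Durbin-Watson, etc.
```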
5.2.1 Multicollinearity
In MLR one must also be aware of multicollinearity: a strong relationship between the independent variables, which suggests they are explaining the same variation.
To assess multicollinearity one can compute the variance inflation factor (VIF):
\[\begin{equation} VIF_j=\frac{1}{1-R_j^2} \tag{5.7} \end{equation}\]
where \(j = 1,\dots,k\) indexes the independent variables and \(R_j^2\) is obtained by regressing the \(j\)th independent variable on the remaining independent variables. Common rules of thumb for the output:
- VIF = 1: no multicollinearity
- VIF > 10: indicates serious multicollinearity
If multicollinearity is indicated, one common remedy is to drop one of the correlated variables.
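A minimal sketch of computing (5.7) in Python with statsmodels, assuming a simulated design in which the two independent variables are deliberately made near-collinear:

```python
# Minimal sketch: one VIF per independent variable via statsmodels.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=200)         # strongly correlated with x1
X = np.column_stack([x1, x2])

exog = sm.add_constant(X)                          # design matrix with intercept
for j in range(1, exog.shape[1]):                  # skip column 0 (the intercept)
    print(f"VIF_{j} = {variance_inflation_factor(exog, j):.1f}")
    # both VIFs come out far above 10 here, flagging multicollinearity
```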
5.2.2 Serial correlation and omitted variables
When running a regression we may find that the independent variables are correlated with the error term, for example because a relevant variable has been omitted. The errors are then not randomly distributed, and this often shows up as serial correlation in the error terms.
To test for serial correlation in the error terms we can use the Durbin-Watson statistic, see (5.6).
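A minimal sketch of the Durbin-Watson check in Python with statsmodels, again on simulated data:

```python
# Minimal sketch: Durbin-Watson statistic on the residuals of a fitted model.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(size=200)

results = sm.OLS(y, sm.add_constant(x)).fit()
dw = durbin_watson(results.resid)
print(dw)   # near 2: no serial correlation; toward 0 (4): positive (negative)
```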
5.2.3 Selection criteria
We can no longer rely on \(R^2\), as it never decreases when variables are added and therefore does not penalize model complexity. Instead one should use one of the following (a comparison is sketched after this list):
- AIC
- BIC
The choice depends on whether one is interested in the best model for prediction (AIC) or in identifying the true model (BIC).
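A minimal sketch of an AIC/BIC comparison in Python with statsmodels, using simulated data in which x2 is an irrelevant variable; lower values are better for both criteria.

```python
# Minimal sketch: comparing nested OLS models by AIC and BIC.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                            # irrelevant variable
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

m1 = sm.OLS(y, sm.add_constant(x1)).fit()
m2 = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

print(m1.aic, m1.bic)   # the simpler model typically has lower AIC/BIC here...
print(m2.aic, m2.bic)   # ...even though m2's R^2 is at least as high as m1's
```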