5.2 Multiple Linear Regression

Multiple linear regression (MLR) is simply regression with more than one independent variable.
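In general, a model with \(k\) independent variables can be written as

\[y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + \varepsilon_i\]

where \(\varepsilon_i\) is the error term.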

The diagnostic tools are the same as before: significance testing, \(R^2\), the Durbin-Watson statistic, and residual diagnostics.

5.2.1 Multicollinearity

In MLR, however, one must also be aware of multicollinearity: a strong relationship between the independent variables, which raises the question of whether they are explaining the same variation in the dependent variable.

To assess multicollinearity one can apply the variance inflation factor (VIF), defined as:

\[\begin{equation} VIF_j=\frac{1}{1-R_j^2} \tag{5.7} \end{equation}\]

where \(j = 1, \dots, k\) indexes the independent variables.

Here \(R_j^2\) is obtained from regressing the \(j\)-th independent variable against the remaining independent variables. Common rules of thumb for the output are:

  • VIF = 1: no multicollinearity
  • VIF > 10: indicates severe multicollinearity

If one finds an indication of multicollinearity, a common remedy is to drop one of the correlated variables.
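As a minimal sketch of how (5.7) can be computed in practice (assuming Python with statsmodels and a small simulated data set, neither of which appears in the original text):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical design matrix: x2 is built as a near-copy of x1,
# so we expect high VIFs for both.
rng = np.random.default_rng(42)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)  # almost collinear with x1
x3 = rng.normal(size=100)
X = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# VIF_j = 1 / (1 - R_j^2), computed per column; the constant is added
# so each auxiliary regression includes an intercept.
exog = sm.add_constant(X)
for j, name in enumerate(X.columns, start=1):  # index 0 is the constant
    print(name, variance_inflation_factor(exog.values, j))
```

With x2 constructed this way, the VIFs for x1 and x2 should come out far above 10, while x3 should stay near 1.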

5.2.2 Serial correlation and omitted variables

When doing regression, an omitted variable can leave the independent variables correlated with the error term; the errors are then not randomly distributed, and this can show up as serial correlation in the error terms.

To test for serial correlation in the error terms, we can make use of the Durbin-Watson statistic, see (5.6).
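As an illustrative sketch (again assuming statsmodels; the AR(1) simulation below is hypothetical), the Durbin-Watson statistic can be computed from the fitted residuals:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Hypothetical data: y depends on x plus an AR(1) error term, so the
# residuals are serially correlated and DW should fall well below 2.
rng = np.random.default_rng(7)
n = 200
x = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.8 * e[t - 1] + rng.normal()  # AR(1) errors
y = 1.0 + 2.0 * x + e

res = sm.OLS(y, sm.add_constant(x)).fit()
print(durbin_watson(res.resid))  # values near 2 suggest no serial correlation
```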

5.2.3 Selection criteria

We can no longer use \(R^2\) for model selection, as it never decreases (and hence never penalizes) when variables are added. Instead, one should use:

  • AIC
  • BIC

The choice depends on whether one is interested in the best model for prediction (AIC) or in recovering the true model (BIC, which penalizes additional variables more heavily).
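A minimal sketch of such a comparison, assuming statsmodels and a simulated data set in which only x1 truly matters:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical data: y depends only on x1; x2 is pure noise.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = 1.0 + 2.0 * df["x1"] + rng.normal(size=n)

for cols in (["x1"], ["x1", "x2"]):
    res = sm.OLS(df["y"], sm.add_constant(df[cols])).fit()
    # R^2 never decreases with extra regressors; AIC/BIC penalize them.
    print(cols, round(res.rsquared, 4), round(res.aic, 2), round(res.bic, 2))
```

Adding the noise variable x2 leaves \(R^2\) weakly higher, while AIC and BIC typically increase, correctly favouring the smaller model (lower values are better for both criteria).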