Friday, September 19, 2014

Least Squares, Perfect Multicollinearity, & Estimable Functions

This post is essentially an extension of another recent post on this blog. I'll assume that you've read that post, where I discussed the problem of solving linear equations of the form Ax = y, when the matrix A is singular.

Let's look at how this problem might arise in the context of estimating the coefficients of a linear regression model, y = Xβ + ε. In the previous post, I said:
"Least squares estimation leads to the so-called "normal equations":
                                 
                         X'Xb = X'y  .                                                                (1)

If the regressor matrix, X, has k columns, then (1) is a set of k linear equations in the k unknown elements of β. You'll recall that if X has full column rank, k, then (X'X) also has full rank, k, and so (X'X)-1 is well-defined. We then pre-multiply each side of (1) by (X'X)-1, yielding the familiar least squares estimator for β, namely b = (X'X)-1X'y.
So, as long as we don't have "perfect multicollinearity" among the regressors (the columns of X), we can solve (1), and the least squares estimator is defined. More specifically, a unique estimator for each individual element of β is defined.
What if there is perfect multicollinearity, so that the rank of X, and of (X'X), is less than k? In that case, we can't compute (X'X)-1, we can't solve the normal equations in the usual way, and we can't get a unique estimator for the (full) β vector."
I promised that I'd come back to the statement, "we can't get a unique estimator for the (full) β vector". Now's the time to do that.