Let A be an n × m matrix with n > m that has linearly independent columns.

Consider the equation Ax = b, where b is *not* in the column space of A. Then Ax = b has no exact solution. Instead, we can aim to make the error b - Ax as small as possible.

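The vector b can be decomposed uniquely as b = p + e, where p is in the column space of A and e is in the nullspace of Aᵀ. Since Aᵀe = 0, e is orthogonal to every column of A, so p is the orthogonal projection of b onto the column space.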
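The decomposition can be checked numerically. The following is a minimal sketch, assuming NumPy and a small illustrative A and b that are not from the text. It obtains p through the normal equations AᵀA x̂ = Aᵀb, a standard route that these notes do not state explicitly; AᵀA is invertible here because the columns of A are linearly independent.

```python
import numpy as np

# Illustrative data (an assumption, not from the text):
# a 3 x 2 matrix with linearly independent columns, and a b outside its column space.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Solve the normal equations (A^T A) x_hat = A^T b.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

p = A @ x_hat   # component of b in the column space of A
e = b - p       # remaining component of b

print(p)        # approximately [ 5.  2. -1.]
print(e)        # approximately [ 1. -2.  1.]
print(A.T @ e)  # approximately [0. 0.], so e lies in the nullspace of A^T
```

Here p is a combination of the columns of A by construction, and the last line confirms that e is orthogonal to both columns.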

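Now we can approximate the "solution" to Ax = b by solving Ax = p, which has exactly one solution because p lies in the column space and the columns of A are independent. In fact, this solution minimizes the squared error ||b - Ax||²: for any x, the vector p - Ax lies in the column space and is therefore orthogonal to e, so ||b - Ax||² = ||p - Ax||² + ||e||², which is smallest when Ax = p.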
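The minimizing property can also be illustrated with a small experiment. This is a sketch under assumptions, not a proof: the matrix, the vector, and the helper `squared_error` are hypothetical choices for illustration, and `np.linalg.lstsq` is used as an off-the-shelf least-squares solver. It checks that the solver's answer satisfies Ax = p and that moving away from it by any step d raises the squared error by ||Ad||².

```python
import numpy as np

# Illustrative data (an assumption, not from the text).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Least-squares solution from NumPy's built-in solver.
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)

p = A @ x_hat   # A x_hat = p holds by construction
e = b - p       # should lie in the nullspace of A^T

def squared_error(x):
    """Return ||b - A x||^2 (hypothetical helper for this sketch)."""
    r = b - A @ x
    return r @ r

best = squared_error(x_hat)   # equals ||e||^2

# Perturb x_hat by random steps d and record the increase in squared error.
rng = np.random.default_rng(0)
steps = [rng.normal(size=2) for _ in range(5)]
increases = [squared_error(x_hat + d) - best for d in steps]
predicted = [(A @ d) @ (A @ d) for d in steps]   # ||A d||^2

print(A.T @ e)                            # approximately [0. 0.]
print(np.allclose(increases, predicted))  # True
```

Each increase matches ||Ad||², which is positive for any nonzero d because the columns of A are independent, so no other x does better than x̂ in this example.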

Fig. from Strang (1993)