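Here is an interesting linear algebra fact: let $A$ be an $n \times n$ matrix and $u$ be a vector such that $u^\top A = u^\top$. Then for any matrix $B$ with $I - B$ invertible, $u^\top (A - B)(I - B)^{-1} = u^\top$.
The proof is just basic algebra: $u^\top (A - B)(I - B)^{-1} = (u^\top - u^\top B)(I - B)^{-1} = u^\top (I - B)(I - B)^{-1} = u^\top$.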
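As a quick sanity check, here is a minimal numerical sketch, assuming the fact reads $u^\top (A - B)(I - B)^{-1} = u^\top$ whenever $u^\top A = u^\top$ and $I - B$ is invertible. The way $A$ is built from a random matrix `M`, the scale of `B`, and all variable names are illustrative choices, not part of the original argument.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Random vector u and a random matrix M; neither is special.
u = rng.standard_normal(n)
M = rng.standard_normal((n, n))

# Rank-one fix-up so that u^T A = u^T holds exactly (an illustrative construction):
# u^T A = u^T M + (u^T u)(u^T - u^T M) / (u^T u) = u^T.
A = M + np.outer(u, u - u @ M) / (u @ u)

# An arbitrary B, scaled down so that I - B is comfortably invertible.
B = 0.1 * rng.standard_normal((n, n))

corrected = (A - B) @ np.linalg.inv(np.eye(n) - B)

eigvec_of_A = np.allclose(u @ A, u)
identity_holds = np.allclose(u @ corrected, u)
print(eigvec_of_A, identity_holds)
```

Nothing about `B` is used beyond invertibility of `I - B`, which matches the "for any matrix" flavour of the statement.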
Why care about this? Let’s imagine that $A$ is a (not necessarily symmetric) stochastic matrix, so $\mathbf{1}^\top A = \mathbf{1}^\top$. Let $A - B$ be a low-rank approximation to $A$ (so $A - B$ consists of all the large singular values, and $B$ consists of all the small singular values). Unfortunately, since $A$ is not symmetric, this low-rank approximation doesn’t preserve the eigenvalues of $A$, and so we need not have $\mathbf{1}^\top (A - B) = \mathbf{1}^\top$. The $(I - B)^{-1}$ can be thought of as a “correction” term such that the resulting matrix is still low-rank, but we’ve preserved one of the eigenvectors of $A$.
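The stochastic-matrix case can be sketched numerically as well. This is a rough illustration under a few assumptions: the matrix is column-stochastic (columns sum to one, so the all-ones vector is a left eigenvector), the low-rank approximation is a truncated SVD keeping the top `k` singular values, and the correction is right-multiplication by $(I - B)^{-1}$, where $B$ is the discarded small-singular-value part. The sizes `n`, `k` and the rank tolerance are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 2

# Random column-stochastic matrix: nonnegative entries, each column sums to 1.
A = rng.random((n, n))
A /= A.sum(axis=0, keepdims=True)
ones = np.ones(n)

# Truncated SVD: keep the k largest singular values (this plays the role of A - B).
U, s, Vt = np.linalg.svd(A)
low_rank = (U[:, :k] * s[:k]) @ Vt[:k]
B = A - low_rank  # the part carried by the small singular values

# Column sums of the plain truncation generally drift away from 1.
plain_sums = ones @ low_rank

# Apply the correction term (I - B)^{-1} on the right.
corrected = low_rank @ np.linalg.inv(np.eye(n) - B)
corrected_sums = ones @ corrected
rank_corrected = np.linalg.matrix_rank(corrected, tol=1e-10)

print("max deviation before correction:", np.abs(plain_sums - 1).max())
print("max deviation after correction: ", np.abs(corrected_sums - 1).max())
print("rank after correction:", rank_corrected)
```

Multiplying by an invertible matrix cannot raise the rank, which is why the corrected matrix stays at rank at most `k`. Note that the correction restores the column sums but does not by itself guarantee nonnegative entries.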