Understand various matrix operations, matrix decompositions, factorizations, and related techniques
Singular Value Decomposition (SVD)
Singular Value Decomposition (SVD) is a powerful matrix factorization that generalizes the concept of eigendecomposition to any \( m \times n \) matrix. Geometrically, SVD decomposes a linear transformation into three distinct steps: a rotation in the input space, a rescaling along the principal axes, and a second rotation in the output space.
Definition
For any matrix \( A \in \mathbb{R}^{m \times n} \), the SVD is defined as:
\[ A = U \Sigma V^T \]
where:
- \( U \): An \( m \times m \) orthogonal matrix. Its columns form an orthonormal basis of the output space.
- \( \Sigma \): An \( m \times n \) rectangular diagonal matrix. The diagonal entries \( \sigma_1 \geq \sigma_2 \geq \dots \geq 0 \) are the singular values, representing the magnitude of scaling along each axis.
- \( V \): An \( n \times n \) orthogonal matrix. Its columns form an orthonormal basis of the input space.
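These shapes can be checked numerically. The sketch below (using NumPy, with illustrative values) computes a full SVD of a rectangular matrix and verifies the reconstruction:

```python
import numpy as np

# A rectangular matrix: m = 2 rows, n = 3 columns
A = np.array([[1.0, 2.0, 0.0],
              [3.0, 4.0, 1.0]])

# full_matrices=True returns U as m x m and V^T as n x n
U, s, Vt = np.linalg.svd(A, full_matrices=True)
print(U.shape, s.shape, Vt.shape)      # (2, 2) (2,) (3, 3)

# Sigma is the m x n rectangular diagonal matrix of singular values
Sigma = np.zeros(A.shape)
np.fill_diagonal(Sigma, s)
print(np.allclose(A, U @ Sigma @ Vt))  # True
```

Note that NumPy returns the singular values already sorted in descending order, matching the convention \( \sigma_1 \geq \sigma_2 \geq \dots \geq 0 \).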
The Role of \( A^T A \) and \( A A^T \)
To compute the SVD, we analyze the symmetric matrices \( A^T A \) and \( A A^T \). These matrices are positive semi-definite and share the same non-zero eigenvalues, which are the squares of the singular values (\( \sigma_i^2 \)):
- \( A^T A \) (size \( n \times n \)) has eigenvectors that form the columns of \( V \).
- \( A A^T \) (size \( m \times m \)) has eigenvectors that form the columns of \( U \).
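This shared-eigenvalue relationship can be verified directly. A sketch assuming NumPy, where `eigvalsh` returns the eigenvalues of a symmetric matrix in ascending order:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

AtA = A.T @ A   # n x n; its eigenvectors give the columns of V
AAt = A @ A.T   # m x m; its eigenvectors give the columns of U

# eigvalsh is the routine for symmetric matrices; eigenvalues come back ascending
ev_AtA = np.linalg.eigvalsh(AtA)
ev_AAt = np.linalg.eigvalsh(AAt)
print(np.allclose(ev_AtA, ev_AAt))            # True: shared eigenvalues

# Their square roots reproduce the singular values of A
s = np.linalg.svd(A, compute_uv=False)
print(np.allclose(np.sqrt(ev_AtA[::-1]), s))  # True
```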
Example
Consider the matrix:
\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \]
Step 1: Compute \( A^T A \) and \( A A^T \)
First, compute the transpose of \( A \):
\[ A^T = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} \]
Now compute:
\[ A^T A = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 10 & 14 \\ 14 & 20 \end{bmatrix}, \quad A A^T = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} = \begin{bmatrix} 5 & 11 \\ 11 & 25 \end{bmatrix} \]
These symmetric matrices are used to compute the singular values and the columns of \( U \) and \( V \).
Step 2: Compute Singular Values
Singular values are the positive square roots of the eigenvalues of \( A^T A \). Solve:
\[ \det(A^T A - \lambda I) = 0 \] \[ \begin{vmatrix} 10 - \lambda & 14 \\ 14 & 20 - \lambda \end{vmatrix} = (10-\lambda)(20-\lambda)-196 = \lambda^2 - 30\lambda + 4 \]
Solving this quadratic equation gives:
\[ \lambda_1 \approx 29.866, \quad \lambda_2 \approx 0.134 \]
Hence, the singular values are:
\[ \sigma_1 = \sqrt{29.866} \approx 5.465, \quad \sigma_2 = \sqrt{0.134} \approx 0.366 \]
Step 3: Compute Matrix \( V \)
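Steps 1 and 2 can be reproduced in a few lines of NumPy (a sketch for checking the arithmetic, not the recommended way to compute an SVD in practice):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Eigenvalues of A^T A come back ascending; reverse to descending order
lam = np.linalg.eigvalsh(A.T @ A)[::-1]

# Singular values are the square roots of those eigenvalues
sigma = np.sqrt(lam)
print(np.round(sigma, 3))   # [5.465 0.366]
```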
The columns of \( V \) are the normalized eigenvectors of \( A^T A \).
\[ (A^T A - \lambda_1 I)\vec{v}_1 = 0 \] \[ \begin{bmatrix} -19.866 & 14 \\ 14 & -9.866 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 0 \quad \Rightarrow \quad y = 1.419 x \]
Normalize:
\[ \vec{v}_1 \approx \begin{bmatrix} 0.576 \\ 0.817 \end{bmatrix}, \quad \vec{v}_2 \approx \begin{bmatrix} 0.817 \\ -0.576 \end{bmatrix} \] \[ V = \begin{bmatrix} 0.576 & 0.817 \\ 0.817 & -0.576 \end{bmatrix} \]
Step 4: Compute Matrix \( U \)
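The same eigenvector computation can be sketched in NumPy. Since eigenvectors are determined only up to sign, the snippet below fixes the signs so the first entry of each column is positive, matching the values above:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# eigh returns eigenvectors with ascending eigenvalues,
# so reverse the columns to put the dominant eigenvector first
lam, Q = np.linalg.eigh(A.T @ A)
V = Q[:, ::-1]

# Eigenvectors are unique only up to sign: make first entries positive
V = V * np.sign(V[0, :])
print(np.round(V, 3))   # approx [[0.576  0.817] [0.817 -0.576]]
```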
The columns of \( U \) could be found as eigenvectors of \( A A^T \), but it is simpler to use the defining relation \( A \vec{v}_i = \sigma_i \vec{u}_i \). Multiplying both sides by \( A^T \) gives \( A^T A \vec{v}_i = \sigma_i^2 \vec{v}_i \), confirming that \( \vec{v}_i \) is an eigenvector of \( A^T A \) with eigenvalue \( \lambda_i = \sigma_i^2 \); multiplying instead by \( A \) gives \( A A^T \vec{u}_i = \sigma_i^2 \vec{u}_i \), so \( \vec{u}_i \) is an eigenvector of \( A A^T \) with the same eigenvalue. The relation also lets us compute each \( \vec{u}_i \) directly from \( \vec{v}_i \).
The columns of \( U \) are obtained via:
\[ \vec{u}_i = \frac{1}{\sigma_i} A \vec{v}_i \] \[ A \vec{v}_1 = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 0.576 \\ 0.817 \end{bmatrix} \approx \begin{bmatrix} 2.210 \\ 4.996 \end{bmatrix}, \quad \vec{u}_1 = \frac{1}{5.465} \begin{bmatrix} 2.210 \\ 4.996 \end{bmatrix} \approx \begin{bmatrix} 0.404 \\ 0.915 \end{bmatrix} \] \[ \vec{u}_2 \approx \begin{bmatrix} -0.915 \\ 0.404 \end{bmatrix}, \quad U = \begin{bmatrix} 0.404 & -0.915 \\ 0.915 & 0.404 \end{bmatrix} \]
Step 5: Construct the SVD
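The formula \( \vec{u}_i = \frac{1}{\sigma_i} A \vec{v}_i \) maps directly onto a column-wise computation. A sketch assuming NumPy, rebuilding \( V \) and the singular values as in the previous steps:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# V (sign-fixed eigenvectors of A^T A) and the singular values
lam, Q = np.linalg.eigh(A.T @ A)
V = Q[:, ::-1] * np.sign(Q[0, ::-1])
sigma = np.sqrt(lam[::-1])

# u_i = (1 / sigma_i) * A v_i; broadcasting divides column i by sigma[i]
U = (A @ V) / sigma
print(np.round(U, 3))
print(np.allclose(U.T @ U, np.eye(2)))   # True: columns are orthonormal
```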
\[ \Sigma = \begin{bmatrix} 5.465 & 0 \\ 0 & 0.366 \end{bmatrix}, \quad A = U \Sigma V^T \]
This decomposition can be verified by multiplying \( U \Sigma V^T \), which reconstructs \( A \).
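The whole worked example can be cross-checked against NumPy's built-in routine:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

U, s, Vt = np.linalg.svd(A)
Sigma = np.diag(s)   # square here, since A is 2 x 2

print(np.round(s, 3))                  # [5.465 0.366]
print(np.allclose(A, U @ Sigma @ Vt))  # True: the product reconstructs A
```

The signs of individual columns may differ from the hand computation above; as long as the signs of \( \vec{u}_i \) and \( \vec{v}_i \) flip together, the product \( U \Sigma V^T \) is unchanged.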
Summary
- Existence: SVD exists for every matrix, unlike eigendecomposition.
- Uniqueness: Singular values are unique. When they are distinct, the singular vectors are unique up to a simultaneous sign flip of \( \vec{u}_i \) and \( \vec{v}_i \); repeated singular values determine only the corresponding subspace.
- Applications: SVD is used in PCA, pseudoinverse computation, and low-rank matrix approximation.
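As an illustration of the last point, truncating the SVD gives the best low-rank approximation in the spectral and Frobenius norms (the Eckart-Young theorem). The sketch below keeps only the largest singular value of the example matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
U, s, Vt = np.linalg.svd(A)

# Rank-1 approximation: keep only the dominant singular triplet
A1 = s[0] * np.outer(U[:, 0], Vt[0, :])

# Eckart-Young: the spectral-norm error equals the first discarded singular value
err = np.linalg.norm(A - A1, ord=2)
print(np.isclose(err, s[1]))   # True
```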