Eigenvalues and Eigenvectors
The concepts of Eigenvalues and Eigenvectors are central to many advanced ML techniques, most notably Principal Component Analysis (PCA). They help us understand the fundamental directions of variance and transformation within a dataset.
1. The Core Idea: What Do They Represent?
Imagine a matrix as a machine that transforms or moves vectors in space (a linear transformation). When you apply this transformation to a random vector, the vector usually changes both its magnitude (length) and its direction.
An Eigenvector is a special vector that, when multiplied by the matrix A, only changes its magnitude (it gets scaled or stretched/compressed), but its direction remains the same.
The Eigenvalue is the scalar value that represents this scaling factor.
The Eigen-Equation
This relationship is formalized by the core Eigen-equation:

Av = λv

- A: The square matrix (the transformation).
- v: The Eigenvector (the special vector that doesn't change direction).
- λ: The Eigenvalue (the scalar factor by which v is stretched or compressed).
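To make the equation concrete, here is a minimal NumPy sketch (the 2×2 matrix is just an illustrative choice) that computes the eigenpairs of a matrix and checks that Av = λv holds for each pair:

```python
import numpy as np

# Illustrative symmetric matrix; any square matrix works with np.linalg.eig.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)  # columns of `eigenvectors` are the v's

for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    lam = eigenvalues[i]
    # A @ v should equal lam * v: same direction, scaled by the eigenvalue
    print(np.allclose(A @ v, lam * v))  # True
```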
2. Geometric Analogy: The Spinning Planet
Imagine a spinning planet (A is the rotation matrix). If you stand on any random spot (a vector v), you will move to a new spot (the direction changes).
However, there are two special spots on the planet: the North Pole and the South Pole, which lie on the spin axis (the Eigenvector). If you stand on one of those poles, the planet spins beneath you, but you remain at the same point in space. Your direction relative to the center of the planet doesn't change, and your magnitude (your distance along the spin axis) is scaled by exactly 1: the eigenvalue of a pure rotation along its axis is 1.
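To see the analogy numerically, here is a small sketch (assuming, purely for illustration, a rotation about the z-axis, so the z-axis plays the role of the pole) showing that a point on the axis stays exactly where it was, while an off-axis point is moved:

```python
import numpy as np

# Rotation by 45 degrees about the z-axis (the "spin axis" of the planet).
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

pole = np.array([0.0, 0.0, 1.0])          # a point on the spin axis
print(np.allclose(R @ pole, 1.0 * pole))  # True: direction unchanged, scaled by 1

random_spot = np.array([1.0, 0.0, 0.0])   # any off-axis spot gets rotated away
print(np.allclose(R @ random_spot, random_spot))  # False
```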
3. Calculation (Conceptual Overview)
To find the eigenvalues (λ) and eigenvectors (v), the core equation is rearranged:

Av − λv = 0

We can factor out v by rewriting λv as λIv, where I is the Identity Matrix:

(A − λI)v = 0

For a non-zero vector v to satisfy this equation, the matrix (A − λI) must be singular, meaning its determinant must be zero:

det(A − λI) = 0

- Find λ: Solve this equation (the characteristic equation) to find the scalar eigenvalues λ.
- Find v: Substitute each found λ back into the original equation and solve for the corresponding eigenvector v.
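As a worked sketch of these two steps, take the illustrative matrix A = [[2, 1], [1, 2]]: its characteristic equation (2 − λ)² − 1 = 0 gives λ = 1 and λ = 3, and NumPy confirms both the eigenvalues and that each eigenvector satisfies (A − λI)v = 0:

```python
import numpy as np

# Illustrative 2x2 matrix whose characteristic equation is easy to expand by hand:
# det(A - lambda*I) = (2 - lambda)**2 - 1 = 0  ->  lambda = 1 or lambda = 3.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(np.sort(eigenvalues))  # [1. 3.]

# Step 2: each eigenvector solves (A - lambda*I) v = 0.
for lam, v in zip(eigenvalues, eigenvectors.T):
    print(np.allclose((A - lam * np.eye(2)) @ v, np.zeros(2)))  # True
```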
4. Eigenvalues and Eigenvectors in ML: PCA
The primary application of this concept in Machine Learning is Principal Component Analysis (PCA), a technique used for dimensionality reduction.
A. The Covariance Matrix
In PCA, we first calculate the Covariance Matrix of the dataset, Σ. This matrix describes how all the features in the dataset vary with respect to each other.
B. Eigen-Decomposition
We then perform an eigen-decomposition on the covariance matrix Σ:
- The Eigenvectors (v): These special vectors are the Principal Components (PCs). They represent the new axes or directions in the data space that capture the maximum variance.
- The Eigenvalues (λ): The magnitude of each eigenvalue tells us how much variance is captured along its corresponding eigenvector (Principal Component).
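Here is a minimal sketch of steps A and B, using a small synthetic dataset of two correlated features (the data and seed are purely illustrative):

```python
import numpy as np

# Two correlated features: feature 2 is roughly 2 * feature 1 plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
X = np.column_stack([x, 2.0 * x + rng.normal(scale=0.5, size=1000)])

X_centered = X - X.mean(axis=0)           # PCA assumes mean-centered data
cov = np.cov(X_centered, rowvar=False)    # the 2x2 covariance matrix (Sigma)

# eigh is used because a covariance matrix is symmetric; eigenvalues come out ascending
eigenvalues, eigenvectors = np.linalg.eigh(cov)
print(eigenvalues)          # one large value (the shared trend), one small one (the noise)
print(eigenvectors[:, -1])  # first principal component: roughly [1, 2] / sqrt(5), up to sign
```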
C. Dimensionality Reduction
To reduce the dimensionality (e.g., from 100 features to 10), we simply:
- Calculate all Eigenvalues (and their Eigenvectors) of the covariance matrix.
- Select the eigenvectors (Principal Components) that correspond to the largest eigenvalues.
- These eigenvectors capture most of the meaningful information (variance) in the dataset, effectively compressing the data while losing minimal information, as sketched in the example below.
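Below is a self-contained sketch of the full reduction, shrinking 100 illustrative features down to 10; the data here is random, so the exact amount of variance retained is arbitrary and the point is the mechanics:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 100))                   # 500 samples, 100 features
X_centered = X - X.mean(axis=0)

cov = np.cov(X_centered, rowvar=False)            # 100x100 covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov)   # step 1: all eigenpairs
order = np.argsort(eigenvalues)[::-1]             # sort by descending eigenvalue
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

k = 10
top_components = eigenvectors[:, :k]              # step 2: top-k principal components
X_reduced = X_centered @ top_components           # step 3: project; shape (500, 10)

explained = eigenvalues[:k].sum() / eigenvalues.sum()
print(X_reduced.shape, f"variance retained: {explained:.1%}")
```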
Eigenvalues and Eigenvectors help us find the underlying structure of variance in our data. The final topic in Linear Algebra provides another powerful way to decompose and simplify complex matrices.