Normalization Techniques
In machine learning, normalization is the process of rescaling numeric variables to a strictly defined range, most commonly [0, 1] or [-1, 1]. Unlike standardization, which is about centering the distribution, normalization is about enforcing boundaries.
1. When is Normalization Essential?
Normalization is preferred over standardization in specific scenarios:
- Image Processing: Pixel intensities in 8-bit images are naturally bounded between 0 and 255. Normalizing them to [0, 1] is standard practice for Convolutional Neural Networks (CNNs); a short sketch follows this list.
- Neural Networks: Activation functions like Sigmoid or Tanh are most sensitive in a small range around zero and saturate outside it.
- Algorithms with No Distribution Assumption: When you don't know if your data is Gaussian (Normal), normalization is a safer, non-parametric starting point.
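To make the image case concrete, here is a minimal sketch (assuming 8-bit pixel data and NumPy; the array values are made up for illustration) of rescaling raw intensities to [0, 1]:

```python
import numpy as np

# Hypothetical 8-bit grayscale patch: the format bounds values to [0, 255]
image = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Because the bounds are known in advance, normalization is a single division
normalized = image.astype(np.float32) / 255.0

print(normalized)  # all values now lie in [0, 1]; 255 maps to exactly 1.0
```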
2. Min-Max Scaling
This is the most common form of normalization. It shifts and rescales the data so that the minimum value becomes 0 and the maximum value becomes 1.
The Formula: $x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$
- Pros: Preserves the relative distances between values.
- Cons: Extremely sensitive to outliers. If one value sits at 10,000 and the rest sit near 10, the "normal" data will be squashed into a tiny sliver of the range (e.g., [0, 0.001]), as the sketch below demonstrates.
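A minimal NumPy sketch (with made-up numbers mirroring the example above) makes the squashing visible:

```python
import numpy as np

# Nine ordinary values near 10, plus one extreme outlier
x = np.array([10, 11, 12, 13, 14, 15, 16, 17, 18, 10_000], dtype=float)

# Min-max scaling: x' = (x - x_min) / (x_max - x_min)
scaled = (x - x.min()) / (x.max() - x.min())

print(scaled.round(4))
# The ordinary values land between 0.0 and 0.0008;
# only the outlier reaches 1.0, consuming nearly the whole range.
```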
3. MaxAbs Scaling
MaxAbs scaling divides each value by the maximum absolute value in the feature, which scales the data to the range [-1, 1].
The Formula: $x' = \frac{x}{\max(|x|)}$
- Best Use Case: Sparse data (data with many zeros). It does not "shift" the data (it doesn't subtract the mean or min), so it preserves sparsity; see the sketch after this list.
- Common in: Text analytics and TF-IDF vectors.
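A minimal sketch (assuming scikit-learn and SciPy are installed; the matrix is a made-up stand-in for term counts) showing that `MaxAbsScaler` accepts sparse input directly and leaves its zeros untouched:

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.preprocessing import MaxAbsScaler

# Sparse feature matrix, e.g. term counts: most entries are zero
X = csr_matrix(np.array([[0.0, 4.0], [2.0, 0.0], [0.0, -8.0]]))

# Each column is divided by its max absolute value (2 and 8 here),
# so zeros stay zero and the sparsity pattern is preserved
scaled = MaxAbsScaler().fit_transform(X)

print(scaled.toarray())
# [[ 0.   0.5]
#  [ 1.   0. ]
#  [ 0.  -1. ]]
```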
4. Robust Normalization (Quantile Scaling)
If your data has significant outliers, Min-Max scaling will fail. A "Robust" approach uses the Interquartile Range (IQR).
The Formula: $x' = \frac{x - \text{median}(x)}{Q_3 - Q_1}$, where $Q_1$ and $Q_3$ are the 25th and 75th percentiles (so the denominator is the IQR).
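scikit-learn implements this as `RobustScaler`, which by default centers on the median and scales by the IQR. A minimal sketch, reusing the outlier-heavy setup from above:

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# A single feature: small values plus one extreme outlier
X = np.array([[10.0], [11.0], [12.0], [13.0], [10_000.0]])

# Subtracts the per-feature median (12) and divides by the IQR (13 - 11 = 2);
# the outlier inflates neither statistic, so the inliers keep a sensible scale
scaled = RobustScaler().fit_transform(X)

print(scaled.ravel())
# The inliers map to small values around zero (-1 to 0.5);
# the outlier is still visible (4994) but no longer distorts the rest.
```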
5. Comparison: Normalization vs. Standardization
| Feature | Normalization (Min-Max) | Standardization (Z-Score) |
|---|---|---|
| Range | Fixed: [0, 1] or [-1, 1] | Not bounded (typically within about [-3, 3]) |
| Mean / Std Dev | Varies | Mean = 0, Std Dev = 1 |
| Outliers | Highly Affected | Less Affected |
| Best For | Neural Networks, Images | Linear Regression, SVM, PCA |
6. Practical Implementation
Using scikit-learn, we can apply these transformations efficiently.
```python
from sklearn.preprocessing import MinMaxScaler, MaxAbsScaler

# Sample data: age and salary
data = [[25, 50000], [30, 80000], [45, 120000]]

# Min-Max scaling to [0, 1]: each column's min maps to 0, its max to 1
min_max = MinMaxScaler()
normalized_data = min_max.fit_transform(data)

# MaxAbs scaling: divides each column by its max absolute value,
# which never shifts the data and so preserves zeros
max_abs = MaxAbsScaler()
sparse_friendly_data = max_abs.fit_transform(data)

print(normalized_data)
print(sparse_friendly_data)
```
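One practical caveat worth keeping in mind: in a real pipeline, fit the scaler on the training split only and apply the same fitted scaler (via transform) to validation and test data, so that information from unseen data does not leak into the scaling parameters.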
References for More Details
- Scikit-Learn Normalization Guide: Understanding `Normalizer` vs. `MinMaxScaler`.
- Google Machine Learning Crash Course: Visualizing how normalization helps loss functions converge.
Normalization handles the scale of your numbers, but what if you have too many features? Excess features can confuse a model and lead to the "Curse of Dimensionality."