Feature Scaling Comparator
Overview
The Feature Scaling Comparator lets you visualize how different scaling methods transform your data. Compare StandardScaler (z-score normalization), MinMaxScaler (scaling to the [0, 1] range), and RobustScaler (median- and IQR-based) side by side. The tool is particularly useful for understanding how each scaler handles outliers and which one is appropriate for your machine learning pipeline.
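To make the comparison concrete, here is a minimal sketch of the three transformations in plain Python (the function names and the sample data are illustrative, not part of the tool). StandardScaler computes (x − mean) / std, MinMaxScaler computes (x − min) / (max − min), and RobustScaler computes (x − median) / IQR:

```python
import statistics

def standard_scale(xs):
    # z-score: (x - mean) / population std dev
    mean = statistics.mean(xs)
    sd = statistics.pstdev(xs)
    return [(x - mean) / sd for x in xs]

def minmax_scale(xs):
    # rescale to the [0, 1] range
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def robust_scale(xs):
    # center on the median, scale by the interquartile range
    med = statistics.median(xs)
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return [(x - med) / (q3 - q1) for x in xs]

data = [1.0, 2.0, 3.0, 4.0, 100.0]  # four inliers plus one large outlier
print(standard_scale(data))  # inliers squashed into a narrow band by the outlier
print(minmax_scale(data))    # inliers pushed toward 0, outlier pinned at 1
print(robust_scale(data))    # inliers keep their spread; outlier sits far out
```

Running this on the outlier-laden sample shows the core trade-off: the outlier inflates the mean/std and the min/max, compressing the inliers under the first two scalers, while the median and IQR ignore it.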
Tips
- Add outliers to your dataset to see how RobustScaler preserves the spread of the inlier values, while StandardScaler and MinMaxScaler compress them into a narrow band
- Use StandardScaler for algorithms that benefit from zero-mean, unit-variance features (linear regression, logistic regression, neural networks)
- Choose MinMaxScaler when you need features in a specific range, especially for neural networks with bounded activation functions
- Apply RobustScaler when your data contains outliers that you want to preserve but not let dominate the scaling
- Remember to fit scalers on training data only, then transform both training and test sets using those fitted parameters
- Compare the visual distributions to understand why tree-based models (random forests, XGBoost) don't require scaling: their splits depend only on the ordering of feature values, which any monotonic scaling preserves
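The fit-on-training-data-only tip above can be sketched with a minimal z-score scaler (the class name and data are hypothetical; scikit-learn's StandardScaler follows the same fit/transform pattern):

```python
import statistics

class SimpleStandardScaler:
    """Minimal z-score scaler: fit() learns mean/std, transform() reuses them."""

    def fit(self, xs):
        # Parameters are estimated from the data passed here and nowhere else.
        self.mean_ = statistics.mean(xs)
        self.scale_ = statistics.pstdev(xs)
        return self

    def transform(self, xs):
        # Reuse the fitted parameters -- never refit on the test set.
        return [(x - self.mean_) / self.scale_ for x in xs]

train = [10.0, 20.0, 30.0, 40.0]
test = [25.0, 50.0]

scaler = SimpleStandardScaler().fit(train)   # fit on training data only
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)         # transformed with training stats
```

Fitting on the test set (or the combined data) would leak information about the test distribution into the pipeline, which is why the same fitted parameters are applied to both splits.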