Learning Rate Visualizer

Visualize how learning rate affects gradient descent convergence

Overview

The Learning Rate Visualizer shows how the learning rate hyperparameter controls gradient descent optimization. Watch how different learning rates lead to fast convergence, slow progress, oscillation, or complete divergence on various loss landscapes.
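
Under the hood this is ordinary gradient descent: each step updates the parameters as w ← w − lr · ∇L(w), where lr is the learning rate. Below is a minimal sketch of the four behaviors on a one-dimensional quadratic loss L(w) = w², which is an assumed stand-in for the visualizer's landscapes, not its actual implementation:

```python
def gradient_descent(lr, w0=2.0, steps=20):
    """Run gradient descent on L(w) = w^2, returning the loss after each step."""
    w = w0
    losses = []
    for _ in range(steps):
        grad = 2.0 * w     # dL/dw for L(w) = w^2
        w -= lr * grad     # the gradient descent update
        losses.append(w * w)
    return losses

# Each update multiplies w by (1 - 2 * lr), which predicts the four regimes:
for lr in (0.01, 0.1, 0.9, 1.1):
    print(f"lr={lr:<4} final loss={gradient_descent(lr)[-1]:.3e}")
# lr=0.01 -> loss decreases very slowly; lr=0.1 -> smooth, fast convergence;
# lr=0.9  -> overshoots and oscillates across the minimum; lr=1.1 -> diverges.
```

For this loss, any lr below 1 is stable (|1 − 2·lr| < 1); the same reasoning, with the local curvature in place of the constant 2, is why the ideal rate depends on the landscape.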

Tips

  1. Start with a moderate learning rate (around 0.1) to see fast, stable convergence, then compare it with very small and very large values
  2. Watch the 3D path to see how small learning rates take tiny steps while large ones overshoot the minimum
  3. Monitor the loss curve: it should decrease smoothly for a well-chosen learning rate, oscillate when the rate is too large, and decrease very slowly when it is too small
  4. Try different optimization landscapes to see how the ideal learning rate depends on the problem’s geometry
  5. Use step-by-step mode to pause and examine exactly how each gradient descent step moves toward the minimum (a sketch of this follows the list)
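
For tip 5, here is a hypothetical sketch of how a step-by-step mode can be driven: a generator performs one gradient descent update per next() call on an assumed elongated bowl L(x, y) = x² + 10y², so each step's point, gradient, and loss can be inspected before continuing. The landscape and the function name step_by_step are illustrative assumptions, not the visualizer's actual API.

```python
def step_by_step(lr=0.04, x=2.0, y=1.0):
    """Yield (point, gradient, loss) after each gradient descent step on
    L(x, y) = x^2 + 10*y^2 (a hypothetical elongated-bowl landscape)."""
    while True:
        gx, gy = 2.0 * x, 20.0 * y        # partial derivatives of L
        x, y = x - lr * gx, y - lr * gy   # one gradient descent update
        yield (x, y), (gx, gy), x * x + 10.0 * y * y

stepper = step_by_step(lr=0.04)
for i in range(5):
    (x, y), grad, loss = next(stepper)
    print(f"step {i + 1}: point=({x:+.3f}, {y:+.3f})  loss={loss:.4f}")
```

The elongated bowl also illustrates tip 4: the steep y-direction caps the stable learning rate (here lr must stay below 0.1), while at that rate the shallow x-direction makes only slow progress.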