Learning Rate Visualizer

Visualize how learning rate affects gradient descent convergence

Overview

The Learning Rate Visualizer shows how the learning rate hyperparameter controls gradient descent optimization. Watch how different learning rates lead to fast convergence, slow progress, oscillation, or complete divergence on various loss landscapes.
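
Under the hood this is ordinary gradient descent: each step updates the parameters as w ← w − lr · ∇L(w), where lr is the learning rate. Below is a minimal sketch of the four behaviors on a one-dimensional quadratic loss L(w) = w², which is an assumed stand-in for the visualizer's landscapes, not its actual implementation:

```python
def gradient_descent(lr, w0=2.0, steps=20):
    """Run gradient descent on L(w) = w^2, returning the loss after each step."""
    w = w0
    losses = []
    for _ in range(steps):
        grad = 2.0 * w     # dL/dw for L(w) = w^2
        w -= lr * grad     # the gradient descent update
        losses.append(w * w)
    return losses

# Each update multiplies w by (1 - 2 * lr), which predicts the four regimes:
for lr in (0.01, 0.1, 0.9, 1.1):
    print(f"lr={lr:<4} final loss={gradient_descent(lr)[-1]:.3e}")
# lr=0.01 -> loss decreases very slowly; lr=0.1 -> smooth, fast convergence;
# lr=0.9  -> overshoots and oscillates across the minimum; lr=1.1 -> diverges.
```

For this loss, any lr below 1 is stable (|1 − 2·lr| < 1); the same reasoning, with the local curvature in place of the constant 2, is why the ideal rate depends on the landscape.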

Tips

  1. Start with a moderate learning rate (around 0.1) to see fast, stable convergence, then compare it with very small and very large values
  2. Watch the 3D path to see how small learning rates take tiny steps while large ones overshoot the minimum
  3. Monitor the loss curve: it should decrease smoothly for a well-chosen learning rate, oscillate when the rate is too large, and decrease very slowly when it is too small
  4. Try different optimization landscapes to see how the ideal learning rate depends on the problem’s geometry
  5. Use step-by-step mode to pause and examine exactly how each gradient descent step moves toward the minimum (a sketch of this follows the list)
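
For tip 5, here is a hypothetical sketch of how a step-by-step mode can be driven: a generator performs one gradient descent update per next() call on an assumed elongated bowl L(x, y) = x² + 10y², so each step's point, gradient, and loss can be inspected before continuing. The landscape and the function name step_by_step are illustrative assumptions, not the visualizer's actual API.

```python
def step_by_step(lr=0.04, x=2.0, y=1.0):
    """Yield (point, gradient, loss) after each gradient descent step on
    L(x, y) = x^2 + 10*y^2 (a hypothetical elongated-bowl landscape)."""
    while True:
        gx, gy = 2.0 * x, 20.0 * y        # partial derivatives of L
        x, y = x - lr * gx, y - lr * gy   # one gradient descent update
        yield (x, y), (gx, gy), x * x + 10.0 * y * y

stepper = step_by_step(lr=0.04)
for i in range(5):
    (x, y), grad, loss = next(stepper)
    print(f"step {i + 1}: point=({x:+.3f}, {y:+.3f})  loss={loss:.4f}")
```

The elongated bowl also illustrates tip 4: the steep y-direction caps the stable learning rate (here lr must stay below 0.1), while at that rate the shallow x-direction makes only slow progress.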