Activation Function Comparator

Compare sigmoid, ReLU, tanh, and other activation functions side by side

Select Functions to Compare

Sigmoid: σ(x) = 1/(1+e^-x)
Tanh: tanh(x)
ReLU: max(0, x)
Leaky ReLU: max(0.01x, x)
ELU: x if x > 0, else α(e^x - 1)
Swish: x·σ(x)
GELU: x·Φ(x), where Φ is the standard normal CDF
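For reference, here is a minimal NumPy sketch of these formulas (the 0.01 slope for Leaky ReLU matches the formula above; α = 1.0 for ELU and the exact Gaussian CDF for GELU are common defaults assumed here):

import numpy as np
from scipy.special import erf  # vectorized error function, used for the Gaussian CDF in GELU

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.01):
    return np.maximum(slope * x, x)

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def swish(x):
    return x * sigmoid(x)

def gelu(x):
    # Phi(x): CDF of the standard normal distribution
    return x * 0.5 * (1.0 + erf(x / np.sqrt(2.0)))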

Visualization Controls

Views: Activation Functions, Derivatives, Function Properties

About Activation Functions

Activation functions introduce non-linearity into neural networks, enabling them to learn complex patterns. Without activation functions, even deep networks would be equivalent to simple linear models.
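One way to see the "equivalent to a linear model" point: stacking two linear layers with no activation in between collapses to a single linear layer with merged weights. The sketch below checks this numerically; the layer shapes and random values are arbitrary and chosen only for illustration.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))                      # batch of 4 inputs, 3 features
W1, b1 = rng.normal(size=(3, 5)), rng.normal(size=5)
W2, b2 = rng.normal(size=(5, 2)), rng.normal(size=2)

# Two linear layers with no activation in between...
deep = (x @ W1 + b1) @ W2 + b2

# ...equal one linear layer with merged weights: x (W1 W2) + (b1 W2 + b2)
shallow = x @ (W1 @ W2) + (b1 @ W2 + b2)
print(np.allclose(deep, shallow))                # True

# Inserting a non-linearity (e.g. ReLU) breaks the equivalence.
nonlinear = np.maximum(0.0, x @ W1 + b1) @ W2 + b2
print(np.allclose(nonlinear, shallow))           # False in general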

Sigmoid & Tanh: Classic functions with bounded outputs; both suffer from vanishing gradients in deep networks.
ReLU: The most popular choice for hidden layers. Fast and effective, but can suffer from the "dying neuron" problem, where units that only receive negative inputs stop updating.
Leaky ReLU & ELU: Variants that address ReLU's limitation by allowing small negative outputs, which keeps the gradient nonzero (see the sketch after this list).
Swish & GELU: Modern smooth functions that often perform better than ReLU.
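A small numeric illustration of the dying-neuron point (the 0.01 slope and α = 1.0 below are the usual defaults, assumed here): for negative pre-activations, ReLU's derivative is exactly zero, so such a unit receives no gradient and stops learning, while Leaky ReLU and ELU keep the gradient nonzero.

import numpy as np

x = np.array([-5.0, -2.0, -0.5])     # negative pre-activations (the "stuck" region for ReLU)
alpha, slope = 1.0, 0.01

d_relu  = (x > 0).astype(float)                    # 0 everywhere here: the unit is "dead"
d_leaky = np.where(x > 0, 1.0, slope)              # a small constant gradient survives
d_elu   = np.where(x > 0, 1.0, alpha * np.exp(x))  # nonzero, decays only as x -> -inf

print("ReLU'      :", d_relu)      # [0. 0. 0.]
print("Leaky ReLU':", d_leaky)     # [0.01 0.01 0.01]
print("ELU'       :", d_elu)       # approx [0.007 0.135 0.607]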

Tips: Select multiple functions to compare their shapes and derivatives. Notice how ReLU and its variants keep a gradient of 1 for all positive inputs (no vanishing), while the sigmoid and tanh derivatives approach zero for large |x|; the sketch below checks this numerically.
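As a quick numerical check on that last point, this sketch evaluates the standard closed-form derivatives σ'(x) = σ(x)(1 - σ(x)) and tanh'(x) = 1 - tanh²(x) against ReLU's derivative at a few increasingly large inputs:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for x in [1.0, 5.0, 10.0]:
    d_sigmoid = sigmoid(x) * (1.0 - sigmoid(x))   # shrinks toward 0 as |x| grows
    d_tanh    = 1.0 - np.tanh(x) ** 2             # shrinks toward 0 even faster
    d_relu    = 1.0 if x > 0 else 0.0             # stays 1 for all positive x
    print(f"x={x:5.1f}  sigmoid'={d_sigmoid:.1e}  tanh'={d_tanh:.1e}  ReLU'={d_relu}")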