Compare sigmoid, ReLU, tanh, and other activation functions side by side
Activation functions introduce non-linearity into neural networks, enabling them to learn complex patterns. Without them, a stack of layers collapses into a single linear transformation, so even a very deep network could only represent a linear model.
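To make this concrete, here is a minimal NumPy sketch (with arbitrary, made-up weight shapes) showing that two linear layers with no activation between them are equivalent to one linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))   # batch of 4 inputs with 3 features (illustrative shapes)
W1 = rng.normal(size=(3, 5))
W2 = rng.normal(size=(5, 2))

two_layers = (x @ W1) @ W2    # two "layers" with no non-linearity in between
one_layer = x @ (W1 @ W2)     # the same map collapsed into a single weight matrix

print(np.allclose(two_layers, one_layer))  # True: depth adds nothing without activations
```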
Sigmoid & Tanh: Classic functions with bounded outputs (sigmoid maps to (0, 1), tanh to (-1, 1)). Their derivatives shrink toward zero for large |x|, so they suffer from vanishing gradients in deep networks.
ReLU: The most popular choice for hidden layers. Fast and effective, but units can get stuck outputting zero with zero gradient (the "dying ReLU" problem).
Leaky ReLU & ELU: Variants that address the dying-ReLU problem by keeping a non-zero output (and gradient) for negative inputs.
Swish & GELU: Modern functions that are smooth and often perform better than ReLU (see the sketches after this list).
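For reference, here are minimal NumPy sketches of the functions listed above. The Leaky ReLU slope (0.01), ELU alpha (1.0), and the tanh-based GELU approximation are common defaults, not the only possible choices:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))          # bounded to (0, 1)

def tanh(x):
    return np.tanh(x)                          # bounded to (-1, 1)

def relu(x):
    return np.maximum(0.0, x)                  # zero for negative inputs

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)       # small negative slope keeps gradients alive

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))  # smooth negative saturation at -alpha

def swish(x):
    return x * sigmoid(x)                      # also known as SiLU

def gelu(x):
    # Common tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))
```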
Tips: Select multiple functions to compare their shapes and derivatives. Notice how ReLU and its variants keep a gradient of 1 for positive inputs (so gradients do not shrink through active units), while the sigmoid and tanh derivatives approach zero for large |x|.
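As a quick sanity check of that last point, the closed-form derivatives can be evaluated at a moderately large input (x = 5 here is just an illustrative value):

```python
import numpy as np

x = 5.0
sig = 1.0 / (1.0 + np.exp(-x))

print(sig * (1.0 - sig))       # sigmoid'(5) ~ 6.6e-3, shrinking toward zero
print(1.0 - np.tanh(x) ** 2)   # tanh'(5)    ~ 1.8e-4, shrinking toward zero
print(1.0 if x > 0 else 0.0)   # ReLU'(5)    = 1.0, no shrinkage for positive inputs
```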