Configuration
- Dataset: Iris
- Architecture: 4 → 10 → 8 → 3
- Optimizer: Adam
- Learning Rate: 0.01
- Epochs: 50
- Hidden Activation: ReLU
- Output Activation: Softmax
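The configuration above can be sketched as a minimal NumPy parameter setup. The layer sizes come from the stated 4 → 10 → 8 → 3 architecture; the He-style initialization and the random seed are illustrative choices, not part of the original configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes matching the 4 -> 10 -> 8 -> 3 architecture.
sizes = [4, 10, 8, 3]

# He-style initialization suits the ReLU hidden layers (illustrative choice).
weights = [rng.normal(0, np.sqrt(2 / fan_in), size=(fan_in, fan_out))
           for fan_in, fan_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(fan_out) for fan_out in sizes[1:]]

print([w.shape for w in weights])  # [(4, 10), (10, 8), (8, 3)]
```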
Input features are fed into the network.
[−0.90, 1.03, −1.34, −1.32]
Normalized values of original [5.1, 3.5, 1.4, 0.2]
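The normalization step above is a per-feature z-score. A minimal sketch, assuming approximate Iris training-set means and standard deviations (the constants below are rounded assumptions, not the app's exact statistics):

```python
import numpy as np

# Approximate per-feature mean and std of the Iris training data
# (sepal length, sepal width, petal length, petal width) -- assumed values.
mean = np.array([5.84, 3.06, 3.76, 1.20])
std = np.array([0.83, 0.44, 1.77, 0.76])

x_raw = np.array([5.1, 3.5, 1.4, 0.2])
x = (x_raw - mean) / std  # z-score: (value - mean) / std per feature

print(np.round(x, 2))  # close to [-0.90, 1.03, -1.34, -1.32]
```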
Each input is weighted, summed, and passed through ReLU.
z = Σ(wᵢ·xᵢ) + b → a = max(0, z)
Grey neurons = units zeroed by ReLU
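The layer-1 computation above can be sketched in NumPy. The weights here are random placeholders (the app's trained weights are not shown), so only the shapes and the ReLU behavior carry over:

```python
import numpy as np

rng = np.random.default_rng(1)

x = np.array([-0.90, 1.03, -1.34, -1.32])  # normalized input features

# Hypothetical layer-1 parameters (4 inputs -> 10 hidden units).
W1 = rng.normal(scale=0.5, size=(4, 10))
b1 = np.zeros(10)

z1 = x @ W1 + b1        # z = sum(w_i * x_i) + b for each unit
a1 = np.maximum(0, z1)  # a = max(0, z); zeroed units show as grey

print((a1 == 0).sum(), "of 10 units zeroed by ReLU")
```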
The same weighted-sum + ReLU process, applied to layer-1 outputs.
z = Σ(wᵢ·aᵢ) + b → a = max(0, z)
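Layer 2 repeats the same operation on the previous layer's activations. A self-contained sketch with hypothetical layer-1 outputs and random placeholder weights:

```python
import numpy as np

rng = np.random.default_rng(2)

a1 = np.maximum(0, rng.normal(size=10))  # hypothetical layer-1 activations

# Hypothetical layer-2 parameters (10 -> 8).
W2 = rng.normal(scale=0.5, size=(10, 8))
b2 = np.zeros(8)

z2 = a1 @ W2 + b2        # same weighted sum, now over layer-1 outputs a_i
a2 = np.maximum(0, z2)   # ReLU again

print(a2.shape)  # (8,)
```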
Raw scores converted to class probabilities via Softmax.
softmax(zᵢ) = e^zᵢ / Σⱼ e^zⱼ
Highest probability = predicted class
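The softmax step above can be sketched as follows. The raw scores are hypothetical; subtracting the maximum before exponentiating is a standard numerical-stability trick and does not change the result:

```python
import numpy as np

z3 = np.array([2.1, -0.3, 0.8])  # hypothetical raw class scores (logits)

# Numerically stable softmax: subtract the max before exponentiating.
exp = np.exp(z3 - z3.max())
probs = exp / exp.sum()

print(np.round(probs, 3), "-> predicted class", probs.argmax())
```

The probabilities sum to 1, and the class with the highest probability is the prediction.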
How it works — per neuron
1. ReLU checks the weighted sum (z) already computed in the forward pass:
if z > 0 → neuron remains active (passes z forward)
if z ≤ 0 → neuron becomes inactive (output is exactly 0)
Result in this network
Active (z > 0): outputs keep their forward-pass values
Inactive (z ≤ 0): outputs become 0
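The per-neuron rule above reduces to an element-wise threshold. A minimal demo with made-up pre-activations:

```python
import numpy as np

z = np.array([1.7, -0.4, 0.0, 2.3, -2.1])  # hypothetical pre-activations

# z > 0: pass z forward unchanged; z <= 0: output is exactly 0.
a = np.where(z > 0, z, 0.0)

print("active:", z > 0)
print("outputs:", a)
```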
Important Note
The activation is applied directly to the current neuron values shown after backpropagation: ReLU switches off neurons with z ≤ 0 by setting their output to 0, and leaves positive values unchanged.