Activation Functions & Optimization

Aim

To study and compare activation functions and optimization algorithms by training the same MLP on the Fashion-MNIST dataset with ReLU, Sigmoid, and Tanh activations under SGD and Adam optimizers, and analysing their impact on training dynamics through overlaid loss/accuracy curves and gradient-flow visualizations.
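
The sketch below illustrates the experimental setup described in the aim: one fixed MLP architecture trained on Fashion-MNIST once for each activation/optimizer pairing, recording per-epoch training loss for the overlaid curves. It assumes PyTorch and torchvision, which the aim does not specify; the layer sizes, learning rates, batch size, and epoch count are illustrative choices, and accuracy tracking and gradient-flow logging are omitted for brevity.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def make_mlp(activation_cls):
    # Same architecture for every run; only the activation changes.
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 256), activation_cls(),
        nn.Linear(256, 128), activation_cls(),
        nn.Linear(128, 10),
    )

def train(model, optimizer, loader, epochs=5):
    criterion = nn.CrossEntropyLoss()
    losses = []  # mean training loss per epoch, for the overlaid curves
    for _ in range(epochs):
        total, count = 0.0, 0
        for x, y in loader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
            total += loss.item() * x.size(0)
            count += x.size(0)
        losses.append(total / count)
    return losses

if __name__ == "__main__":
    data = datasets.FashionMNIST("data", train=True, download=True,
                                 transform=transforms.ToTensor())
    loader = DataLoader(data, batch_size=128, shuffle=True)

    # Every activation is paired with every optimizer (6 runs in total).
    activations = {"ReLU": nn.ReLU, "Sigmoid": nn.Sigmoid, "Tanh": nn.Tanh}
    optimizers = {
        "SGD": lambda params: torch.optim.SGD(params, lr=0.01),
        "Adam": lambda params: torch.optim.Adam(params, lr=1e-3),
    }

    results = {}
    for act_name, act_cls in activations.items():
        for opt_name, make_opt in optimizers.items():
            model = make_mlp(act_cls)
            results[(act_name, opt_name)] = train(
                model, make_opt(model.parameters()), loader)
            print(act_name, opt_name, results[(act_name, opt_name)])

The resulting dictionary keyed by (activation, optimizer) can then be plotted on a single set of axes to produce the overlaid loss curves the aim calls for.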