🧠 How AI Really Works

An Interactive Journey Inside Neural Networks

What is a Neural Network?

Imagine your brain is made of billions of tiny decision-makers called neurons. Each neuron:

  • 🎯 Takes in information (inputs)
  • 🤔 Thinks about it (processing)
  • 💡 Makes a decision (output)

An AI neural network works the same way! It's like a simplified brain made of math. Let's see it in action!

A neural network is a function approximator that transforms inputs through layers of neurons:

f(x) = σ(W₃ · σ(W₂ · σ(W₁ · x + b₁) + b₂) + b₃)

Where:

  • x = input vector
  • Wᵢ = weight matrix for layer i
  • bᵢ = bias vector for layer i
  • σ = activation function (e.g., ReLU, sigmoid)
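
Here's that nested formula as a minimal Python sketch (using NumPy; the layer sizes, random weights, and the choice of sigmoid for σ are made up purely for illustration):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Toy shapes: 2 inputs -> 3 hidden -> 3 hidden -> 1 output
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 3)), np.zeros(3)
W3, b3 = rng.normal(size=(1, 3)), np.zeros(1)

def f(x):
    # f(x) = σ(W₃ · σ(W₂ · σ(W₁ · x + b₁) + b₂) + b₃)
    return sigmoid(W3 @ sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2) + b3)

print(f(np.array([0.0, 1.0])))  # untrained weights, so the output is arbitrary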

🎮 Live XOR Training Demo

Watch an AI learn the XOR problem in real time! XOR outputs 1 when its inputs are different and 0 when they're the same.

[Interactive demo: live readouts of the current epoch, loss, and accuracy as the network trains with learning rate 0.1.]
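
The demo's training data is just the XOR truth table, small enough to write out by hand (a sketch in NumPy):

import numpy as np

# Inputs: all four (a, b) combinations; target is 1 exactly when a ≠ b
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

for inputs, target in zip(X, y):
    print(inputs, "->", target)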

How Does Learning Work?

🎯 Forward Pass: Making Predictions

The network makes a prediction by passing data forward through each layer (a one-neuron sketch follows this list):

  1. Input: Feed in the data (like (0, 1) for XOR)
  2. Multiply & Add: Each connection has a "strength" (weight)
  3. Activate: Decide if the neuron should "fire"
  4. Output: Get the final prediction
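
Here's one neuron doing those steps, with a made-up weight vector and bias and ReLU as the activation:

import numpy as np

def relu(z):
    return max(0.0, z)

x = np.array([0.0, 1.0])    # step 1: the XOR input (0, 1)
w = np.array([0.5, -0.3])   # step 2: one "strength" per connection (made up)
b = 0.1                     # bias

z = w @ x + b               # step 2: multiply & add  -> -0.2
a = relu(z)                 # step 3: activate        -> 0.0 (blocked, since z < 0)
print(z, a)                 # step 4: this neuron's output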

📉 Backward Pass: Learning from Mistakes

When the network is wrong, it learns by adjusting its connections (a one-weight sketch follows this list):

  1. Calculate Error: How wrong was the prediction?
  2. Blame Game: Which connections caused the error?
  3. Adjust Weights: Make connections stronger or weaker
  4. Repeat: Try again with new weights!
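
Here's the whole cycle for a single weight of a single linear neuron (all numbers made up; the gradient -error · x comes from the squared-error loss defined below):

x, y = 1.0, 1.0   # input and target
w = 0.2           # current connection strength
alpha = 0.1       # learning rate

for step in range(3):
    y_hat = w * x              # prediction
    error = y - y_hat          # step 1: how wrong was it?
    grad = -error * x          # step 2: how much is this weight to blame?
    w -= alpha * grad          # step 3: adjust the weight
    print(step, y_hat, w)      # step 4: repeat; predictions creep toward y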

Forward Propagation

For each layer l:

z[l] = W[l] · a[l-1] + b[l]
a[l] = σ(z[l])

Where a[0] = x (input) and a[L] = ŷ (output)
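
These two equations translate almost line for line into a loop over layers (a sketch assuming sigmoid for every σ, with the weights and biases passed in as Python lists):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(x, Ws, bs):
    a = x                    # a[0] = x
    for W, b in zip(Ws, bs):
        z = W @ a + b        # z[l] = W[l] · a[l-1] + b[l]
        a = sigmoid(z)       # a[l] = σ(z[l])
    return a                 # a[L] = ŷ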

Backpropagation

Loss function (Mean Squared Error):

L = ½ Σ(y - ŷ)²

Gradient computation:

δ[L] = ∇ₐL ⊙ σ'(z[L])
δ[l] = (W[l+1]ᵀ · δ[l+1]) ⊙ σ'(z[l])

Weight update:

W[l] = W[l] - α · δ[l] · a[l-1]ᵀ
b[l] = b[l] - α · δ[l]
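
Putting the loss, the deltas, and the updates together, here's a minimal training loop for XOR (a sketch, not the demo's exact code: one hidden layer of 4 sigmoid neurons, learning rate 0.5, and 5000 epochs are all assumptions):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# XOR data, one example per column
X = np.array([[0, 0, 1, 1],
              [0, 1, 0, 1]], dtype=float)   # shape (2, 4)
Y = np.array([[0, 1, 1, 0]], dtype=float)   # shape (1, 4)

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(4, 2)), np.zeros((4, 1))  # 2 inputs -> 4 hidden
W2, b2 = rng.normal(size=(1, 4)), np.zeros((1, 1))  # 4 hidden -> 1 output
alpha = 0.5

for epoch in range(5000):
    # Forward propagation
    a1 = sigmoid(W1 @ X + b1)
    a2 = sigmoid(W2 @ a1 + b2)          # ŷ

    # Backpropagation: ∇ₐL = (ŷ - y), and σ'(z) = a(1 - a) for sigmoid
    d2 = (a2 - Y) * a2 * (1 - a2)       # δ[L]
    d1 = (W2.T @ d2) * a1 * (1 - a1)    # δ[l] = (W[l+1]ᵀ · δ[l+1]) ⊙ σ'(z[l])

    # Weight updates, averaged over the 4 examples
    W2 -= alpha * (d2 @ a1.T) / 4
    b2 -= alpha * d2.mean(axis=1, keepdims=True)
    W1 -= alpha * (d1 @ X.T) / 4
    b1 -= alpha * d1.mean(axis=1, keepdims=True)

predictions = sigmoid(W2 @ sigmoid(W1 @ X + b1) + b2)
print(np.round(predictions, 2))  # should approach [[0, 1, 1, 0]] if training converges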

Key Components Explained

🔗 Weights & Biases

Weights are like volume knobs - they control how much each input matters.

Biases are like thresholds - they decide when a neuron should activate.

⚡ Activation Functions

These decide if a neuron should "fire" or not:

  • ReLU: If positive, pass it on. If negative, block it!
  • Sigmoid: Squash everything between 0 and 1
  • Tanh: Squash everything between -1 and 1

🎯 Gradient Descent

Imagine you're blindfolded on a hill, trying to reach the bottom:

  1. Feel the slope around you (calculate gradient)
  2. Take a small step downhill (adjust weights)
  3. Repeat until you reach the bottom (minimum loss)

Activation Functions

ReLU:

f(x) = max(0, x)
f'(x) = {1 if x > 0, 0 if x ≤ 0}

Sigmoid:

σ(x) = 1 / (1 + e⁻ˣ)
σ'(x) = σ(x) · (1 - σ(x))

Tanh:

tanh(x) = (eˣ - e⁻ˣ) / (eˣ + e⁻ˣ)
tanh'(x) = 1 - tanh²(x)
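
All three pairs map directly onto NumPy (a sketch; the (x > 0) convention treats ReLU's gradient at exactly 0 as 0):

import numpy as np

def relu(x):
    return np.maximum(0, x)           # f(x) = max(0, x)

def relu_grad(x):
    return (x > 0).astype(float)      # 1 where x > 0, else 0

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1 - s)                # σ'(x) = σ(x) · (1 - σ(x))

def tanh_grad(x):
    return 1 - np.tanh(x) ** 2        # tanh itself is built into NumPy

x = np.array([-2.0, 0.0, 2.0])
print(relu(x), sigmoid(x), np.tanh(x))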

Gradient Descent Update Rule

θₜ₊₁ = θₜ - α · ∇θ L(θₜ)

Where:

  • θ = parameters (weights and biases)
  • α = learning rate
  • ∇θ L = gradient of loss with respect to parameters
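
Here's the update rule walking a single parameter down a simple bowl-shaped loss, L(θ) = (θ - 3)², whose gradient is 2(θ - 3) (the starting point and learning rate are made up):

theta = 0.0   # θ₀: start somewhere on the hill
alpha = 0.1   # α: step size

for t in range(25):
    grad = 2 * (theta - 3)         # ∇θ L(θₜ): feel the slope
    theta = theta - alpha * grad   # θₜ₊₁ = θₜ - α · ∇θ L(θₜ): step downhill

print(theta)  # ends up close to 3, the bottom of the bowl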