Foundations

How Networks Learn

Training, backpropagation, and finding the right weights through iterative improvement

From Multi-Layer Networks: You've seen how signals flow through a network. But how does the network learn the right weights? That's what training is about.

Learning = Reducing Errors

How does a network learn the right weights? By adjusting them based on errors. The network makes a prediction, compares it to the correct answer, and updates weights to reduce mistakes. Repeat this millions of times, and "intelligence" emerges.
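To make that loop concrete, here's a minimal sketch in Python: a "network" with exactly one weight learning to multiply by 3. The data, learning rate, and step count are all invented for illustration.

```python
# Predict -> compare -> adjust, repeated many times.
# All values (data, learning rate, steps) are illustrative.
data = [(1, 3), (2, 6), (4, 12)]   # (input, correct answer) pairs for y = 3x
w = 0.0                            # start with a wrong weight
lr = 0.02                          # learning rate: how big each adjustment is

for step in range(200):
    for x, target in data:
        prediction = w * x            # 1. make a prediction
        error = prediction - target   # 2. compare to the correct answer
        w -= lr * error * x           # 3. adjust the weight to shrink the error

print(w)  # ~3.0: hundreds of tiny adjustments found the right weight
```

Each individual update barely moves `w`; the repetition is what does the work.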

1. The Training Process

Watch how a network improves over time. The "loss" measures how wrong the predictions are - lower is better. Training is the process of reducing this loss.
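The demo doesn't say which loss it uses, but a common choice is mean squared error: average the squared differences between predictions and correct answers. A quick sketch with invented numbers:

```python
# Mean squared error: lower is better, 0.0 means perfect predictions.
predictions = [0.9, 0.2, 0.7]   # illustrative network outputs
targets     = [1.0, 0.0, 1.0]   # the correct answers

loss = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)
print(loss)  # 0.0467 (rounded)
```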

Predict the Outcome

We're about to train a neural network on a simple pattern. Over 50 training epochs, the network will adjust its weights to reduce errors.

What do you think will happen to the loss (error) over time?

  • Will it decrease steadily?
  • Stay flat?
  • Bounce around randomly?

Making a prediction helps you notice patterns more clearly.

Key insight: Training is iterative - thousands of small weight adjustments add up to intelligent behavior.
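The summary at the end credits backpropagation with working out each weight's share of the blame. Here's a hedged sketch of what that looks like for a tiny one-hidden-layer network trained for 50 epochs, matching the demo; the architecture, data, and learning rate are all invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny network: 1 input -> 4 tanh hidden units -> 1 output (sizes illustrative).
X = np.linspace(-1, 1, 20).reshape(-1, 1)
Y = X ** 2                                   # a simple pattern to learn

W1 = rng.normal(0.0, 1.0, (1, 4)); b1 = np.zeros(4)
W2 = rng.normal(0.0, 1.0, (4, 1)); b2 = np.zeros(1)
lr = 0.1

for epoch in range(50):
    # Forward pass: make predictions and measure the loss.
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    loss = np.mean((pred - Y) ** 2)

    # Backward pass (backpropagation): chain rule from output to input,
    # giving each weight's contribution to the error.
    d_pred = 2 * (pred - Y) / len(X)         # dLoss/dPred
    dW2 = h.T @ d_pred                       # blame for output weights
    db2 = d_pred.sum(axis=0)
    d_h = (d_pred @ W2.T) * (1 - h ** 2)     # error flowing back through tanh
    dW1 = X.T @ d_h                          # blame for hidden weights
    db1 = d_h.sum(axis=0)

    # Small adjustments in the direction that reduces the loss.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

    if epoch % 10 == 0:
        print(f"epoch {epoch:2d}  loss {loss:.4f}")
```

Typically the printed loss falls quickly at first and then levels off, with some wobble rather than a perfectly smooth line.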

2. The Punchline: Universal Approximation

Here's the magic: neural networks can learn to approximate ANY pattern given enough neurons and data. Watch the network learn different functions.

Neural Network Function Approximation

Watch a neural network learn to fit different functions

[Interactive demo: a plot compares the Target Function (sine wave shown) with the Network Prediction as training epochs advance; controls select the target function and the number of hidden neurons (8 here), and display the current epoch and loss.]
The Universal Approximation Theorem

Neural networks are universal function approximators. Given enough neurons, a network can approximate ANY continuous function (over a bounded input range) to arbitrary precision.

  • Sine wave: Smooth periodic function - networks learn this easily
  • Step function: Sharp discontinuity - technically outside the theorem (it isn't continuous), but networks still fit it closely with enough neurons
  • Gaussian: Bell curve - natural fit for neural networks

This is why neural nets are so powerful: they don't need you to specify the pattern - they discover it from data.

The Universal Approximation Theorem: even a single hidden layer, given enough neurons, suffices to approximate any continuous function. In that formal sense, neural networks are universal learners.
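To see the "enough neurons" clause in action, here's a hedged sketch that trains the same kind of one-hidden-layer network on a sine wave at a few different widths; all hyperparameters are invented, and exact numbers will vary with the random seed.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.linspace(-np.pi, np.pi, 100).reshape(-1, 1)
Y = np.sin(X)                                  # the target function

def fit(hidden, epochs=3000, lr=0.05):
    """Train a one-hidden-layer tanh network; return its final loss."""
    W1 = rng.normal(0.0, 1.0, (1, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0, (hidden, 1)) / hidden; b2 = np.zeros(1)
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)
        pred = h @ W2 + b2
        d_pred = 2 * (pred - Y) / len(X)
        d_h = (d_pred @ W2.T) * (1 - h ** 2)   # backprop through the hidden layer
        W2 -= lr * (h.T @ d_pred); b2 -= lr * d_pred.sum(axis=0)
        W1 -= lr * (X.T @ d_h);    b1 -= lr * d_h.sum(axis=0)
    final = np.tanh(X @ W1 + b1) @ W2 + b2
    return np.mean((final - Y) ** 2)

# More hidden neurons -> a closer fit, echoing the theorem.
for hidden in (1, 4, 16):
    print(f"{hidden:2d} hidden neurons -> loss {fit(hidden):.4f}")
```

One tanh neuron can only produce a monotonic curve, so it can't track the sine's up-and-down shape; widening the hidden layer is what buys the extra precision.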

What you learned

  • Training is iterative: make a prediction → measure the error → adjust weights → repeat
  • Backpropagation works out how much each weight contributed to the error
  • Neural networks can learn any continuous pattern - they're universal function approximators