Picture a page covered in dots of two colours, blue and orange. Your job is to draw one straight line that keeps the two colours apart. The tiniest neural network is a single decision-maker trying to do exactly that. It starts by drawing the line in a random, probably wrong place. Then it looks at the dots one at a time and asks: did I put this dot on the correct side? Every time the answer is no, it nudges the line a little toward fixing that mistake. One nudge barely helps, but do it hundreds of times and the line slides into the right spot. That is all learning is here: correcting mistakes, over and over. Place some dots of each colour in the simulator, press train, and watch the line settle.
Most people think a neural network understands what it looks at. In fact the simplest one just draws a line, checks which dots are on the wrong side, and nudges the line a little, hundreds of times, until it fits.
What's actually happening
The phrase neural network sounds like it must involve understanding, or a tiny brain that grasps what it is looking at. The simplest one does nothing of the sort. It is a single artificial neuron, and all it does is take its inputs, multiply each by a number called a weight, add them up, and fire if the total clears a threshold. Geometrically, that rule carves the space in two with a single straight line: dots on one side get one answer, dots on the other get the other. The whole machine is that line, and the weights are just the knobs that decide where the line sits and which way it tilts.
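To make the rule concrete, here is a minimal sketch of one neuron with two inputs. The function name `neuron_fires` and the particular weights and threshold are made up for illustration; any numbers would do, and each choice gives a different line.

```python
def neuron_fires(inputs, weights, threshold):
    """Multiply each input by its weight, add them up, and compare with the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return total > threshold


# Illustrative values (an assumption, not taken from the simulator).
# With these weights the dividing line is the diagonal where x equals y.
weights = [1.0, -1.0]
threshold = 0.0

print(neuron_fires([3.0, 1.0], weights, threshold))  # total is 3 - 1 = 2, so it fires
print(neuron_fires([1.0, 3.0], weights, threshold))  # total is 1 - 3 = -2, so it stays silent
```

Changing the weights tilts the line, and changing the threshold slides it sideways. Nothing else about the neuron can change.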
Learning is the unglamorous business of turning those knobs. The neuron starts with random weights, so its line lands somewhere useless, typically cutting straight through both colours of dots. Then it goes through the examples and, for every dot it places on the wrong side, it shifts the weights by a small step in the direction that would have got that dot right. This is the perceptron rule, the granddaddy of training rules, invented by Frank Rosenblatt in 1958. Each correction is tiny and a little selfish, fixing one dot while possibly disturbing others, but the corrections add up, and over many passes the line drifts toward an arrangement that puts the dots on their correct sides. There is a small guarantee hiding here: if a single straight line can separate the two groups at all, this nudging is mathematically certain to find one after a finite number of corrections.
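The loop described above can be sketched in a few lines. This is one common way to write the perceptron rule, not necessarily how the simulator does it. The assumptions: colours are coded as +1 and -1, the threshold is folded into an extra adjustable number `b` (a bias), and the step size, the seed, and the eight dots are invented for the example.

```python
import random


def train_perceptron(points, labels, step=0.1, max_passes=1000, seed=0):
    """Nudge the weights after every misplaced dot until a full pass has no mistakes."""
    rng = random.Random(seed)
    # Random starting weights: the first line lands somewhere arbitrary.
    w = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    b = rng.uniform(-1, 1)
    mistakes = 0
    for _ in range(max_passes):
        mistakes = 0
        for (x, y), label in zip(points, labels):
            predicted = 1 if w[0] * x + w[1] * y + b > 0 else -1
            if predicted != label:
                # Shift the weights a small step toward getting this dot right.
                w[0] += step * label * x
                w[1] += step * label * y
                b += step * label
                mistakes += 1
        if mistakes == 0:
            break  # every dot is on its correct side
    return w, b, mistakes


# Made-up dots: one colour in the lower left (-1), the other in the upper right (+1).
points = [(-2, -1), (-1, -2), (-2, -2), (-1, -1), (1, 1), (2, 1), (1, 2), (2, 2)]
labels = [-1, -1, -1, -1, 1, 1, 1, 1]

w, b, mistakes = train_perceptron(points, labels)
print("weights:", w, "bias:", b, "mistakes in last pass:", mistakes)
```

Because these two groups can be split by a straight line, the loop stops as soon as a pass finishes with zero mistakes. On data that no line can separate, it would keep nudging until `max_passes` runs out.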
And that guarantee comes with the catch that shaped the whole field. A single neuron can only ever draw a straight line, so if the two groups are tangled in a way no line can separate, it is helpless. The classic example is the XOR pattern: picture the four corners of a square, with one colour on one pair of diagonally opposite corners and the other colour on the other pair. No straight line can split them. Once that limitation became widely known at the end of the 1960s, it nearly killed neural network research. The escape was to stack neurons into layers, letting later neurons bend and combine the lines drawn by earlier ones into curved, intricate boundaries. Every giant modern network, the ones that recognise faces or write text, is built from millions of these same humble units, each still just nudging its weights to correct its mistakes.
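As a small illustration of stacking, here is a sketch of three neurons arranged in two layers that together solve XOR. The weights and thresholds are picked by hand for the example rather than learned, and the names `fires` and `xor_network` are made up.

```python
def fires(inputs, weights, threshold):
    """One neuron: weighted sum, then 1 if the total clears the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


def xor_network(x, y):
    # First layer: two neurons, each drawing its own straight line.
    either = fires([x, y], [1.0, 1.0], 0.5)  # fires if at least one input is 1
    both = fires([x, y], [1.0, 1.0], 1.5)    # fires only if both inputs are 1
    # Second layer: one neuron combines the two lines ("either, but not both").
    return fires([either, both], [1.0, -1.0], 0.5)


for x, y in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x, y), xor_network(x, y))
```

The two first-layer lines fence off a diagonal band between them, and the output neuron fires only for dots inside that band. No single neuron could draw that shape.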
Learning here is only correcting mistakes by nudging weights, and stacking these humble units is what builds every giant network.
1. Scatter two colours of coins or buttons on a table so a single straight line could mostly separate them, then lay a piece of string across as your first random guess.
2. Find a coin on the wrong side of the string and nudge the string a little so that coin ends up on the correct side. Repeat for the next wrongly-placed coin.
3. After a dozen nudges the string settles into a good dividing line. You have just run the same correct-the-mistakes loop a real neuron uses to learn.
Common questions
How does a single neuron learn?
It starts with random weights, then for every example it places on the wrong side it shifts the weights a small step toward getting that one right. This is the perceptron rule, and over many passes the corrections add up until the line drifts into place.
What can a single neuron not do?
A single neuron can only ever draw one straight line. If the two groups are tangled so no line can separate them, as in the XOR pattern where each colour sits on a pair of diagonally opposite corners, it is helpless.
How do modern networks get around that limit?
They stack neurons into layers, letting later neurons bend and combine the lines drawn by earlier ones into curved, intricate boundaries. Each unit still does nothing but a weighted sum and a nudge.