How AI works is often a mystery — that's a problem

boem@lemmy.world · 7 months ago

General_Effort@lemmy.world · 6 months ago

An Artificial Neural Network isn’t exactly an algorithm. There are algorithms to “run” ANNs, but the ANN itself is really a big bundle of equations.

An ANN has an input layer of neurons and an output layer. Between them are one or more hidden layers. Each neuron in one layer is connected to each neuron in the next layer. Let’s do without hidden layers for a start. Let’s say we are interested in handwriting. We take a little grayscale image of a letter (say, 16*16 pixels) and want to determine if it shows an upper case “A”.

Your input layer would have 16*16 = 256 neurons and your output layer just 1. Each input value is a single number representing how bright that pixel is. You take these 256 numbers and multiply each one by another number, which represents the strength of the connection between that input neuron and the single output neuron. Then you add them up, and the resulting value is a score for how likely it is that the image shows an “A”.
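To make the arithmetic concrete, here is a minimal sketch of that weighted sum in plain Python. The names `score_image` and `sigmoid` are made up for illustration, the image and weights are random rather than real, and squashing the score into a 0-to-1 range with a sigmoid is a common convention that I'm assuming here, not something the description above requires.

```python
import math
import random

def score_image(pixels, weights):
    """Weighted sum of pixel brightness values: one weight per input neuron."""
    if len(pixels) != len(weights):
        raise ValueError("need exactly one weight per pixel")
    return sum(p * w for p, w in zip(pixels, weights))

def sigmoid(x):
    """Squash a raw score into the range 0..1 (an assumed, common choice)."""
    return 1.0 / (1.0 + math.exp(-x))

random.seed(0)
# Made-up 16*16 grayscale image: 256 brightness values between 0 and 1.
pixels = [random.random() for _ in range(16 * 16)]
# Untrained weights, one per connection to the single output neuron.
weights = [random.uniform(-1, 1) for _ in range(16 * 16)]

raw = score_image(pixels, weights)
print("raw score:", raw)
print("squashed score:", sigmoid(raw))
```

With random weights the result is meaningless, of course. It only shows that "running" this tiny network is 256 multiplications and one sum.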

I think that wouldn’t work well (or at all) without a hidden layer but IDK.

The numbers representing the strength of the connections are the parameters of the model, aka the weights. In this extremely simple case, they can be interpreted easily. If a parameter is large and positive, then that pixel being bright makes it more likely that we have an “A”. If it’s negative, then a bright pixel there makes it less likely. Finding these numbers/parameters/weights is what training a model means.
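Here is a rough sketch of what "finding the weights" can look like, on a toy problem. Everything in it is an assumption for illustration: the 4-pixel "images", the labels, the function names `predict` and `train`, and the choice of plain gradient descent on a sigmoid output. Real training setups are far more elaborate.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(pixels, weights):
    """Weighted sum of the pixels, squashed into the range 0..1."""
    return sigmoid(sum(p * w for p, w in zip(pixels, weights)))

def train(examples, n_pixels, steps=2000, learning_rate=0.5):
    """Nudge each weight a little, over and over, to shrink the prediction error."""
    weights = [0.0] * n_pixels
    for _ in range(steps):
        for pixels, label in examples:
            error = predict(pixels, weights) - label
            for i, p in enumerate(pixels):
                weights[i] -= learning_rate * error * p
    return weights

# Toy data: 4-pixel "images", labelled 1 whenever pixel 0 is bright.
examples = [
    ([1.0, 0.0, 1.0, 0.0], 1),
    ([1.0, 1.0, 0.0, 0.0], 1),
    ([0.0, 1.0, 1.0, 0.0], 0),
    ([0.0, 0.0, 1.0, 1.0], 0),
]

weights = train(examples, n_pixels=4)
print(weights)
```

The weight for pixel 0 should come out clearly positive and the others negative, which is the easy kind of interpretation described above: a bright pixel 0 pushes the score up.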

When you add a hidden layer, things get murky. You have an intermediate result and don’t know what it represents.
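A small sketch of why it gets murky, again with made-up pieces: 8 hidden neurons, random weights, a helper called `layer`, and `tanh` as the squashing function are all my assumptions for the example.

```python
import math
import random

def layer(inputs, weight_rows):
    """One fully connected layer: each row of weights feeds one neuron."""
    return [
        math.tanh(sum(x * w for x, w in zip(inputs, row)))
        for row in weight_rows
    ]

random.seed(1)
n_inputs, n_hidden = 16 * 16, 8

# Made-up image and untrained random weights.
pixels = [random.random() for _ in range(n_inputs)]
hidden_weights = [
    [random.uniform(-0.1, 0.1) for _ in range(n_inputs)]
    for _ in range(n_hidden)
]
output_weights = [[random.uniform(-1, 1) for _ in range(n_hidden)]]

hidden = layer(pixels, hidden_weights)  # the intermediate result
output = layer(hidden, output_weights)  # the single output neuron

print("hidden values:", hidden)
print("output:", output[0])
```

The 8 hidden values are just numbers. Nothing in the network says what any one of them stands for, and each output weight now acts on one of those unnamed numbers instead of on a pixel.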

The impressive AI models take much more input, produce much more diverse output, and have many hidden layers. The small ones, which you can run on a gaming PC, have several billion parameters. The big ones, like ChatGPT, have several hundred billion. Each of these numbers is potentially involved in creating the output.