A Bite-Sized Guide to Neural Networks: Unraveling the Magic Sandwich Analogy
An image generated by playground.ai


Neural Networks

  • Deep Learning → ANNs (Artificial Neural Networks), also historically known as cybernetics or connectionism
  • Modelled after the way our brain works → Biological Neural Networks → Original and far more complex
  • ANNs were invented when trying to theorise how the brain works.
  • ANN == Imitation brains
  • Bunch of simple chunks of software, each able to perform very simple math.
  • Each chunk is called a "cell" or a "neuron"
  • The power of a neural network lies in how these cells are connected.
  • The most common ANNs we create have about as many neurons as a worm does
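
To make "each cell performs very simple math" concrete, here is a minimal Python sketch of a single artificial neuron; the example inputs, weights, and threshold are made-up values for illustration, not anything from a real network.

```python
# A single artificial neuron: multiply each input by a weight, add up the
# results, and pass the sum through a simple activation (a threshold here).
# Inputs, weights, and threshold below are made-up illustrative values.

def neuron(inputs, weights, threshold=0.0):
    """Return 1 if the weighted sum of inputs exceeds the threshold, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0

# That really is all the math one "cell" does; the power comes from wiring
# many of these together.
print(neuron([1, 0], [0.7, -0.3]))  # weighted sum 0.7 -> fires (1)
print(neuron([0, 1], [0.7, -0.3]))  # weighted sum -0.3 -> stays quiet (0)
```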


  • Unlike a human, the neural net is at least able to devote its entire one-worm-power brain to the task at hand (if it is not distracted by extraneous data). But how can we solve problems with a bunch of interconnected cells?
  • The most powerful neural networks have more neurons than a single honeybee (and they take months and tens of thousands of dollars to train)
  • ANNs might be able to approach the number of neurons in the human brain by around 2050. Does this mean AI will have human-level intelligence then? → Not even close
  • Human brain neurons are so complex that each human neuron is more like a complete many-layered neural network all by itself. So rather than being a neural network made of 86 billion neurons, the human brain is a neural network made of 86 billion neural networks.
  • Our brain has far more complexities than ANNs, including many we don’t fully understand yet.

The Magic Sandwich Hole


  • Assume a magic hole that produces a random sandwich every few seconds
  • Sandwiches are very, very random and we have to sort them
  • We will try to automate the job
  • We are building a neural network to look at each sandwich and decide whether it's good. Ignore how it recognizes the ingredients and how it picks up each sandwich.
  • Also, if a sandwich is not consumable, it throws it into the recycling chute


  • Now, we have a bunch of inputs → a single output
  • A simple black-box view of the algorithm would look something like the sketch below
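
As a rough sketch of that black box (the ingredient names and the "tasty"/"terrible" labels are my placeholders, not from the article), the interface is just a function from a list of ingredients to a single verdict; the rest of the article is about what goes inside it.

```python
# The black box seen from outside: many ingredient inputs, one verdict output.
# "tasty" / "terrible" are placeholder labels; the decision logic is exactly
# what the following sections build up.

def rate_sandwich(ingredients):
    """Take a list of ingredient names, return a single verdict."""
    return "terrible"  # placeholder for now -- every sandwich is rejected

print(rate_sandwich(["chicken", "cheese"]))  # -> "terrible" (for now)
```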


  • The overall expectation is something like this: for every sandwich the hole produces, the network's single verdict should match what a human judge would say, so good sandwiches get kept and inedible ones go down the recycling chute


  • Let’s look into the black box now
  • First, a simple way of doing it → give each ingredient a different weight → the good ones get 1 and the ones we want to avoid get 0, and the network simply adds up the weights of whatever is in the sandwich (a sketch of this one-layer version follows a few bullets below)


  • Mud and eggshells will give 0+0 = 0


  • Peanut-butter-and-marshmallow gives 1 + 1 = 2 → fluffernutter!


  • A simple one-layer neural network is not sophisticated enough to recognize that some ingredients, while delicious on their own, are not delicious in combination with certain others. So it is susceptible to something we'll call the big sandwich bug: a sandwich that contains mulch might still be rated as tasty if it contains enough good ingredients to cancel out the mulch (see the sketch below).
  • To get a better neural network, we’re going to add another layer
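
Before we add that layer, here is a minimal sketch of the one-layer scorer described above; the ingredient list and the "tasty if the score is above zero" cutoff are illustrative assumptions. It reproduces the 0 and 2 scores from the examples and then falls straight into the big sandwich bug.

```python
# One-layer scorer: good ingredients get weight 1, ingredients to avoid get 0,
# and the network just adds up the weights of whatever is in the sandwich.
# Ingredient lists and the "tasty if score > 0" cutoff are assumptions.

WEIGHTS = {
    "peanut butter": 1, "marshmallow": 1, "chicken": 1, "cheese": 1,  # good -> 1
    "mud": 0, "eggshells": 0, "mulch": 0,                             # avoid -> 0
}

def score(sandwich):
    """Weighted sum over the ingredients in the sandwich."""
    return sum(WEIGHTS.get(item, 0) for item in sandwich)

def rate(sandwich):
    return "tasty" if score(sandwich) > 0 else "terrible"

print(score(["mud", "eggshells"]))              # 0 + 0 = 0 -> terrible
print(score(["peanut butter", "marshmallow"]))  # 1 + 1 = 2 -> fluffernutter
# The big sandwich bug: mulch can never drag the score down, so enough good
# ingredients make a mulch sandwich look tasty.
print(rate(["mulch", "chicken", "cheese", "peanut butter"]))  # "tasty" -- oops
```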


  • The new layer is called a hidden layer, because the user only sees the inputs and outputs. This isn't deep learning yet (that would require more layers), but we are getting there.


  • Let's call the new cell the punisher. We will give it a huge negative weight and connect it to everything bad.


  • Deli Sandwich cell (for chicken-and-cheese-type sandwiches, because they are good) → we will give it a modest weight of 1. But if we get too excited and assign it a very high weight, we'll be in danger of making the punisher less powerful.


  • Now, adding marshmallow to the Deli Sandwich would make it less tasty, so we'll need other cells that specifically look for and punish incompatibilities.
  • Let's call this one the cluckerfluffer


  • Activation Function → without it, the cluckerfluffer would punish every sandwich that contains chicken or marshmallow on its own. To avoid that, we give it a threshold → here 15 works: individually, chicken and marshmallow each contribute only 10, but together they add up to 20, which crosses the threshold → boom! The activated cell punishes only the combinations of ingredients that exceed the threshold.
  • With all the cells connected in similar sophisticated configurations, we have a neural net that can sort sandwiches.
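
Here is a rough sketch of that hidden-layer configuration. All the specific numbers are illustrative assumptions in the spirit of the article: 10 per connected ingredient, 15 for the combination threshold, -100 for the "huge negative" punishing weights, and a modest +1 for the Deli Sandwich cell.

```python
# Two-layer version: hidden cells sit between the ingredients and the verdict.
# Every number below is an illustrative assumption.

def step(total, threshold):
    """Activation function: the cell fires only if its input exceeds the threshold."""
    return 1 if total > threshold else 0

# Hidden cells: (input weight per connected ingredient, threshold, output weight)
HIDDEN = {
    # The punisher connects to everything bad and fires on any of it.
    "punisher":       ({"mud": 10, "eggshells": 10, "mulch": 10},  5, -100),
    # The Deli Sandwich cell rewards chicken-and-cheese with a modest weight.
    "deli":           ({"chicken": 10, "cheese": 10},             15,    1),
    # The cluckerfluffer: chicken alone or marshmallow alone gives 10 (below 15),
    # but together they reach 20, cross the threshold, and get punished.
    "cluckerfluffer": ({"chicken": 10, "marshmallow": 10},        15, -100),
}

# Good ingredients also feed the output directly, with weight 1 each.
DIRECT = {"peanut butter": 1, "marshmallow": 1, "chicken": 1, "cheese": 1}

def rate(sandwich):
    total = sum(DIRECT.get(item, 0) for item in sandwich)
    for in_weights, threshold, out_weight in HIDDEN.values():
        cell_input = sum(in_weights.get(item, 0) for item in sandwich)
        total += step(cell_input, threshold) * out_weight
    return "tasty" if total > 0 else "terrible"

print(rate(["chicken", "cheese"]))                           # deli fires -> tasty
print(rate(["peanut butter", "marshmallow"]))                # fluffernutter -> tasty
print(rate(["chicken", "marshmallow"]))                      # cluckerfluffer -> terrible
print(rate(["mulch", "chicken", "cheese", "peanut butter"])) # punisher -> terrible
```

The threshold is doing the real work here: it keeps the cluckerfluffer quiet when chicken or marshmallow shows up alone and lets it fire only on the combination.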


The Training Process

  • The basic point of machine learning is that we don’t have to set up the neural network by hand.
  • As the neural net rates each sandwich, it needs to compare its ratings against those of a panel of cooperative sandwich judges. Note: never volunteer to test the early stages of a machine learning algorithm.
  • We will start the previous network from scratch, with random weights


  • With these random starting weights, it hates cheese, loves marshmallow, is rather fond of mud, and doesn't really care about eggshells.
  • Now our neural net has a chance to improve. From this one sandwich, it doesn't know what the problem is. But if it looks at a batch of ten sandwiches, it can discover that if it had, in general, given mud a lower weight, it would match the human judges a bit better (see the training sketch below).
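
A minimal sketch of this loop, assuming a made-up batch of human-judged sandwiches: the article doesn't name an update rule, so this uses a simple perceptron-style nudge as a stand-in (real networks adjust their weights with gradient descent and backpropagation).

```python
import random

# Training sketch: start from random weights, compare the net's verdicts with
# the human judges over a batch, and nudge the weights so they agree more often.
# The tiny judged batch below is made up for illustration.

INGREDIENTS = ["peanut butter", "marshmallow", "chicken", "cheese", "mud", "eggshells"]

JUDGED = [                                    # (ingredients, 1 = tasty, 0 = terrible)
    (["peanut butter", "marshmallow"], 1),
    (["chicken", "cheese"], 1),
    (["mud", "eggshells"], 0),
    (["mud", "cheese"], 0),
]

random.seed(0)
weights = {item: random.uniform(-1, 1) for item in INGREDIENTS}  # random start

def predict(sandwich):
    return 1 if sum(weights[item] for item in sandwich) > 0 else 0

for _ in range(20):                           # a few passes over the batch
    for sandwich, verdict in JUDGED:
        error = verdict - predict(sandwich)   # -1, 0, or +1
        for item in sandwich:                 # nudge only the ingredients present
            weights[item] += 0.1 * error      # e.g. mud drifts toward a lower weight

print({item: round(w, 2) for item, w in weights.items()})
```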


  • After thousands more iterations and tens of thousands of sandwiches, the human judges are very, very sick of this, but the neural network is doing a lot better.


  • The neural net, as we saw before, needs a more sophisticated structure with hidden layers to make accurate predictions.


  • Pitfall: Class Imbalance → only a handful of every thousand sandwiches are delicious. Rather than go through all the trouble of figuring out how to weight each ingredient, the neural net may realize it can achieve 99.9% accuracy by rating all sandwiches as terrible, no matter what.
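
A quick back-of-the-envelope check of why "call everything terrible" scores so well, using roughly one delicious sandwich per thousand:

```python
# Why "call everything terrible" looks great under class imbalance:
# with about one delicious sandwich per thousand, a model that always
# answers "terrible" is right 99.9% of the time -- and completely useless.

total = 1000
delicious = 1
always_terrible_correct = total - delicious

print(always_terrible_correct / total)  # 0.999 -> 99.9% accuracy
```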


  • To combat class imbalance, pre-filter the sandwiches so that the training data contains approximately equal proportions of delicious and awful ones. Even then, the neural net might not learn about ingredients that are usually to be avoided but are delicious in very specific circumstances. Marshmallow is a great example: if the net sees it very rarely, it may decide to reject anything that contains marshmallow.
  • Class-imbalance problems show up all the time in practical applications, usually when we ask AI to detect a rare event. Ex: predicting when someone will leave a company, detecting fraudulent logins, medical imaging, detecting interesting celestial events like a solar flare, etc. All of these suffer from not having enough data on the rare events (a small rebalancing sketch follows below).
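
A minimal sketch of the pre-filtering idea, using synthetic labels: keep every rare delicious example and sample an equal number of terrible ones so the training set comes out roughly 50/50.

```python
import random

# Pre-filtering sketch: keep all rare "delicious" examples and sample an equal
# number of "terrible" ones so the training set is balanced.  Labels are synthetic.

random.seed(0)
sandwiches = [("sandwich_%d" % i, "delicious" if i % 500 == 0 else "terrible")
              for i in range(10_000)]        # about 1 in 500 is delicious

delicious = [s for s in sandwiches if s[1] == "delicious"]
terrible  = [s for s in sandwiches if s[1] == "terrible"]

balanced = delicious + random.sample(terrible, len(delicious))
random.shuffle(balanced)

print(len(delicious), len(terrible), len(balanced))  # 20 9980 40
```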


