Week 3: The Anatomy of a Model: Input, Output, and Parameters. Breaking down what goes into a deep learning model, step-by-step.
Learning Objective
After this article, my hope is that you will have a good understanding of the building blocks of deep learning, and even understand a couple of networks that can already be of use to you in your workday (if your workday involves data science, that is ;).
Article Breakdown
How the Brain Works (Sort Of)
To understand how a deep learning model operates, let’s begin with a simplified analogy of how the human brain works. I’m not a neuroscientist and won’t claim to be one, but this analogy gives us a starting point. At its most basic level, the brain processes information through neurons and connections between them, called synapses. These connections can strengthen or weaken based on experience, a process known as synaptic plasticity (Definitely just googled that). This adaptability is what allows us to learn.
For example, imagine a child learning something visual. They see a big yellow box with black circles on the bottom, and their parent tells them it’s called a school bus. The visual input from their eyes is sent to the brain, where it activates pathways connected to neurons responsible for recognizing features like color, shape, and size. The brain strengthens the connection between the yellow box and the label “school bus.”
The next time the child sees a big yellow box, they exclaim, "Look, it's a school bus!" Their parent then explains it's actually a shipping container. The child's brain adjusts, refining the connections to include "wheels" as a critical distinguishing feature of a bus. Over time, the child becomes better at recognizing buses versus shipping containers by strengthening or weakening these neural pathways.
Visualizing the Bus Example
Let’s break this down visually. At first, the child’s brain processes the input through neurons specializing in specific tasks, such as detecting color, shape, size, and wheels. Each of these neurons sends signals based on what it perceives, and the brain uses this information to classify the object. Early on, the connections between “big,” “yellow,” and “boxy” neurons are strongly linked to the idea of a bus. As more details are learned (e.g., the importance of wheels), the connections become more nuanced.
Here’s a simplified visualization of this process:
At the beginning of the example we have two inputs: the left eye and the right eye. Both eyes send their information to the visual neurons. Let's imagine that each neuron is in charge of a specific visual task, like color, shape, size, and whether something has wheels. Each of those neurons gives an account of the information it is seeing, and together they report back whether what you are seeing is a bus or a shipping container.
During your first visual experience as a child seeing a bus, maybe the only things you noticed were its color and shape, so there was a strong connection between a big (size neuron) square (shape neuron) yellow (color neuron) thing and a bus. When you again saw a big square yellow thing, your pathways were set up to classify it as a bus. Upon learning that there were such things as wheels and shipping containers, more connections were made to the "has-wheels" neuron.
How neural networks work
Believe it or not, if that example made sense (and I am praying it did), then you already understand your first deep learning model, called ... drum roll please ... a Multi-Layer Perceptron (MLP).
Just like in the brain, artificial neurons in a neural network specialize in recognizing specific patterns in the data. The network strengthens or weakens connections (weights) during training, similar to how a child refines their understanding by learning from feedback.
Artificial neural networks, such as a multilayer perceptron (MLP), are inspired by these biological principles, though they are much simpler. Instead of biological neurons and synapses, we have:

- Artificial neurons (nodes): simple units that receive numbers, combine them, and pass the result along
- Weights: a number on each connection that plays the role of synapse strength, and can be strengthened or weakened during training
- Biases: a per-neuron offset that shifts how easily that neuron activates
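To make those pieces concrete, here is a minimal sketch (in NumPy, with made-up random weights) of how data flows through such a network: inputs pass through weighted connections into hidden neurons, and the neurons' combined activity produces an output score. The feature names and all the numbers are purely illustrative.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One hidden layer: inputs -> hidden activations -> output score."""
    hidden = np.maximum(0, x @ W1 + b1)  # ReLU: a neuron "fires" only if its signal is positive
    return hidden @ W2 + b2

# Toy example: 4 inputs (color, shape, size, wheels), 3 hidden neurons, 1 output
rng = np.random.default_rng(0)
x = np.array([1.0, 0.8, 0.9, 0.0])           # a big yellow boxy thing with no wheels
W1 = rng.normal(size=(4, 3)); b1 = np.zeros(3)
W2 = rng.normal(size=(3, 1)); b2 = np.zeros(1)
print(mlp_forward(x, W1, b1, W2, b2))        # a single "bus-ness" score
```

In a real network the weights would be learned from data rather than drawn at random; training is exactly the process of strengthening or weakening these connections.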
A Simplified MLP in Action
Let's look at the structure of a single-layer MLP, which mirrors our earlier brain analogy. In an MLP:

- An input layer receives the raw data (the signals from the left and right eye in our analogy)
- A hidden layer of artificial neurons combines those inputs through weighted connections
- An output layer produces the final answer (bus or shipping container)
Here’s a diagram of the MLP structure:
Why are they called "hidden neurons," you may ask? While in our contrived child example we assigned a specific task to each neuron, in practice we don't actually know what each neuron is learning or what its task is. We may hypothesize that they are learning features such as color, size, and shape, but their actual interpretation is "hidden" from us. Thus, we call the neurons "hidden neurons" and the column of them a "hidden layer."
In practice, we would call these single-hidden-layer models "shallow networks." So how do we get to the point of a "deep" network, and thus enter the realm of "deep learning"? Well, imagine that instead of one layer, we just start chaining them together. That may look something like this.
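Chaining layers is mechanically simple: the output of one layer becomes the input of the next. Here is a small sketch of that idea, again with random, untrained weights and arbitrary layer sizes chosen just for illustration.

```python
import numpy as np

def deep_forward(x, layers):
    """Pass the input through a chain of (weights, bias) layers."""
    for W, b in layers[:-1]:
        x = np.maximum(0, x @ W + b)   # ReLU between hidden layers
    W, b = layers[-1]
    return x @ W + b                   # plain linear output layer

rng = np.random.default_rng(1)
sizes = [4, 8, 8, 1]                   # 4 inputs, two hidden layers of 8, 1 output
layers = [(rng.normal(size=(m, n)), np.zeros(n))
          for m, n in zip(sizes, sizes[1:])]
y = deep_forward(np.ones(4), layers)
print(y.shape)  # (1,)
```

Adding depth is just adding entries to `sizes`; the loop doesn't change.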
How big can models get?
As the number of layers and connections increases, the number of parameters (the weights and biases, i.e., the lines connecting the neurons to each other) grows rapidly. For example, a tiny network with 4 inputs, one hidden layer of 5 neurons, and 1 output already has 31 parameters: 4 × 5 weights plus 5 biases into the hidden layer, and 5 × 1 weights plus 1 bias into the output. Widen or deepen the network and the count multiplies; modern models reach millions or even billions of parameters.
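The parameter count of a fully connected network is easy to compute by hand: for each pair of adjacent layers of sizes m and n, there are m × n weights plus n biases. A few lines of Python make the arithmetic explicit (the layer sizes here are arbitrary examples).

```python
def count_params(sizes):
    """Total weights and biases for a fully connected network.

    sizes lists the neurons per layer, e.g. [4, 5, 1] means
    4 inputs, a hidden layer of 5, and 1 output neuron.
    """
    return sum(m * n + n for m, n in zip(sizes, sizes[1:]))

print(count_params([4, 5, 1]))            # 31
print(count_params([100, 256, 256, 10]))  # 94218
```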
Business Application
Real Estate Investor
So how could you use this model in your day-to-day work? Let's say you are a real estate investor and want to predict the price of houses in a specific market. Instead of coming up with some complicated multi-parameter polynomial regression model in Excel, you could simply create an MLP that takes as inputs features such as the square footage, number of bedrooms and bathrooms, location, and year built. Each input would flow through the network, with hidden layers learning patterns like how location impacts price or how an extra bathroom might add value.
The output layer would then predict the estimated price of the house. Over time (just as the child became better with more examples), as you feed the model more data and adjust its parameters, it would get better at making accurate predictions. This could help you make smarter investments by identifying undervalued properties or forecasting market trends.
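As a sketch of what this could look like in code, here is a toy version using scikit-learn's MLPRegressor on synthetic housing data. The features, the pricing rule, and all the numbers are invented for illustration; a real project would use actual sales data and would scale the features before training.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic housing data: [square footage, bedrooms, bathrooms, year built]
rng = np.random.default_rng(42)
n = 500
X = np.column_stack([
    rng.uniform(800, 4000, n),     # square footage
    rng.integers(1, 6, n),         # bedrooms
    rng.integers(1, 4, n),         # bathrooms
    rng.integers(1950, 2024, n),   # year built
])
# A made-up pricing rule (in $1000s) for the model to approximate
y = 0.15 * X[:, 0] + 20 * X[:, 1] + 15 * X[:, 2] + 0.5 * (X[:, 3] - 1950)

model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=500, random_state=0)
model.fit(X, y)
print(model.predict([[2000, 3, 2, 2005]]))  # estimated price, in $1000s
```

The network's hidden layers play the role of the "pattern learners" described above; feeding it more (real) examples is what sharpens its predictions.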
Marketing Manager
Now let's look at another example: customer behavior prediction for an e-commerce business. Imagine you want to predict whether a customer will buy a product after visiting your website. Inputs to the model could include:

- Time spent on the site
- Number of pages viewed
- Items added to the cart
- Past purchase history
A neural network can process these features and output a probability score for whether the customer is likely to make a purchase. With this information, you could tailor your marketing efforts—such as sending personalized offers to high-probability customers or optimizing your site for those more likely to leave without buying.
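A sketch of that probability output, using hypothetical visitor features and untrained random weights: the only new ingredient compared with the earlier forward pass is a sigmoid on the output, which squashes the raw score into a number between 0 and 1.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def purchase_probability(x, W1, b1, W2, b2):
    """Hidden layer + sigmoid output: a 0-to-1 purchase probability."""
    hidden = np.maximum(0, x @ W1 + b1)
    return sigmoid(hidden @ W2 + b2)

# Hypothetical visitor: [minutes on site, pages viewed, items in cart, past purchases]
x = np.array([12.0, 8.0, 2.0, 1.0])
rng = np.random.default_rng(7)
W1 = rng.normal(scale=0.1, size=(4, 5)); b1 = np.zeros(5)
W2 = rng.normal(scale=0.1, size=(5, 1)); b2 = np.zeros(1)
p = purchase_probability(x, W1, b1, W2, b2)
print(p)  # somewhere between 0 and 1
```

With trained weights, thresholding this probability (say, above 0.7) could drive the targeted-offer decisions described above.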
By leveraging deep learning models like the ones we’ve discussed, businesses can gain valuable insights, improve decision-making, and stay competitive in their industries. Not to mention you would be able to claim the use of proprietary 'AI' and get your investors trembling in their boots.
Wrapping Up and What’s Next
We've covered a lot of ground in this article! From understanding how the brain works (sort of) to exploring the structure of a neural network, you've learned:

- How the brain strengthens and weakens connections as it learns
- What a Multi-Layer Perceptron is, and how it mirrors that process with neurons, weights, and biases
- Why hidden layers are called "hidden," and what makes a network "deep"
- How models like these can be applied to real business problems
But this is just the beginning. Next week, we'll take the concepts from this article and bring them to life. We'll dive into Python, the go-to programming language for data science, and start exploring how to:

- Set up a Python environment for data science
- Load and prepare data so a model can consume it
- Build and train a simple model of your own
Whether you’re a curious beginner or aspiring data scientist, next week is where you’ll get hands-on and start building something of your own. So stay tuned, and get ready to embark on your data science journey!
See you next week!