Teaching Machines to Read Our Scribbles: A Journey Through Machine Learning and Neural Networks

We've all been there: writing a quick note or scribbling down a number in a hurry. It's easy for us to read our own writing (well, most of the time!). But how does a machine, like a computer or a smartphone, make sense of it? Have you ever stopped to think about how machines manage to decipher our varied, and sometimes untidy, penmanship? Either way, you've already encountered the incredible world of handwritten digit recognition. Let's delve into the fascinating interplay of binary and multiclass classification that brings our numbers to life in the digital realm.

In this case study, we'll embark on a fascinating journey, starting with the basics. We'll explore how a neural network is trained to recognize just two handwritten digits: '0' and '1'. This task, known as binary classification, is where our adventure begins.

But we won't stop there.

Once we unravel the mystery behind recognizing these two digits, we'll scale up to a more complex and intriguing challenge: recognizing all 10 digits (0-9). This next step, known as multiclass classification, opens up a whole new world for understanding machine learning and the role of neural networks.

The Challenge: Utilizing neural network methods to recognize handwritten digits.

Data Insight:

  • We're working with grayscale images of handwritten digits, specifically the numbers 0 and 1. Each image is a 20x20 pixel grayscale snapshot. The intensity of the grayscale is represented using floating-point numbers.
  • To simplify our data handling, we've unrolled these 20x20 grids into 400-dimensional vectors. This means that our data matrix holds each image as a single row, culminating in a sizeable 1000x400 matrix. This data is a subset curated from the renowned MNIST handwritten digit dataset, which is a comprehensive collection of digits that has greatly contributed to many handwriting recognition tasks in the machine learning community. For those interested, the complete dataset can be explored further on Yann LeCun's website.
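As a minimal sketch of that unrolling step (the file names here are hypothetical; substitute however your copy of the subset is stored):

```python
import numpy as np

# Hypothetical file names: substitute the paths to your own MNIST subset.
images = np.load("digits_0_1.npy")   # shape (1000, 20, 20), grayscale floats
labels = np.load("labels_0_1.npy")   # shape (1000,), values 0 or 1

# Unroll each 20x20 grid into a 400-dimensional row vector.
X = images.reshape(images.shape[0], -1)
print(X.shape)  # (1000, 400)
```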

Let's dive into the visualization:

The image shows a visualization of handwritten digits, specifically the numbers 0 and 1.

We loaded the dataset containing images of these digits and their corresponding labels, then randomly selected 64 images, reshaped them to their original 20x20 pixel size, and arranged them in an 8x8 grid for display. The resulting visual shows the variety in how people write the numbers 0 and 1. It's a neat way to see how machine learning interacts with something as personal and unique as handwriting!
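Here's a sketch of that visualization, assuming X is the 1000x400 data matrix from the sketch above:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng()
fig, axes = plt.subplots(8, 8, figsize=(6, 6))

# Pick 64 rows at random, restore each to its 20x20 shape, and tile them.
for ax, idx in zip(axes.flat, rng.choice(X.shape[0], size=64, replace=False)):
    ax.imshow(X[idx].reshape(20, 20), cmap="gray")
    ax.axis("off")

plt.tight_layout()
plt.show()
```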

Why we need a neural network in this case:

For handwriting recognition, neural networks act as a complex pattern recognizer. The intricate and unique style of human handwriting means that traditional rule-based approaches might fail. Neural networks can model these complex patterns and make sense of them, recognizing various styles of writing and differentiating between different numbers.

A data scientist's role is essential in weaving these complex decisions together to create a model that can recognize handwritten digits accurately and effectively.

Here we will deploy and compare the performance of three prominent activation functions for the network: ReLU, Sigmoid, and Linear.


The code snippet defines a simple neural network architecture using TensorFlow Keras. This model has three layers:

  • A dense (fully connected) layer with 128 neurons and the provided activation function.
  • A second dense layer with 64 neurons and the provided activation function.
  • A final dense layer with a single neuron and the sigmoid activation function, which outputs the probability that the image is a 1 (rather than a 0).

The model is compiled using the Adam optimizer, binary cross-entropy loss (since this is a binary classification task), and it will track accuracy as a metric.

Finally, the model is trained on the training data for 50 epochs, evaluating its performance on the test data at the end of each epoch.
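Since the original snippet isn't reproduced here, the following is a sketch of the setup as described, with X_train, y_train, X_test, and y_test assumed to be the prepared splits:

```python
from tensorflow.keras import layers, models

def build_binary_model(activation):
    """Build the three-layer network described above for a given activation."""
    model = models.Sequential([
        layers.Input(shape=(400,)),                # one unrolled 20x20 image per row
        layers.Dense(128, activation=activation),  # first hidden layer
        layers.Dense(64, activation=activation),   # second hidden layer
        layers.Dense(1, activation="sigmoid"),     # probability that the digit is a 1
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Train one model per activation, tracking test performance after every epoch.
for activation in ["relu", "sigmoid", "linear"]:
    model = build_binary_model(activation)
    model.fit(X_train, y_train, epochs=50,
              validation_data=(X_test, y_test), verbose=0)
```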



These three activation functions describe the "reaction" a neuron gives to the information it receives.

  1. ReLU: If the neuron receives a positive signal, it reacts exactly as much as the signal. If it's negative, it doesn't react at all.
  2. Linear: The neuron's reaction is directly proportional to the signal it receives, whether positive or negative.
  3. Sigmoid: The neuron's reaction is more nuanced, ranging between 0 and 1, smoothly increasing as the signal gets stronger.
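In plain NumPy, those three reactions look like this:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)        # positive signals pass through; negatives become 0

def linear(z):
    return z                       # the reaction is exactly the incoming signal

def sigmoid(z):
    return 1 / (1 + np.exp(-z))    # smooth squash into the (0, 1) range

z = np.array([-2.0, 0.0, 2.0])
print(relu(z), linear(z), sigmoid(z))
```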

Now, why does the model struggle with Sigmoid when recognizing handwritten digits?

  • ReLU and Linear React More Freely: They can react strongly to positive signals, making them responsive and adaptive to patterns in handwriting.
  • Sigmoid Can Get Stuck: Its smooth, bounded reaction means its gradients shrink toward zero for strong signals (the vanishing gradient problem), so during learning it can get "stuck," reacting weakly even when it should react strongly.

So when it comes to recognizing digits in squiggly handwriting, ReLU and Linear are like detectives that can jump on strong clues, while Sigmoid is more cautious and might miss those clues, thus performing less accurately.

These activation functions are like tuning how our model thinks, and in this case, ReLU and Linear seem to think in a way better suited for recognizing handwriting.

Next stage: Multi-Layer Perceptron (MLP)

Let's elevate our game! We're constructing a mighty Multi-Layer Perceptron (MLP) to decode the intricate art of handwritten digits, ranging from 0 all the way to 9.

By designing this sophisticated neural architecture, we're essentially crafting a digital maestro, adept at distinguishing the nuanced curves and strokes of every single handwritten number. So, whether it's a hastily scribbled '3' or a meticulously crafted '9', our neural maestro is on the task, ready to recognize and classify!


Here our final activation function is softmax, the natural choice for multiclass classification: it converts the network's raw outputs into a probability distribution across the ten digit classes.

We're again using the Adam optimizer, this time paired with the categorical_crossentropy loss function, which measures how well the predicted probabilities match the actual labels. Our aim during training is to maximize accuracy!
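As a sketch of that setup (the hidden-layer sizes below are illustrative, since the original snippet isn't shown; the output layer, loss, and optimizer follow the description):

```python
from tensorflow.keras import layers, models

mlp = models.Sequential([
    layers.Input(shape=(400,)),
    layers.Dense(128, activation="relu"),     # illustrative hidden-layer sizes
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),   # one probability per digit 0-9
])
mlp.compile(optimizer="adam",
            loss="categorical_crossentropy",  # expects one-hot encoded labels
            metrics=["accuracy"])

# y_train / y_test are assumed one-hot encoded,
# e.g. via tf.keras.utils.to_categorical(labels, num_classes=10).
mlp.fit(X_train, y_train, epochs=120, validation_data=(X_test, y_test))
```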


After rigorous training for 120 epochs, our MLP model demonstrates robust performance: accuracy on both the training and validation datasets converges to approximately 95%. This showcases the model's capability to reliably recognize a wide array of handwriting styles and nuances.

Let's take a closer look at some images that even our meticulously trained model misclassified. To be fair, some of these handwritten digits are so intricate that they might stump even the keenest human eye. It's a gentle reminder that while our neural network is powerful, deciphering tricky handwriting remains a challenging task for any entity.
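One way to surface those misclassifications (again a sketch, assuming the trained mlp and one-hot test labels from above):

```python
import numpy as np
import matplotlib.pyplot as plt

pred = np.argmax(mlp.predict(X_test), axis=1)   # predicted digit per test image
true = np.argmax(y_test, axis=1)                # actual digit per test image
wrong = np.where(pred != true)[0]               # indices the model got wrong

fig, axes = plt.subplots(2, 4, figsize=(8, 4))
for ax, idx in zip(axes.flat, wrong[:8]):
    ax.imshow(X_test[idx].reshape(20, 20), cmap="gray")
    ax.set_title(f"pred {pred[idx]} / true {true[idx]}", fontsize=8)
    ax.axis("off")
plt.tight_layout()
plt.show()
```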


Truth be told, a few of these samples would leave many of us scratching our heads too. While our neural network has achieved a commendable feat, intricate handwriting remains an intriguing challenge, even in the age of AI.

In conclusion, the realm of neural networks and machine learning stands as a testament to the advancements we've made in technology. Yet, the nuances of human behavior, like our unique handwriting styles, remind us of the delightful intricacies that make us human. As data scientists, every misclassified image isn't just an error; it's an opportunity, a puzzle waiting to be solved. The journey in machine learning isn't just about perfection; it's about the chase, the learning, and the endless possibilities that lie ahead.

Connect with me on LinkedIn as we continue to traverse this captivating world of AI together. Tazkera Haque | LinkedIn




