Teaching Machines to Read Our Scribbles: A Journey Through Machine Learning and Neural Networks

We've all been there: writing a quick note or scribbling down a number in a hurry. It's easy for us to read our own writing (well, most of the time!). But how does a machine, like a computer or a smartphone, make sense of it? Have you ever stopped to think about how machines manage to decipher our varied, and sometimes untidy, penmanship? Either way, you've already encountered the incredible world of handwritten digit recognition. Let's delve into the fascinating interplay of binary and multiclass classification that brings our numbers to life in the digital realm.

In this case study, we'll embark on a fascinating journey, starting with the basics. We'll explore how a neural network is trained to recognize just two handwritten digits: '0' and '1'. This task, known as binary classification, is where our adventure begins.

But we won't stop there.

Once we unravel the mystery behind recognizing these two digits, we'll scale up to a more complex and intriguing challenge: recognizing all 10 digits (0-9). This next step, known as multiclass classification, opens up a whole new world for understanding machine learning and the role of neural networks.

The Challenge: Utilizing neural network methods to recognize handwritten digits.

Data Insight:

  • We're working with grayscale images of handwritten digits, specifically the numbers 0 and 1. Each image is a 20x20 pixel grayscale snapshot. The intensity of the grayscale is represented using floating-point numbers.
  • To simplify our data handling, we've unrolled these 20x20 grids into 400-dimensional vectors. This means that our data matrix holds each image as a single row, culminating in a sizeable 1000x400 matrix. This data is a subset curated from the renowned MNIST handwritten digit dataset, which is a comprehensive collection of digits that has greatly contributed to many handwriting recognition tasks in the machine learning community. For those interested, the complete dataset can be explored further on Yann LeCun's website.
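As a minimal sketch of that unrolling step (the file names here are hypothetical; substitute however your copy of the subset is stored):

```python
import numpy as np

# Hypothetical file names: substitute the paths to your own MNIST subset.
images = np.load("digits_0_1.npy")   # shape (1000, 20, 20), grayscale floats
labels = np.load("labels_0_1.npy")   # shape (1000,), values 0 or 1

# Unroll each 20x20 grid into a 400-dimensional row vector.
X = images.reshape(images.shape[0], -1)
print(X.shape)  # (1000, 400)
```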

Let's dive into the visualization:

The image shows a visualization of handwritten digits, specifically the numbers 0 and 1.

We loaded the dataset containing images of these digits and their corresponding labels, then randomly selected 64 images, reshaped them to their original 20x20 pixel size, and arranged them in an 8x8 grid for display. The resulting visual shows the variety in how people write the numbers 0 and 1. It's a neat way to see how machine learning interacts with something as personal and unique as handwriting!
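Here's a sketch of that visualization, assuming X is the 1000x400 data matrix from the sketch above:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng()
fig, axes = plt.subplots(8, 8, figsize=(6, 6))

# Pick 64 rows at random, restore each to its 20x20 shape, and tile them.
for ax, idx in zip(axes.flat, rng.choice(X.shape[0], size=64, replace=False)):
    ax.imshow(X[idx].reshape(20, 20), cmap="gray")
    ax.axis("off")

plt.tight_layout()
plt.show()
```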

Why we need a neural network in this case:

For handwriting recognition, neural networks act as a complex pattern recognizer. The intricate and unique style of human handwriting means that traditional rule-based approaches might fail. Neural networks can model these complex patterns and make sense of them, recognizing various styles of writing and differentiating between different numbers.

A data scientist's role is essential in weaving these complex decisions together to create a model that can recognize handwritten digits accurately and effectively.

Here we will deploy and compare the performance of three prominent activation functions for the network: ReLU, Sigmoid, and Linear.


The code snippet defines a simple neural network architecture using TensorFlow Keras. This model has three layers:

  • A dense (fully connected) layer with 128 neurons and the provided activation function.
  • A second dense layer with 64 neurons and the provided activation function.
  • A final dense layer with a single neuron and the sigmoid activation function, which outputs the probability that the image is a 1 (rather than a 0).

The model is compiled using the Adam optimizer, binary cross-entropy loss (since this is a binary classification task), and it will track accuracy as a metric.

Finally, the model is trained on the training data for 50 epochs, evaluating its performance on the test data at the end of each epoch.
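Since the original snippet isn't reproduced here, the following is a sketch of the setup as described, with X_train, y_train, X_test, and y_test assumed to be the prepared splits:

```python
from tensorflow.keras import layers, models

def build_binary_model(activation):
    """Build the three-layer network described above for a given activation."""
    model = models.Sequential([
        layers.Input(shape=(400,)),                # one unrolled 20x20 image per row
        layers.Dense(128, activation=activation),  # first hidden layer
        layers.Dense(64, activation=activation),   # second hidden layer
        layers.Dense(1, activation="sigmoid"),     # probability that the digit is a 1
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Train one model per activation, tracking test performance after every epoch.
for activation in ["relu", "sigmoid", "linear"]:
    model = build_binary_model(activation)
    model.fit(X_train, y_train, epochs=50,
              validation_data=(X_test, y_test), verbose=0)
```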



These three activation functions describe the "reaction" a neuron gives to the information it receives.

  1. ReLU: If the neuron receives a positive signal, it reacts exactly as much as the signal. If it's negative, it doesn't react at all.
  2. Linear: The neuron's reaction is directly proportional to the signal it receives, whether positive or negative.
  3. Sigmoid: The neuron's reaction is more nuanced, ranging between 0 and 1, smoothly increasing as the signal gets stronger.
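In plain NumPy, those three reactions look like this:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)        # positive signals pass through; negatives become 0

def linear(z):
    return z                       # the reaction is exactly the incoming signal

def sigmoid(z):
    return 1 / (1 + np.exp(-z))    # smooth squash into the (0, 1) range

z = np.array([-2.0, 0.0, 2.0])
print(relu(z), linear(z), sigmoid(z))
```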

Now, why does the model struggle with Sigmoid when recognizing handwritten digits?

  • ReLU and Linear React More Freely: They can react strongly to positive signals, making them responsive and adaptive to patterns in handwriting.
  • Sigmoid Can Get Stuck: Its smooth, bounded reaction means its gradients shrink toward zero for strong signals (the vanishing gradient problem), so during learning it can get "stuck," reacting weakly even when it should react strongly.

So when it comes to recognizing digits in squiggly handwriting, ReLU and Linear are like detectives that can jump on strong clues, while Sigmoid is more cautious and might miss those clues, thus performing less accurately.

These activation functions are like tuning how our model thinks, and in this case, ReLU and Linear seem to think in a way better suited for recognizing handwriting.

Next stage: Multi-Layer Perceptron (MLP)

Let's elevate our game! We're constructing a mighty Multi-Layer Perceptron (MLP) to decode the intricate art of handwritten digits, ranging from 0 all the way to 9.

By designing this sophisticated neural architecture, we're essentially crafting a digital maestro, adept at distinguishing the nuanced curves and strokes of every single handwritten number. So, whether it's a hastily scribbled '3' or a meticulously crafted '9', our neural maestro is on the task, ready to recognize and classify!


Here our final activation function is softmax, the natural choice for multiclass classification: it converts the network's raw outputs into a probability distribution across the ten digit classes.

We're again using the Adam optimizer, this time paired with the categorical_crossentropy loss function, which measures how well the predicted probabilities match the actual labels. Our aim during training is to maximize accuracy!
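As a sketch of that setup (the hidden-layer sizes below are illustrative, since the original snippet isn't shown; the output layer, loss, and optimizer follow the description):

```python
from tensorflow.keras import layers, models

mlp = models.Sequential([
    layers.Input(shape=(400,)),
    layers.Dense(128, activation="relu"),     # illustrative hidden-layer sizes
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),   # one probability per digit 0-9
])
mlp.compile(optimizer="adam",
            loss="categorical_crossentropy",  # expects one-hot encoded labels
            metrics=["accuracy"])

# y_train / y_test are assumed one-hot encoded,
# e.g. via tf.keras.utils.to_categorical(labels, num_classes=10).
mlp.fit(X_train, y_train, epochs=120, validation_data=(X_test, y_test))
```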


After rigorous training for 120 epochs, our MLP model demonstrates robust performance: accuracy on both the training and validation datasets converges to approximately 95%. This showcases the model's capability to reliably recognize a wide array of handwriting styles and nuances.

Let's take a closer look at some images that even our meticulously trained model misclassified. To be fair, some of these handwritten digits are so intricate that they might stump even the keenest human eye. It's a gentle reminder that while our neural network is powerful, deciphering tricky handwriting remains a challenging task for any entity.
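One way to surface those misclassifications (again a sketch, assuming the trained mlp and one-hot test labels from above):

```python
import numpy as np
import matplotlib.pyplot as plt

pred = np.argmax(mlp.predict(X_test), axis=1)   # predicted digit per test image
true = np.argmax(y_test, axis=1)                # actual digit per test image
wrong = np.where(pred != true)[0]               # indices the model got wrong

fig, axes = plt.subplots(2, 4, figsize=(8, 4))
for ax, idx in zip(axes.flat, wrong[:8]):
    ax.imshow(X_test[idx].reshape(20, 20), cmap="gray")
    ax.set_title(f"pred {pred[idx]} / true {true[idx]}", fontsize=8)
    ax.axis("off")
plt.tight_layout()
plt.show()
```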


Truth be told, a few of these samples would leave many of us scratching our heads too. While our neural network has achieved a commendable feat, intricate handwriting remains an intriguing challenge, even in the age of AI.

In conclusion, the realm of neural networks and machine learning stands as a testament to the advancements we've made in technology. Yet, the nuances of human behavior, like our unique handwriting styles, remind us of the delightful intricacies that make us human. As data scientists, every misclassified image isn't just an error; it's an opportunity, a puzzle waiting to be solved. The journey in machine learning isn't just about perfection; it's about the chase, the learning, and the endless possibilities that lie ahead.

Connect with me on LinkedIn as we continue to traverse this captivating world of AI together. Tazkera Haque | LinkedIn




