Creating a Convolutional Neural Network from Scratch
Convolution Neural Network Illustrated

Creating a Convolutional Neural Network from Scratch

Introduction

Convolution is a primitive in Computer Vision. CNN a.k.a Convolutional Neural Network is a type of Deep Learning Architecture designed to analyze and interpret image data. CNNs learn in the same way as us Humans.. We were born without knowing what a cat or a bird looks like. As we mature, we learn that certain shapes and colors correspond to certain elements that collectively correspond to an another element. Once we learn what paws and feathers look like, we’re better able to differentiate between a cat and a bird. This is how CNN learns patterns from images and then applies that knowledge for Image Detection and Classification

CNN extract features from images using a process called Convolution. It is particularly useful for tasks like image classification and object detection. CNN learns patterns within an image by sliding a filter across the pixel data to identify key features like edges, shapes, and textures.

A CNN consists of three layers:

a) an input layer

b) an output layer

c) a hidden layer that includes multiple convolutional layers.

Within the hidden layers are pooling layers, fully connected layers, and normalization layers.

The Input layer captures basic features such as edges, color, gradient orientation, and geometric shapes.

The pooling layer progressively reduces the spatial size of the representation for more efficient computation. It operates on each feature map independently.

A common approach used in pooling is max pooling, in which the maximum value of an array is captured, reducing the number of values needed for calculation. Stacking convolutional layers allows the input to be decomposed into its fundamental elements.

Normalization layers regularize the data to improve the performance and stability of neural networks. It makes the inputs of each layer more manageable by converting all inputs to a mean of zero and a variance of one.

Fully connected layers connects every neuron in one layer to all the neurons in another layer.


Real World Example

Now that we have a basic understanding of a Convolutional Neural Network, lets dive into a real world use case and understand its applications. In this example, I will use publicly available HAM1000 Dataset and create a CNN from scratch using Keras and Tensorflow in the backend. This CNN will help us classify images of Skin Cancer Cells in Humans. . You can find the code for this example in my Github Repo Here.


System Design and Flow Chart for CNN Model.


Step 1. Importing Essential Library


Import Statements


Step 2. Dataset Preprocessing and Cleansing - Here I will profile the HAM1000 Dataset and perform data quality checks on nulls and missing values. This is a necessary step.



Data Cleansing

Step 3. Exploratory Data Analysis


Cancer Cells ByType

Looking at the results, The cell type Melanecytic nevi is predominant in this dataset compared to other cell types


Distribution By Age

It seems that there are larger group of patients with age groups between 30 to 60

Distribution By Gender

It appears Male Gender has a higher distribution.


Step 4. Split Data for Training and Testing - Data Split in 80-20 ratio.


Data Split Before Normalization

Step 5. Apply Normalization - This is to ensure consistent model performance and prevent overfitting. Here we will introduce Bias. It also prevents data leakage as bias is introduced.

Normalization

Step 6. One Hot Encoding - We have 7 different classes of Skin Cancer. This step performs one hot encoding on the labels.

One Hot Encoding

Step 7. Splitting training and validation set - Now I chose 90% of this data to be used for Training our CNN and remaining 10% for Test and Validation


Training Validation Split

Step 8. Build our Model - This is where I define a CNN using Keras



Input Shape - This CNN expects each image to be 75 pixels High and 100 pixels wide and 3 represents channels RGB (Red , Blue, Green).

NumClasses represents the types of Cancer cells

The first layer in our CNN is the convolutional (Conv2D) layer. I chose 32 filters for the two conv2D layers and 64 filters for the two last ones. Each filter transforms a part of the image.CNN can isolate features that are useful everywhere from these transformed images.

The next important Layer is the pooling (MaxPool2D) layer. Combining these 2 layers, CNN is able to combine local features and learn more global features of the image.

Notice the Dropout. Dropout is a regularization technique , where a proportion of nodes in the layer are randomly ignored. This also prevents Overfitting and helps generalize.


Dropout Visualization

'relu' is the rectifier (activation function max(0,x). The rectifier activation function is used to add non linearity to the network.

The Flatten layer is use to convert the final feature maps into a one single 1D vector. This flattening step is needed so that we can make use of fully connected layers after some convolutional/maxpool layers. It combines all the found local features of the previous convolutional layers.

Step 9. Define an Optimizer

Now the CNN layers are added to the model, here I will setup an optimizer function. This function will iteratively improve parameters (filters kernel values, weights and bias of neurons) in order to minimize the loss.

Step 10. Data Augmentation to avoid Overfitting the Model

To avoid overfitting problem, we need to artificially expand HAM 10000 dataset.The idea is to alter the training data with small transformations.

Data Augmentation


Step 11. Model Training

Now with everything ready, I will train the model x_train, y_train. In this step I have chosen batch size of 10 and 50 epochs .


Step 12. Model Evaluation


Plotting Model Accuracy and Loss


Incorrect Classification

Looking at the Plot above, we see maximum incorrect predictions for Basal cell carcinoma which has code 3, then second most incorrectly classified type is Vascular lesions with code 5 followed by Melanocytic nevi code 0 where as Actinic keratoses code 4 has least incorrectly classified type.

We can also further fine-tune our model to achieve more accuracy but still this model is efficient in comparison to detection with human eyes.


Conclusion

We saw with a real world example, What is a convolutional neural network (CNN) and how to create CNN from scratch . How can we use CNN to solve an image classification problem in Healthcare.A convolutional neural network (CNN) approach can be used to implement a level 2 autonomous vehicle by mapping pixels from the camera input to the steering commands. The network automatically learns the maximum variable features from the camera input, hence requiring minimal human intervention. You can read more how can such a CNN be built from scratch that can classify CIFAR images. Dont forget to hit a Like if you liked this article.




Gyaneshwer Kumar

Data Engineer | Applied AI/ML

2 个月

Here's the link to the Dataset and the Paper https://paperswithcode.com/dataset/ham10000-1

要查看或添加评论,请登录

Gyaneshwer Kumar的更多文章

  • Fast Conformer Architecture, its Applications and rise of Conversational and Voice AI Stack

    Fast Conformer Architecture, its Applications and rise of Conversational and Voice AI Stack

    Introduction Raising a Toddler as I speak, is fun for me . Parenting a Toddler, second time and overtime, I started…

    2 条评论
  • Storage-Network-Compute Trends for AI Ready Enterprise

    Storage-Network-Compute Trends for AI Ready Enterprise

    Introduction Traditionally the Enterprise Tech Stack is comprised of Compute, Networking and Storage. Thats what the…

  • Accelerating ETL - CPU vs GPU Tradeoff

    Accelerating ETL - CPU vs GPU Tradeoff

    Introduction One of the CIO's I met at a conference, said, Data is milk fresher the better. It just resonates every…

    3 条评论
  • Reinforcement Learning and the AHA Moment

    Reinforcement Learning and the AHA Moment

    Introduction Reinforcement Learning is not a new topic but it has gained tremendous traction and momentum past couple…

    1 条评论
  • Explainable AI

    Explainable AI

    Introduction AI systems, often operate like black boxes: we send the inputs and receive outputs, but we don't see the…

  • LLM Security and Observability - Responsible AI

    LLM Security and Observability - Responsible AI

    Introduction Building Generative AI Applications and Agents is more like Software Engineering and quiet different from…

    2 条评论
  • Disaster Risk Monitoring System - Remote Sensing and Deep Learning

    Disaster Risk Monitoring System - Remote Sensing and Deep Learning

    Introduction I come from a state back in India where Floods is an yearly phenomenon taking hundreds of lives every…

    1 条评论
  • Cache Augmented Generation

    Cache Augmented Generation

    Introduction There's been a lot of buzz lately around Cache Augmented Generation. In a recent ArXiv Paper, researchers…

    1 条评论
  • Getting started with JAX

    Getting started with JAX

    Introduction JAX-ML is a cutting-edge, scientific computing framework, created by Google in 2018. It is available as a…

    1 条评论
  • Federated Learning and Differential Privacy

    Federated Learning and Differential Privacy

    Introduction Large Language Models are trained on publicly available datasets and work well in Trivial Generalist…

    2 条评论

社区洞察

其他会员也浏览了