Creating Your First Neural Network

Intro

In this article, we will build a simple neural network to outline the end-to-end process of creating one. We will use Keras/TensorFlow for the implementation. Additionally, we will leverage the popular MNIST dataset from Yann LeCun and Corinna Cortes, which is conveniently available through keras.datasets.

Load the dataset

The MNIST dataset has 60,000 28x28 grayscale training images of the 10 digits, along with a test set of 10,000 images.

import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data(path="mnist.npz")

print(x_train.shape) #Outputs (60000, 28, 28)
print(y_train.shape) #Outputs (60000,)
print(x_test.shape)  #Outputs (10000, 28, 28)
print(y_test.shape)  #Outputs (10000,)

You can view any of the loaded images by calling imshow on pyplot.

import matplotlib.pyplot as plt
plt.imshow(x_train[20000], cmap='binary_r')        

This function can visualize a 2D array as a grayscale or color image. Here we have a grayscale image of the digit 5.

Fig 1: Number 5 at index 20,000
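
As a quick sanity check, the labels in y_train are plain integers 0-9 and should agree with the figure:

print(y_train[20000]) #Outputs 5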


Normalize the data

Normalize the train and test data to the 0 - 1 range. Normalizing the data speeds up learning and leads to faster convergence.

x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)        
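
Note that tf.keras.utils.normalize applies L2 normalization along the given axis; since pixel values are non-negative, the results land in the 0 - 1 range. A simpler and more common alternative is plain min-max scaling of the raw 0-255 intensities; a minimal sketch, to be used instead of the normalize call above:

# Alternative: scale the raw 0-255 pixel intensities directly into [0, 1]
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0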

Define the Neural Network Model

First we Flatten the image from 2D (28, 28) to 1D (28*28 = 784).

After that we have one hidden layer with 128 neurons, using the 'relu' activation.

The final output layer has 10 neurons, one for each of the 10 digits. The 'softmax' activation converts these outputs into a probability distribution over the classes.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.optimizers import Adam

model = Sequential(
    [Flatten(input_shape=(28, 28)),
     Dense(units=128, activation="relu"),
     Dense(units=10, activation="softmax")]
)
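
You can verify the layer shapes and parameter counts with model.summary(). The hidden layer holds 28*28*128 + 128 = 100,480 parameters and the output layer 128*10 + 10 = 1,290.

model.summary()
#Flatten -> (None, 784), 0 params
#Dense   -> (None, 128), 100,480 params
#Dense   -> (None, 10),  1,290 params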

Compile the model

Below we use 'Adam' as the optimizer, which is a stochastic gradient descent method. The default learning rate of 0.001 is retained.

model.compile(optimizer=Adam(learning_rate=0.001), loss='sparse_categorical_crossentropy', metrics=['accuracy'])        
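
'sparse_categorical_crossentropy' is the right loss here because y_train holds plain integer labels (0-9). If you one-hot encoded the labels instead, the equivalent setup would use 'categorical_crossentropy'; a minimal sketch:

# Alternative: one-hot encode the labels and switch the loss
y_train_onehot = tf.keras.utils.to_categorical(y_train, num_classes=10)
model.compile(optimizer=Adam(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit would then be called with y=y_train_onehot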

Train the model

We train for 10 epochs, so the network makes 10 full passes over the training data.

Within each epoch, the loss is computed and the weights are updated once per batch of 10 samples, which works out to 6,000 updates per epoch for the 60,000 training images.

model.fit(x=x_train, y=y_train, epochs=10, batch_size=10, verbose=2)        
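
Before predicting, it is worth checking how the trained network generalizes to the held-out test set; model.evaluate returns the test loss and accuracy (the exact numbers will vary from run to run):

test_loss, test_acc = model.evaluate(x=x_test, y=y_test, verbose=0)
print(f"Test accuracy: {test_acc:.4f}")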

Predict the Output for Test Data

predictions = model.predict(x=x_test, batch_size=10, verbose=0)        

Check the Output

Let us look at the test data at index 5000.

plt.imshow(x_test[5000], cmap='binary_r')        

It is the digit 3.

Fig 2: Test data at index 5000


Now let us check our prediction at index 5000. It displays the output array with the values of the 10 output neurons. We can see the highest probability out of all 10 is for digit 3, at index 3, with a value of 9.9999994e-01.

predictions[5000]

array([5.7262279e-13, 4.6293011e-11, 1.5742700e-09, 9.9999994e-01,
       2.1653295e-13, 2.8630273e-08, 1.4112982e-13, 2.4477271e-09,
       3.2406788e-08, 1.2772884e-15], dtype=float32)        
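
Since the output layer uses softmax, each row of predictions is a probability distribution, so its 10 values sum to 1 (up to float32 rounding):

print(predictions[5000].sum()) #Outputs ~1.0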

The predictions output currently has a shape of (10000, 10).

Reshape Output to Max Probability

Let us select only the index of the maximum of the 10 probabilities, reducing the array to shape (10000,), with each entry holding the predicted digit.

import numpy as np
rounded_predictions = np.argmax(predictions, axis=-1)        

You can now print the value at index 5000 and you will see the output as the number 3.

rounded_predictions[5000]

#Output is 3        
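
With the predictions collapsed to class labels, the overall test accuracy is a one-line comparison against y_test (the exact figure depends on your training run):

accuracy = np.mean(rounded_predictions == y_test)
print(f"Accuracy: {accuracy:.4f}")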

Performance of the Network

We can display the confusion matrix to see the performance of the classification.

from sklearn.metrics import ConfusionMatrixDisplay

ConfusionMatrixDisplay.from_predictions(y_test, rounded_predictions, cmap="Blues")

Here is the output

Fig 3: Confusion Matrix

The main diagonal of the matrix shows the number of correct predictions for each class, while off-diagonal elements represent instances that were misclassified.

Looking at the diagonal, the neural network did quite well.
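
If the raw counts are hard to compare across classes, from_predictions in scikit-learn also accepts a normalize argument, so the diagonal shows per-class recall instead of counts:

ConfusionMatrixDisplay.from_predictions(y_test, rounded_predictions, cmap="Blues", normalize="true")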

Outro

In conclusion, we've taken a hands-on approach to building a simple neural network using Keras/TensorFlow and exploring its functionality. We delved into the process of creating a neural network and applied it to the well-known MNIST dataset. As you've seen, creating a neural network doesn't have to be overly complex, and Keras provides a user-friendly interface for such tasks. This article serves as a starting point, and there's much more to explore in the vast field of deep learning. Whether it's experimenting with different architectures, exploring advanced features, or applying neural networks to other datasets, the journey into the world of neural networks is both exciting and limitless.

I look forward to your inputs, suggestions or any corrections.
