Creating Your First Neural Network
Amit Juneja
Intro
In this article, we will build a simple neural network to outline the process from start to finish. We will use Keras/TensorFlow to implement it, and we will leverage the popular MNIST dataset from Yann LeCun and Corinna Cortes, which is conveniently available through keras.datasets.
Load the Dataset
The MNIST dataset contains a training set of 60,000 28x28 grayscale images of the 10 digits, along with a test set of 10,000 images.
import tensorflow as tf
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data(path="mnist.npz")
print(x_train.shape) #Outputs (60000, 28, 28)
print(x_test.shape) #Outputs (10000, 28, 28)
print(y_train.shape) #Outputs (60000,)
print(y_test.shape) #Outputs (10000,)
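As a quick sanity check, you can also peek at the first few labels. They are plain integers from 0 to 9, which matters later when we pick a loss function.
print(y_train[:10]) #Outputs integer labels, e.g. [5 0 4 1 9 2 1 3 1 4]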
You can view any of the loaded images by calling imshow on pyplot.
import matplotlib.pyplot as plt
plt.imshow(x_train[20000], cmap='binary_r')
plt.show()
This function can visualize a 2D array as a grayscale or color image. Here we have a grayscale image of the digit 5.
Normalize the Data
Normalize the training and test data so the values fall between 0 and 1. Normalizing the data speeds up learning and leads to faster convergence.
x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)
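Note that tf.keras.utils.normalize rescales each row of an image to unit L2 norm (which does keep the values between 0 and 1) rather than performing simple min-max scaling. If you prefer plain pixel scaling, a minimal alternative sketch, used instead of the two lines above, is:
# Scale raw pixel values from [0, 255] down to [0.0, 1.0]
x_train = x_train / 255.0
x_test = x_test / 255.0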
Define the Neural Network Model
First we Flatten the image from a 2D (28, 28) array to a 1D vector of 28*28 = 784 values.
After that we have one hidden layer with 128 neurons.
The final output layer has 10 neurons, one for each digit. The 'softmax' activation is used so the outputs can be read as probabilities.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.optimizers import Adam
model = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(units=128, activation="relu"),
    Dense(units=10, activation="softmax")
])
Compile the Model
Below we use 'Adam' as the optimizer, which is a stochastic gradient descent method. The default learning rate of 0.001 is retained. The loss is 'sparse_categorical_crossentropy' because our labels are plain integers (0-9) rather than one-hot vectors.
model.compile(optimizer=Adam(learning_rate=0.001), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
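If you had one-hot encoded the labels instead, you would use 'categorical_crossentropy'. A minimal sketch of that variant, assuming you convert the labels with keras.utils.to_categorical:
from tensorflow.keras.utils import to_categorical

# One-hot encode the integer labels, e.g. 3 -> [0,0,0,1,0,0,0,0,0,0]
y_train_onehot = to_categorical(y_train, num_classes=10)
model.compile(optimizer=Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
You would then pass y_train_onehot to model.fit instead of y_train.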
Train the Model
We run this for 10 epochs, so the network makes 10 full passes over the training set.
Within each epoch, the loss is computed and the weights are updated after every batch of 10 images.
model.fit(x=x_train, y=y_train, epochs=10, batch_size=10, verbose=2)
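Before predicting, it is worth checking how the trained model performs on data it has never seen. A minimal check using Keras' built-in evaluate:
# Returns the loss and accuracy on the held-out test set
test_loss, test_acc = model.evaluate(x=x_test, y=y_test, verbose=0)
print(f"Test accuracy: {test_acc:.4f}")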
Predict the Output for Test Data
predictions = model.predict(x=x_test, batch_size=10, verbose=0)
Check the Output
Let us look at the test data at index 5000.
plt.imshow(x_test[5000], cmap='binary_r')
plt.show()
It is the digit 3.
Now let us check our prediction at index 5000. It displays the output array with the values of the 10 output neurons. We can see the highest probability of all 10 is for digit 3, at index 3, with a value of 9.9999994e-01.
predictions[5000]
array([5.7262279e-13, 4.6293011e-11, 1.5742700e-09, 9.9999994e-01,
2.1653295e-13, 2.8630273e-08, 1.4112982e-13, 2.4477271e-09,
3.2406788e-08, 1.2772884e-15], dtype=float32)
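Because the output layer uses softmax, each row of predictions is a probability distribution. A quick sanity check (the sum may be slightly off 1 due to floating point):
import numpy as np

# Softmax outputs should sum to (approximately) 1
print(np.sum(predictions[5000])) # ~1.0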
The predictions array currently has a shape of (10000, 10).
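You can confirm this directly:
print(predictions.shape) #Outputs (10000, 10)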
Reshape Output to Max Probability
Let us select the index of the maximum value out of the 10 for each prediction, reducing the array to shape (10000,), with each entry showing just the predicted digit.
import numpy as np
rounded_predictions = np.argmax(predictions, axis=-1)
You can now print the value at index 5000, and you will see the output is the number 3.
rounded_predictions[5000]
#Output is 3
Performance of the Network
We can display the confusion matrix to see the performance of the classification.
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
import matplotlib.pyplot as plt
ConfusionMatrixDisplay.from_predictions(y_test, rounded_predictions, cmap="Blues")
plt.show()
Here is the output
The main diagonal of the matrix shows the number of correct predictions for each class, while off-diagonal elements represent misclassified instances.
If you look at the diagonal, the neural network did quite well.
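To turn the diagonal into numbers, you can compute per-class and overall accuracy directly from the raw confusion matrix. A minimal sketch using sklearn's confusion_matrix:
import numpy as np

cm = confusion_matrix(y_test, rounded_predictions)
# Diagonal = correct predictions; row sums = total true instances per class
per_class_accuracy = cm.diagonal() / cm.sum(axis=1)
print(per_class_accuracy) # one accuracy value per digit 0-9
print(np.trace(cm) / cm.sum()) # overall accuracy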
Outro
In conclusion, we've taken a hands-on approach to building a simple neural network using Keras/TensorFlow and applied it to the well-known MNIST dataset. As you've seen, creating a neural network doesn't have to be overly complex, and Keras provides a user-friendly interface for such tasks. This article serves as a starting point, and there's much more to explore in the vast field of deep learning. Whether it's experimenting with different architectures, exploring advanced features, or applying neural networks to other datasets, the journey into the world of neural networks is both exciting and limitless.
I look forward to your inputs, suggestions, or corrections.