Deep Learning

Neural Network Building Blocks: Programming View

#machinelearning #machinelearningengineer #machinelearningalgorithms #deeplearningai #deeplearning #artificialintelligence

While implementing deep learning concepts, we repeatedly rely on a few simple mathematical ideas such as tensors, tensor operations, differentiation, and gradient descent. In this post I share my understanding of these notions without getting overly technical. More precisely, rather than presenting my view in mathematical terms, I will focus on executable code, which I think gives an unambiguous description of a mathematical operation.

Let me start with a concrete example of a neural network that uses the Python library Keras to learn to classify handwritten digits. The problem is to classify grayscale images of handwritten digits (28 × 28 pixels) into 10 categories (0 through 9). I am going to use the MNIST dataset, a classic in the machine learning community. It is a set of 60,000 training images, plus 10,000 test images, assembled by the National Institute of Standards and Technology (the NIST in MNIST) in the 1980s.

In machine learning, a category in a classification problem is called a class. Data points are called samples. The class associated with a specific sample is called a label.

Let us go through the example step by step.

Step 1: Code sample to load the MNIST dataset

from tensorflow.keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

train_images and train_labels form the training set, the data that the model will learn from. The model will then be tested on the test set, test_images and test_labels. The images are encoded as NumPy arrays, and the labels are an array of digits, ranging from 0 to 9. The images and labels have a one-to-one correspondence.

Step 2: The Network Architecture

Code sample:

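from tensorflow import keras
from tensorflow.keras import layers

# Two fully connected (Dense) layers chained in sequence.
model = keras.Sequential([
    layers.Dense(512, activation="relu"),
    # The final layer outputs 10 probability scores, one per digit class.
    layers.Dense(10, activation="softmax")
])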

The core building block of neural networks is the layer. We can think of a layer as a filter for data: some data goes in, and it comes out in a more useful form. Specifically, layers extract representations out of the data fed into them (hopefully, representations that are more meaningful for the problem at hand). Most of deep learning consists of chaining together simple layers that implement a form of progressive data distillation. A deep learning model is like a sieve for data processing, made of a succession of increasingly refined data filters: the layers.
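To make the "filter" idea concrete, here is a minimal NumPy sketch of what a single Dense layer with a relu activation computes: relu(dot(input, W) + b). The helper name naive_dense_relu, the tiny shapes, and the hand-picked weights are my own assumptions for illustration; in Keras the weights start out random and are learned during training.

```python
import numpy as np

def naive_dense_relu(inputs, weights, bias):
    """Hypothetical helper: relu(inputs @ weights + bias) for a batch of samples."""
    pre_activation = inputs @ weights + bias
    # relu keeps positive values and replaces negative ones with 0.
    return np.maximum(pre_activation, 0.0)

# Two samples with three features each, mapped to two output units.
x = np.array([[1.0, 2.0, 3.0],
              [0.0, -1.0, 0.5]], dtype="float32")
w = np.array([[0.5, -1.0],
              [0.0, 1.0],
              [1.0, 0.0]], dtype="float32")
b = np.array([0.0, -2.0], dtype="float32")

dense_out = naive_dense_relu(x, w, b)
print(dense_out.shape)  # (2, 2)
print(dense_out)
```

The first layer of the model above does the same thing at a larger scale: it maps each input sample to 512 output values.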

To make the model ready for training, we need to pick three more things as part of the compilation step:

An optimizer: the mechanism through which the model updates itself based on the training data it sees, so as to improve its performance.

A loss function: how the model measures its performance on the training data, and thus how it steers itself in the right direction, which should minimize the difference between the expected and the actual results.

Metrics to monitor during training and testing: here, only accuracy (the fraction of images that are correctly classified).
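The loss and the metric can be sketched in plain Python. A sparse categorical crossentropy takes integer labels and, for each sample, computes the negative log of the probability the model assigned to the true class; accuracy is the fraction of samples whose most probable class matches the label. The function names and the toy probabilities below are made up for illustration, and a real implementation also has to guard against log(0).

```python
import math

def sparse_categorical_crossentropy(probabilities, labels):
    """Mean of -log(p[true class]) over the batch (illustrative sketch)."""
    losses = [-math.log(p[label]) for p, label in zip(probabilities, labels)]
    return sum(losses) / len(losses)

def accuracy(probabilities, labels):
    """Fraction of samples whose most probable class equals the label."""
    hits = sum(
        1 for p, label in zip(probabilities, labels)
        if max(range(len(p)), key=p.__getitem__) == label
    )
    return hits / len(labels)

# Toy batch: three samples, three classes, integer labels.
toy_probs = [[0.7, 0.2, 0.1],
             [0.1, 0.8, 0.1],
             [0.3, 0.4, 0.3]]
toy_labels = [0, 1, 2]

toy_loss = sparse_categorical_crossentropy(toy_probs, toy_labels)
toy_acc = accuracy(toy_probs, toy_labels)
print(f"loss: {toy_loss:.4f}, accuracy: {toy_acc:.4f}")
```

The third sample is misclassified (class 1 gets the highest probability, the label is 2), so it lowers the accuracy and contributes the largest term to the loss.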

Step 3: Compilation Step

model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

Before training, we’ll preprocess the data by reshaping it into the shape the model expects and scaling it so that all values are in the [0, 1] interval. Previously, our training images were stored in an array of shape (60000, 28, 28) of type uint8 with values in the [0, 255] interval. We’ll transform it into a float32 array of shape (60000, 28 * 28) with values between 0 and 1.

Step 4: Image data

train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255

test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype("float32") / 255
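As a quick sanity check of this preprocessing, the same two operations can be tried on a small stand-in array. The random batch of 4 images below is an assumption so that the snippet runs without downloading MNIST; passing -1 to reshape lets NumPy infer the flattened size instead of hard-coding 28 * 28.

```python
import numpy as np

# Stand-in for a few MNIST images: 4 grayscale images, 28 x 28, uint8 in [0, 255].
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Flatten each image into a vector, then scale the pixel values to [0, 1].
flat_images = fake_images.reshape((len(fake_images), -1)).astype("float32") / 255

print(flat_images.shape, flat_images.dtype)
print(flat_images.min() >= 0.0, flat_images.max() <= 1.0)
```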

We’re now ready to train the model, which in Keras is done via a call to the model’s fit() method: we fit the model to its training data.

Step 5: Fitting the Model

>>> model.fit(train_images, train_labels, epochs=5, batch_size=128)
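The arguments to fit() determine how many times the weights get updated. With batch_size=128 the model goes through the 60,000 training images in mini-batches of 128 samples and performs one gradient update per batch; epochs=5 repeats that pass five times. A small sketch of the arithmetic, assuming the last, smaller batch is still used (which is the Keras default as far as I know):

```python
import math

num_samples = 60000
batch_size = 128
epochs = 5

# The final batch holds the leftover samples, hence the ceiling.
batches_per_epoch = math.ceil(num_samples / batch_size)
total_updates = batches_per_epoch * epochs

print(batches_per_epoch)  # 469
print(total_updates)      # 2345
```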

Step 6: Prediction on Data


>>> test_digits = test_images[0:10]
>>> predictions = model.predict(test_digits)
>>> predictions[0]
array([1.0726176e-10, 1.6918376e-10, 6.1314843e-08, 8.4106023e-06,
       2.9967067e-11, 3.0331331e-09, 8.3651971e-14, 9.9999106e-01,
       2.6657624e-08, 3.8127661e-07], dtype=float32)
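Each entry at index i of this array is the probability that the first test digit belongs to class i. A short sketch, using the values printed above, to pick the most likely class:

```python
# Softmax output for the first test digit, copied from the printout above.
first_prediction = [1.0726176e-10, 1.6918376e-10, 6.1314843e-08, 8.4106023e-06,
                    2.9967067e-11, 3.0331331e-09, 8.3651971e-14, 9.9999106e-01,
                    2.6657624e-08, 3.8127661e-07]

# Index of the largest probability, i.e. the predicted digit.
predicted_class = max(range(len(first_prediction)), key=first_prediction.__getitem__)
print(predicted_class)                  # 7
print(round(sum(first_prediction), 4))  # 1.0
```

With the NumPy array returned by predict(), predictions[0].argmax() gives the same index.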


Step 7: Evaluating the Model

>>> test_loss, test_acc = model.evaluate(test_images, test_labels)


>>> print(f"test_acc: {test_acc}")

test_acc: 0.9785

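The test-set accuracy turns out to be 97.8%, which is quite a bit lower than the training-set accuracy (98.9%). This gap between training accuracy and test accuracy is an example of overfitting: a machine learning model tends to perform worse on new data than on the data it was trained on.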

To be continued.
