Demystifying Neural Networks: Implementing a Simple NN Using Plain Python Functions. TensorFlow/Keras vs. Python Code?
NN Demystified



Steps:

  • We will implement the NN using TensorFlow/Keras first.
  • After that, we will implement the same NN using plain Python code and math.


  1. A neural network consists of layers, each containing neurons that collectively work to solve a specific task. Training iteratively adjusts the weights until the loss converges to a minimum (ideally the global minimum).
  2. To find good weights and biases with gradient descent, the network computes gradients of the loss function with respect to each weight and bias in reverse order, from the output layer back to the input layer. Each gradient points in the direction of steepest ascent in the multidimensional weight space, so moving in the opposite direction decreases the loss most quickly. By iteratively adjusting weights and biases against these gradients, the network gradually minimizes the loss function. This process continues until convergence, where the weights and biases ideally reach values that minimize the overall loss across the training dataset. Through this systematic backtracking via gradient descent, the neural network refines its parameters to solve complex problems efficiently. The generic update rule is shown below.
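In symbols, every parameter is nudged against its own gradient. A generic form of the update (with learning rate $\eta$ and loss $L$) is:

$$w \leftarrow w - \eta \frac{\partial L}{\partial w}, \qquad b \leftarrow b - \eta \frac{\partial L}{\partial b}$$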

The problem we will be working on is binary classification: predicting whether a person will opt for insurance or not, based on their age and affordability.

Let's start...

  • Importing required packages

import numpy as np
import tensorflow as tf
from tensorflow import keras
import pandas as pd
from matplotlib import pyplot as plt
%matplotlib inline        

  • Loading dataset...

df = pd.read_csv("insurance_data.csv")
df.head()        

The dataset has three columns: age (10-90), affordibility (0/1), and bought_insurance (0/1). (Note that the affordability column is spelled 'affordibility' in the CSV, so the code below uses that spelling.)

  • Split the dataset:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(df[['age','affordibility']],df.bought_insurance,test_size=0.2, random_state=25)        

  • The age feature ranges from 10 to 90 while affordability is only 0/1, so let's scale the age feature to improve model performance.

X_train_scaled = X_train.copy()
X_train_scaled['age'] = X_train_scaled['age'] / 100

X_test_scaled = X_test.copy()
X_test_scaled['age'] = X_test_scaled['age'] / 100        

  • Network architecture: we will have a single neuron consisting of a weighted-sum function followed by a sigmoid activation, as written out in the formula below.
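In equation form, this single neuron computes:

$$\hat{y} = \sigma(w_1 \cdot \text{age} + w_2 \cdot \text{affordability} + b), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}$$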

  • Defining the NN and compiling...


model = keras.Sequential([
    keras.layers.Dense(1, input_shape=(2,), activation='sigmoid', kernel_initializer='ones', bias_initializer='zeros')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(X_train_scaled, y_train, epochs=5000)        

  • After training, let's evaluate the model, check its predictions, and then look at the final weights (kernel) and bias.

model.evaluate(X_test_scaled,y_test)
1/1 [==============================] - 0s 1ms/step - loss: 0.3550 - accuracy: 1.0000
[0.35497748851776123, 1.0]
# let's predict
model.predict(X_test_scaled)
array([[0.7054848 ],
       [0.35569546],
       [0.16827849],
       [0.47801173],
       [0.7260697 ],
       [0.8294984 ]], dtype=float32)
# let's compare predictions with actual values
y_test
2     1
10    0
21    0
11    0
14    1
9     1
Name: bought_insurance, dtype: int64
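# To turn the predicted probabilities into 0/1 class labels we can threshold at 0.5
# (a common convention; this line is an added illustration, not part of the original run)
(model.predict(X_test_scaled) > 0.5).astype(int).flatten()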

# Now get the values of the weights and bias from the model

coef, intercept = model.get_weights()
coef, intercept
(array([[5.060867 ],
        [1.4086502]], dtype=float32),
 array([-2.9137027], dtype=float32))        

  • Here the kernel values 5.060867 and 1.4086502 are the weights for age and affordability respectively, and -2.9137027 is the bias. In other words, w1 = 5.060867, w2 = 1.4086502, bias = -2.9137027. Plugging these into the neuron formula gives the decision function shown below.
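With the learned parameters substituted into the single-neuron formula, the model's prediction (on scaled age) is approximately:

$$\hat{y} = \sigma(5.06 \cdot \text{age}_{scaled} + 1.41 \cdot \text{affordability} - 2.91)$$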

Now let's mimic this using plain Python code and math.

  • Let's build the activation and loss functions. Here we will use the sigmoid activation and the log loss (binary cross-entropy).

  • The aggregated error for one epoch is computed with the log loss function, written out below.
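For reference, the two functions implemented in the next snippet are:

$$\sigma(z) = \frac{1}{1+e^{-z}}, \qquad \text{log loss} = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i \log(\hat{y}_i) + (1-y_i)\log(1-\hat{y}_i)\right]$$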

def sigmoid_numpy(X):
    return 1/(1+np.exp(-X))

def log_loss(y_true, y_predicted):
    # clip predictions away from exactly 0 and 1 so np.log never blows up
    epsilon = 1e-15
    y_predicted_new = [max(i, epsilon) for i in y_predicted]
    y_predicted_new = [min(i, 1-epsilon) for i in y_predicted_new]
    y_predicted_new = np.array(y_predicted_new)
    return -np.mean(y_true*np.log(y_predicted_new)+(1-y_true)*np.log(1-y_predicted_new))
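A quick sanity check of these helpers (the expected values below are worked out by hand, so treat them as approximate):

sigmoid_numpy(np.array([0]))                          # -> array([0.5])
log_loss(np.array([1, 0]), np.array([0.9, 0.1]))      # -> about 0.105, i.e. -log(0.9)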

  • Here we will use batch gradient descent, in which a single epoch processes the entire dataset. Now let's walk through the flow and then build the gradient descent function.

  • For each record we compute a predicted value and its error; doing this for all records and averaging the individual errors gives the aggregated loss for one epoch.
  • First we initialize the weights to 1 and the bias to 0. Then, for all age and affordability values, we compute the weighted sum followed by the sigmoid function to get the predicted values, and apply the log loss function to find the error. Next we derive new weights and bias from the existing ones using derivatives: we find the derivative of the error with respect to weight w1 (i.e., how much the error changes for a given change in w1), and do the same for w2 and the bias.

  • After the first epoch we have an updated set of weights and bias; from there we keep backtracking through gradient descent, reducing the error step by step and moving toward the global minimum.

  • The weights and bias are optimized using the derivative of the error with respect to each weight and the bias, combined with the learning rate. To learn more about derivatives you can visit mathisfun.

  • The derivative formulas used for calculating the new weights and bias are written out just before the code below.

  • Applying these formulas gives the next set of weights and bias, with which we proceed to the second epoch. This continues until the loss is reduced to a very small value.

  • After the new weights and bias are calculated, the second epoch applies them and calculates the error again.
  • The loop continues: in each epoch we apply the current weights and bias, calculate the weighted sum, apply the sigmoid function to get predictions, and compute the error with the log loss function; the per-record errors are aggregated by their mean, and a new set of weights and bias is calculated with the derivative formulas before moving on to the next epoch. This process continues until no epochs are left. The goal is to reach the global minimum, where the error (loss) is minimal, through this gradient descent approach. Finding the optimal number of epochs, however, is largely trial and error, which is where MLOps practices can help.

  • Eventually the loss stops decreasing, indicating that we have reached the global minimum.

  • Plotting loss against a parameter (say the bias) shows how we move from a high loss to a much lower one by repeatedly taking the derivative of the error with respect to that parameter and combining it with the learning rate and the current value. Because the loss curve is non-linear and its slope changes after every iteration, a single linear step cannot locate the minimum; recomputing the derivative at every step is what makes gradient descent work.
  • Now let's define our gradient descent method. The derivative formulas it uses are:
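For this single-neuron model with sigmoid activation and log loss, the gradients reduce to the standard logistic-regression form (which is what the code below computes):

$$\frac{\partial L}{\partial w_1} = \frac{1}{n}\sum_i \text{age}_i\,(\hat{y}_i - y_i), \qquad \frac{\partial L}{\partial w_2} = \frac{1}{n}\sum_i \text{affordability}_i\,(\hat{y}_i - y_i), \qquad \frac{\partial L}{\partial b} = \frac{1}{n}\sum_i (\hat{y}_i - y_i)$$

Each parameter is then updated as $w \leftarrow w - \eta\,\partial L/\partial w$ with learning rate $\eta = 0.5$.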

def gradient_descent(age, affordability, y_true, epochs, loss_threshold):
    # start from the same initial values the Keras model used: weights = 1, bias = 0
    w1 = w2 = 1
    bias = 0
    rate = 0.5  # learning rate
    n = len(age)
    for i in range(epochs):
        # forward pass: weighted sum followed by the sigmoid activation
        weighted_sum = w1 * age + w2 * affordability + bias
        y_predicted = sigmoid_numpy(weighted_sum)
        loss = log_loss(y_true, y_predicted)

        # gradients of the loss with respect to w1, w2 and the bias
        w1d = (1/n)*np.dot(np.transpose(age),(y_predicted-y_true))
        w2d = (1/n)*np.dot(np.transpose(affordability),(y_predicted-y_true))
        bias_d = np.mean(y_predicted-y_true)

        # update the parameters in the opposite direction of the gradients
        w1 = w1 - rate * w1d
        w2 = w2 - rate * w2d
        bias = bias - rate * bias_d

        print(f'Epoch:{i}, w1:{w1}, w2:{w2}, bias:{bias}, loss:{loss}')

        # stop early once the loss drops below the threshold
        if loss <= loss_threshold:
            break

    return w1, w2, bias

gradient_descent(X_train_scaled['age'], X_train_scaled['affordibility'], y_train, 1000, 0.4631)

  • After the loop finishes (it stops once the loss drops below the threshold), the printout shows the weights and bias we ended up with.

  • This shows that, in the end, we were able to arrive at roughly the same values of w1, w2, and bias as Keras did, using a plain Python implementation of gradient descent.

import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def prediction_function(age, affordibility):
    # coef and intercept are the weights learned by the Keras model above;
    # take the scalar values out of the 1-element arrays
    weighted_sum = coef[0][0]*age + coef[1][0]*affordibility + intercept[0]
    return sigmoid(weighted_sum)

prediction_function(.47, 1)

We can use the above two functions for prediction. For example, prediction_function(.47, 1) returns roughly 0.705, which matches the first value of model.predict(X_test_scaled) shown earlier.


Thank You

