How Tensorflow Calculates Gradients

Let's reveal the magic behind TensorFlow's backward pass! This is very important to know if you want to build complex deep learning modules with TF.

In deep learning models there are two phases:

  1. Forward Pass
  2. Backward Pass

In the forward pass you do a bunch of operations and obtain some predictions or scores.

For example, in a convolutional neural network we write functions for the following operations, stacked one on top of the other (a minimal sketch follows the list):

  • Input tensor
  • Convolution operation (sliding a K×K filter)
  • Applying a non-linearity (ReLU) to the neurons in the feature maps
  • Max pooling
  • Softmax to turn the logits into probabilities
  • Cross-entropy loss
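Here's what such a forward pass might look like in TF 1.x-style code. This is only a sketch: the shapes, variable names, and the 28×28 single-channel input are my own assumptions, not from the original.

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Assumed input: 28x28 grayscale images, 10 classes (hypothetical).
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
labels = tf.placeholder(tf.int64, [None])

# Convolution: slide a 5x5 filter over the input.
W = tf.Variable(tf.truncated_normal([5, 5, 1, 16], stddev=0.1))
conv = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="SAME")

# Non-linearity and max pooling on the feature maps.
relu = tf.nn.relu(conv)
pool = tf.nn.max_pool(relu, ksize=[1, 2, 2, 1],
                      strides=[1, 2, 2, 1], padding="SAME")

# Fully connected layer producing logits, then softmax + cross-entropy.
flat = tf.reshape(pool, [-1, 14 * 14 * 16])
W2 = tf.Variable(tf.truncated_normal([14 * 14 * 16, 10], stddev=0.1))
b2 = tf.Variable(tf.zeros([10]))
logits = tf.matmul(flat, W2) + b2
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels,
                                                   logits=logits))
```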

Then what? We need to go backward.


Now, if you are playing with a CNN written in plain Python/NumPy, as in the CS231n assignments, you need to write a backward API too ;).

Basically, you have to implement the chain rule,

making sure you know your differentiation :D :D

That is, for each of the operations mentioned above you have to implement its derivative.

Yes! Yes! I know, it sucks!
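For instance, here's what a hand-written forward/backward pair for ReLU might look like in NumPy. This is a sketch in the CS231n style; the function names and the cache convention are my own assumptions.

```python
import numpy as np

def relu_forward(x):
    out = np.maximum(0, x)
    cache = x  # remember the input; the backward pass needs it
    return out, cache

def relu_backward(dout, cache):
    x = cache
    # Chain rule: upstream gradient times the local gradient of ReLU,
    # which is 1 where x > 0 and 0 elsewhere.
    dx = dout * (x > 0)
    return dx
```

You would have to write a pair like this for every operation in the stack: convolution, pooling, softmax, the loss, all of them.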

But in TF we only have to write the forward-pass API. How cool is that?

What is the magic?

There's no magic; someone has already implemented the derivatives for you ;)

Let's explore how they do it!

When thinking about how a TF program executes, we are all familiar with the following:

  1. Graph Creation
  2. Session execution

Basically, the first one is for building the model and the second one is for feeding the data in and getting results. You can read more on them elsewhere.
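In code, the two phases look like this (a minimal sketch using the TF 1.x-style API; the tiny graph is my own example):

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# 1. Graph creation: nothing is computed yet, we only describe the model.
a = tf.constant(2.0)
b = tf.constant(3.0)
c = a * b

# 2. Session execution: the engine actually runs the graph.
with tf.Session() as sess:
    print(sess.run(c))  # 6.0
```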

Always remember: TF does each and every computation in its C++ engine.

Even a tiny multiplication is not executed in Python.

Python is just a wrapper.

The most important piece of the TensorFlow graph is the backward-pass magic, which we call

Auto Differentiation

There are two modes of automatic differentiation (reverse mode is sketched below the list):

  1. Reverse mode: the derivative of a single output w.r.t. all inputs.
  2. Forward mode: the derivative of all outputs w.r.t. a single input.
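Reverse mode is exactly what deep learning needs: one scalar loss, many parameters. A sketch with tf.gradients (the toy variables are my own assumptions):

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

w = tf.Variable(3.0)
b = tf.Variable(1.0)
x = tf.constant(2.0)
loss = (w * x + b) ** 2  # one scalar output

# Reverse mode: one call gives d(loss)/dw and d(loss)/db together.
grads = tf.gradients(loss, [w, b])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grads))  # [28.0, 14.0]
```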

The basic unit underlying both methods is the computational graph.

Every operation can be represented as a small computational graph like this.

I found a perfect introduction to computational graphs in Colah's blog:

Calculus on Computational Graphs: Backpropagation

So basically, for each operation we mention in the forward pass, TensorFlow creates its graph, connecting operations top to bottom.

Here's a simple TF computational graph visualized with TensorBoard.

The above graph corresponds to the simple equation

Output = dropout(sigmoid(Wx + b))
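If you want to reproduce such a graph yourself, a sketch like this would do (the shapes, names, and the log directory are my own assumptions):

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

x = tf.placeholder(tf.float32, [None, 784], name="x")
W = tf.Variable(tf.truncated_normal([784, 100], stddev=0.1), name="W")
b = tf.Variable(tf.zeros([100]), name="b")

# Output = dropout(sigmoid(Wx + b))
output = tf.nn.dropout(tf.sigmoid(tf.matmul(x, W) + b), rate=0.5)

# Write the graph so TensorBoard can visualize it.
tf.summary.FileWriter("./logs", tf.get_default_graph())
```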

Now this is simple! Even though we don't care about the backward pass, TF automatically creates derivatives for all the operations, top to bottom (from the loss function to the weights).

When we run a session, TF automatically calculates the gradients for all the differentiable operations in the graph and uses them in the chain rule.
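In practice you rarely touch the gradient machinery directly; an optimizer triggers it for you when it builds the training op. A self-contained sketch (the toy loss is my own example):

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

w = tf.Variable(5.0)
loss = (w - 3.0) ** 2  # a differentiable forward graph

# minimize() walks the graph backward from `loss`, attaches a gradient
# op for every differentiable forward op, and chains them together.
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(50):
        sess.run(train_op)
    print(sess.run(w))  # close to 3.0
```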

Basically, the TF C++ engine consists of the following two things (a sketch of this pairing follows the list):

  1. Efficient implementations for operations like convolution, max pooling, sigmoid, etc.
  2. Derivatives of those forward operations.
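The Python side exposes this op/derivative pairing too: tf.custom_gradient lets you register your own forward op together with its derivative, mirroring what the built-in kernels do. A sketch (`my_relu` is a hypothetical name of mine):

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

@tf.custom_gradient
def my_relu(x):
    out = tf.maximum(x, 0.0)
    def grad(dy):
        # Derivative of ReLU: pass the upstream gradient where x > 0.
        return dy * tf.cast(x > 0.0, tf.float32)
    return out, grad

x = tf.constant([-1.0, 2.0])
y = my_relu(x)
g = tf.gradients(y, x)[0]

with tf.Session() as sess:
    print(sess.run(g))  # [0., 1.]
```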

Finally, here's a nice graph I found in cs224 (Deep Learning for NLP).

So the forward pass consists of:

  1. Variables and placeholders (weights w, input x, bias b)
  2. Operations (non-linear operations like ReLU, softmax, cross-entropy loss, etc.)

In the following graph you can clearly see how TF automatically creates the backward pass:

  • Left: forward-pass graph (this is what we create with TF)
  • Right: backward-pass graph (TF creates this automatically)

You can also see the connection lines between the two graphs; they are generated automatically as well (reverse-mode auto differentiation).

They keep the training process running by transferring data between the two graphs while applying the chain rule.
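One way to peek at that auto-generated backward graph yourself: after building a training op, list the operations TF placed under the "gradients/" name scope (a sketch; the toy graph is my own):

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

x = tf.placeholder(tf.float32, [None, 4])
W = tf.Variable(tf.ones([4, 1]))
loss = tf.reduce_mean(tf.sigmoid(tf.matmul(x, W)))

tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# The backward-pass ops TF generated live under the "gradients/" scope.
for op in tf.get_default_graph().get_operations():
    if op.name.startswith("gradients/"):
        print(op.name)
```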

That's that! Flow with TensorFlow :)
