How TensorFlow Calculates Gradients
Let's reveal the magic behind the TF backward pass! This is very important to know if you want to build complex deep learning modules with TF.
In deep learning models there are two phases:
- Forward Pass
- Backward Pass
In the forward pass you do a bunch of operations and obtain some predictions or scores.
For example, in a convolutional neural net we write functions for the following, stacked on top of each other (see the sketch after this list):
- Input Tensor
- Convolution Operation (sliding a K×K filter)
- Applying a Non-Linearity (ReLU) to the neurons in the feature maps
- Max Pooling
- Softmax to turn the logits into class probabilities
- Cross-Entropy Error
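To make that stack concrete, here is a minimal sketch of such a forward pass using the TF 1.x-style graph API. The shapes, filter sizes, and variable names are illustrative assumptions, not a prescribed architecture:

```python
import tensorflow as tf

# Assumed toy setup: 28x28 grayscale images, 10 classes
x = tf.placeholder(tf.float32, [None, 28, 28, 1])   # input tensor
y = tf.placeholder(tf.float32, [None, 10])          # one-hot labels

W = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))  # 5x5 filters
b = tf.Variable(tf.zeros([32]))

conv = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME') + b   # convolution
relu = tf.nn.relu(conv)                                                # non-linearity
pool = tf.nn.max_pool(relu, ksize=[1, 2, 2, 1],
                      strides=[1, 2, 2, 1], padding='SAME')            # max pooling

flat = tf.reshape(pool, [-1, 14 * 14 * 32])
W_fc = tf.Variable(tf.truncated_normal([14 * 14 * 32, 10], stddev=0.1))
b_fc = tf.Variable(tf.zeros([10]))
logits = tf.matmul(flat, W_fc) + b_fc                                   # class scores

# softmax + cross-entropy loss in one op
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
```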
Then what? We need to go backward.
Now, if you are playing with a CNN written in plain Python/NumPy, as in the CS231n assignments, you need to write a backward API too ;).
Basically you have to implement the chain rule,
making sure you know your differentiation :D :D
That is, for each of the above-mentioned operations you have to implement its derivative.
Yes! Yes! I know, it sucks!
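For instance, a hand-written forward/backward pair for ReLU, in the NumPy style of those assignments (the function names here are just illustrative), looks roughly like this:

```python
import numpy as np

def relu_forward(x):
    out = np.maximum(0, x)
    cache = x               # keep the inputs needed for the backward pass
    return out, cache

def relu_backward(dout, cache):
    x = cache
    dx = dout * (x > 0)     # chain rule: upstream gradient * local gradient
    return dx
```

And you would have to do the same for every single operation in the stack.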
But in TF we only have to worry about writing the forward-pass API. How cool is that?
What is the magic?
There's no magic; someone has implemented all of it for you ;)
Let's explore how they do it!
When thinking about the execution of a TF program, we are all familiar with the following two phases:
- Building the model (the computational graph)
- Running a session to feed data in and get results out
You can read more on them elsewhere; a minimal sketch is below.
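A minimal sketch of those two phases, assuming the TF 1.x graph/session API:

```python
import tensorflow as tf

# Phase 1: build the graph (nothing is computed here)
a = tf.constant(3.0)
b = tf.constant(4.0)
c = a * b

# Phase 2: run a session to feed data in and get results out
with tf.Session() as sess:
    print(sess.run(c))   # 12.0, computed by the C++ engine
```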
Always remember: TF does each and every computation in its C++ engine.
Even a small multiplication is not executed in Python.
Python is just a wrapper.
The most important thing in the TensorFlow graph is the backward-pass magic, which we call
Auto Differentiation
There are two kinds of automatic differentiation:
- Reverse mode – derivative of a single output w.r.t. all inputs.
- Forward mode – derivatives of all outputs w.r.t. a single input.
The basic building block of both methods is the computational graph.
Here is a simple computational graph for a single operation.
I found a perfect introduction to computational graphs on Colah's blog:
Calculus on Computational Graphs: Backpropagation
So, for each operation we mention in the forward pass, TensorFlow creates its graph, connecting the operations from top to bottom.
Here's a simple TF computational graph visualized with TensorBoard.
The above graph corresponds to the simple equation
Output = dropout(sigmoid(Wx + b))
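A sketch of how that graph might be built and dumped for TensorBoard, assuming the TF 1.x API (the shapes, the keep_prob value, and the log directory are assumptions for illustration):

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784], name='x')
W = tf.Variable(tf.zeros([784, 10]), name='W')
b = tf.Variable(tf.zeros([10]), name='b')

# Output = dropout(sigmoid(Wx + b))
output = tf.nn.dropout(tf.sigmoid(tf.matmul(x, W) + b), keep_prob=0.5)

# Write the graph so TensorBoard can visualize it
writer = tf.summary.FileWriter('./logs', tf.get_default_graph())
writer.close()
```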
Now this is the simple part! Even though we never write the backward pass, TF automatically creates derivatives for all the operations, top to bottom (from the loss function back to the weights).
When we run a session, TF calculates gradients for all the differentiable operations in the graph and combines them using the chain rule.
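Here is a toy sketch of that behaviour, assuming the TF 1.x API: tf.gradients asks TF to add the gradient ops to the graph, and the session then evaluates them (the variables and values are made up for illustration):

```python
import tensorflow as tf

x = tf.placeholder(tf.float32)
w = tf.Variable(2.0)
loss = tf.square(w * x - 1.0)          # a toy loss

# TF walks the graph in reverse and builds the gradient ops for us
grad_w = tf.gradients(loss, w)[0]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grad_w, feed_dict={x: 3.0}))   # d(loss)/dw = 2*(2*3 - 1)*3 = 30.0
```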
Basically, the TF C++ engine consists of the following two things:
- Efficient implementations of operations like convolution, max pooling, sigmoid, etc.
- Derivatives for each of those forward operations (see the quick check below).
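As a quick sanity check of the second point, we can compare the gradient TF produces for sigmoid against the well-known analytic formula σ'(x) = σ(x)(1 − σ(x)). A minimal sketch under the TF 1.x API:

```python
import tensorflow as tf

x = tf.constant(1.0)
s = tf.sigmoid(x)

# TF's engine ships a gradient for sigmoid: ds/dx = s * (1 - s)
auto_grad = tf.gradients(s, x)[0]
manual_grad = s * (1.0 - s)

with tf.Session() as sess:
    print(sess.run([auto_grad, manual_grad]))   # both ≈ 0.1966
```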
Finally, here's a nice graph I found in CS224 – Deep Learning for NLP.
So the forward pass consists of:
- Variables and placeholders (weights w, input x, bias b)
- Operations (non-linear operations such as ReLU, Softmax, the Cross-Entropy Loss, etc.)
In the following graph you can clearly see how TF automatically creates the backward pass:
- Left – the forward-pass graph (this is what we create with TF)
- Right – the backward-pass graph (TF creates this automatically)
You can also see the connection lines between the two graphs. They are generated automatically as well (reverse-mode auto differentiation).
They keep the training process running by carrying the data needed when applying the chain rule.
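Putting it together, a single optimizer call is what triggers TF to build that entire right-hand (backward) graph. A minimal sketch, assuming the TF 1.x API and with made-up shapes and data:

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 2], name='x')    # input
y = tf.placeholder(tf.float32, [None, 1], name='y')    # target
w = tf.Variable(tf.zeros([2, 1]), name='w')            # weights
b = tf.Variable(tf.zeros([1]), name='b')               # bias

logits = tf.matmul(x, w) + b                            # forward-pass graph
loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits))

# This one call makes TF build the whole backward-pass graph:
# gradient ops for every node between the loss and the variables,
# plus the ops that apply those gradients to w and b.
train_op = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op, feed_dict={x: [[1.0, 2.0]], y: [[1.0]]})
```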