Math for ML - Using LaTeX & Python

Sanchit Tiwari

Associate Partner at McKinsey & Company | Senior Principal at QuantumBlack, AI by McKinsey

Published: January 14, 2019

Before you start learning or implementing any machine learning algorithm, mathematics is a basic requirement. This is especially true in deep learning, which involves a lot of matrix math, so it is important to understand the basics before building a model. I usually refer to my notes whenever I need to refresh the mathematics behind an algorithm, and I decided to start putting those notes on the web so that I can share them with the community and refer to them more easily myself. This post is the first of them: I use LaTeX for the mathematical notation and Python to implement the math concepts.

Before you read further, note that this post is most useful if you already have a good foundation in linear algebra, since its aim is to deepen the mathematical understanding that machine learning requires. We will start with vectors, to see how data is represented, and then move on to linear algebra, to see how to manipulate that data with linear transformations.

Let us start with vectors and matrices, which I would like you to think of as the fundamental data structures used throughout machine learning. Before we begin, let us first understand the shapes data can have. For example, we might have a single number representing the sales of a store in millions, a list of numbers representing various features of the store such as sales, cost, and area, or an image of the store represented as a grid of rows and columns of individual pixels. We describe these different shapes in terms of their number of dimensions. The smallest and simplest shape is a single value, for example 2000 as the sales of the store, -2000 as the cost, or 1234 as the area; such values are called scalars, and a scalar has 0 dimensions. A list of values is called a vector, which has 1 dimension, and there are two types: row vectors and column vectors. The snapshots below are the output of LaTeX code in a Jupyter notebook.
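As a rough illustration of this notation, here is a minimal LaTeX sketch of a scalar, a row vector, and a column vector. The symbol names (s, r, c) and the reuse of the store values from the example above are assumptions made for illustration.

```latex
% A scalar: a single value, 0 dimensions
$$ s = 2000 $$

% A row vector: 1 row, 3 columns (sales, cost, area)
$$ \mathbf{r} = \begin{bmatrix} 2000 & -2000 & 1234 \end{bmatrix} $$

% A column vector: 3 rows, 1 column
$$ \mathbf{c} = \begin{bmatrix} 2000 \\ -2000 \\ 1234 \end{bmatrix} $$
```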

A vector is restrictive because it has only 1 dimension, and in machine learning we do not want to be that restricted. The answer is the matrix, a 2-dimensional grid of values arranged in rows and columns.

An m x n matrix can be multiplied by an n x k matrix, and the result is an m x k matrix.
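The shape rule can be checked with a small pure-Python sketch. The helper name `matmul_lists` is hypothetical, introduced only for illustration; it multiplies an m x n matrix by an n x k matrix stored as lists of rows, and rejects inputs whose inner dimensions differ.

```python
def matmul_lists(A, B):
    """Multiply an m x n matrix by an n x k matrix, both given as lists of rows.

    Hypothetical helper for illustration only.
    """
    m, n = len(A), len(A[0])
    n_b, k = len(B), len(B[0])
    if n != n_b:
        raise ValueError(f"inner dimensions differ: {n} != {n_b}")
    # Entry (i, j) is the sum over l of A[i][l] * B[l][j]
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(k)]
            for i in range(m)]

A = [[1, 2, 3],
     [4, 5, 6]]           # 2 x 3 (m = 2, n = 3)
B = [[1, 0, 2, 1],
     [0, 1, 1, 0],
     [1, 1, 0, 2]]        # 3 x 4 (n = 3, k = 4)

C = matmul_lists(A, B)
print(C)                  # [[4, 5, 4, 7], [10, 11, 13, 16]]
print(len(C), len(C[0]))  # 2 4, i.e. m x k
```

Each entry of the result pairs one row of the first matrix with one column of the second, which is why the number of columns in the first must equal the number of rows in the second.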

In Python, NumPy is the library we use for math operations because it works efficiently with groups of numbers such as matrices. NumPy is a big library, and here we only touch on the basics. The most common way to work with numbers in NumPy is through ndarray objects. They are similar to Python lists but can have any number of dimensions, and they support fast math operations. Because an ndarray can have any number of dimensions, you can use ndarrays to represent any of the shapes above: scalars, vectors, and matrices. Below is sample code for basic vector and matrix operations in Python:
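As a quick check before those operations, this short sketch prints the number of dimensions (`ndim`) and the `shape` of a scalar, a vector, a matrix, and an image-like grid. The values and the 4 x 4 grid size are made up for illustration.

```python
import numpy as np

# Values below are made up for illustration
scalar = np.array(2000)                    # a single value
vector = np.array([2000, -2000, 1234])     # sales, cost, area
matrix = np.array([[1, 2, 3], [4, 5, 6]])  # 2 rows, 3 columns
image = np.zeros((4, 4))                   # a 4 x 4 grid of pixel values

for name, arr in [("scalar", scalar), ("vector", vector),
                  ("matrix", matrix), ("image", image)]:
    print(name, arr.ndim, arr.shape)
# scalar 0 ()
# vector 1 (3,)
# matrix 2 (2, 3)
# image 2 (4, 4)
```

Note that a row vector stored with shape (1, 3), or a column vector stored with shape (3, 1), counts as a 2-dimensional ndarray in NumPy.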

#importing NumPy

import numpy as np

# scalar

s = np.array(5)

# to see the shape

s.shape

# simple addition on scalar

a = s + 3

# vector

v = np.array([1,2,3])

# vector shape

v.shape

# changing the shape (reshape returns a new array; v itself is unchanged)

v.reshape(1,3)

# adding a scalar to every element of a vector

np.array([1,2,3]) + 1

# to access 1st element of vector

v[0]

# to access all elements

v[0: ]

#Matrix

m = np.array([[1,2,3], [4,5,6], [7,8,9]])

#Matrix addition

a = np.array([[1,1],[1,1]])

b = np.array([[3,3],[2,2]])

a + b

# Element-wise multiplication (this is not the matrix product)

m = np.array([[1,2,3],[4,5,6]])

n = m * 0.5

m * n

array([[ 0.5,  2. ,  4.5],
       [ 8. , 12.5, 18. ]])

# Matrix product: use the matmul function

a = np.array([[1,2,3,4],[5,6,7,8]])

a.shape

b = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])

b.shape

c = np.matmul(a, b)

c

array([[ 70,  80,  90],
       [158, 184, 210]])

# If the inner dimensions do not match, matmul raises an error

np.matmul(a, m)

---------------------------------------------------------------------------

ValueError: shapes (2,4) and (2,3) not aligned: 4 (dim 1) != 2 (dim 0)
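One way to avoid this error is to compare the inner dimensions before multiplying. The sketch below reuses the shapes from the failing call (2 x 4 and 2 x 3); the helper name `can_multiply` is hypothetical. Transposing the first matrix is shown only to illustrate how the shapes line up. It is not a general fix, since it changes which product is computed.

```python
import numpy as np

def can_multiply(x, y):
    """Return True if the 2-D arrays x and y have matching inner dimensions.

    Hypothetical helper for illustration.
    """
    return x.shape[1] == y.shape[0]

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])  # shape (2, 4)
m = np.array([[1, 2, 3], [4, 5, 6]])        # shape (2, 3)

print(can_multiply(a, m))    # False: 4 != 2
print(can_multiply(a.T, m))  # True: (4, 2) times (2, 3)

# For 2-D arrays the @ operator gives the same result as np.matmul
p = a.T @ m
print(p.shape)  # (4, 3)
```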

In this post I have started documenting vector and matrix mathematics. In subsequent posts I will go deeper into linear algebra, probability theory, calculus, and numerical optimization.
