Math for ML - Using LaTeX & Python
Sanchit Tiwari
Associate Partner at McKinsey & Company | Senior Principal at QuantumBlack, AI by McKinsey
Before you start learning or implementing machine learning algorithms, mathematics is a basic requirement, especially in deep learning, which involves a lot of matrix math; it is important to understand the basics before building a model. I usually refer to my notes whenever I need to refresh the mathematics behind an algorithm, so I decided to start putting those notes on the web, both to share them with the community and to make them easier for me to refer to at any time. This post is the start of that effort. In it I use LaTeX for the mathematical notation and Python to implement the math concepts on a machine.
Before you read further, I should mention that you will find this post most useful if you already have a good foundation in linear algebra, since its aim is to deepen the understanding of the mathematics required in machine learning. We will start with vector mathematics, to see how we represent things as vectors, and then move on to linear algebra to see how to manipulate and transform them with linear transformations.
Let us start with vectors and matrices. I would like you to think of them as the fundamental data structures that we use throughout machine learning. Before we begin, let us first understand the shapes data can have. For example, we might have a single number representing the sales of a store in millions, a list of numbers representing various features of the store such as sales, cost, and area, or an image of the store represented as a grid with rows and columns of individual pixels. We describe these different shapes of data in terms of their number of dimensions. The smallest and simplest shape is a single value, for example 2000 as the sales of the store, -2000 as the cost, or 1234 as the area; these are called scalars, and we say a scalar has 0 dimensions. A list of values is called a vector, which has 1 dimension, and there are two types: row vectors and column vectors. The snapshots below are the output of LaTeX code in a Jupyter notebook:
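As a rough sketch of that notation, a scalar, a row vector, and a column vector could be written in LaTeX as follows. The symbol names (s, v_row, v_col) and the choice of the store values from the paragraph above are illustrative assumptions, and the `$$ ... $$` delimiters assume a Jupyter Markdown cell:

```latex
% scalar: a single value, 0 dimensions
$$ s = 2000 $$

% row vector (1 x 3): sales, cost, area
$$ \mathbf{v}_{\text{row}} = \begin{bmatrix} 2000 & -2000 & 1234 \end{bmatrix} $$

% column vector (3 x 1): the same values stacked vertically
$$ \mathbf{v}_{\text{col}} = \begin{bmatrix} 2000 \\ -2000 \\ 1234 \end{bmatrix} $$
```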
A vector is restrictive because it has only 1 dimension, and in machine learning we do not want to be that restricted. The answer is matrices, which arrange values in 2 dimensions as rows and columns.
An m x n matrix can be multiplied by an n x k matrix, and the resulting matrix is m x k. The number of columns of the first matrix must equal the number of rows of the second.
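To make the shape rule concrete, here is a minimal sketch that multiplies a 2 x 3 matrix by a 3 x 2 matrix with explicit loops and compares the result with NumPy (the library introduced in the next paragraph). The helper name `matmul_manual` and the example values are assumptions made for illustration, not part of NumPy:

```python
import numpy as np

def matmul_manual(A, B):
    """Multiply two 2-D arrays with explicit loops (illustrative helper, not a NumPy API)."""
    m, n = A.shape
    n_b, k = B.shape
    if n != n_b:
        # columns of A must equal rows of B
        raise ValueError(f"inner dimensions differ: {n} != {n_b}")
    C = np.zeros((m, k))
    for i in range(m):
        for j in range(k):
            # entry (i, j) is the dot product of row i of A and column j of B
            C[i, j] = sum(A[i, p] * B[p, j] for p in range(n))
    return C

A = np.array([[1, 2, 3], [4, 5, 6]])       # 2 x 3
B = np.array([[1, 2], [3, 4], [5, 6]])     # 3 x 2

C = matmul_manual(A, B)                    # result is 2 x 2
print(C.shape)
print(np.array_equal(C, np.matmul(A, B)))  # compares values with NumPy's own product
```

The loops are only there to show where each entry comes from; in practice `np.matmul` does the same work far faster.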
In Python, NumPy is the library we use for math operations, because it works efficiently with groups of numbers such as matrices. NumPy is a big library, and here we only touch on the basics. The most common way to work with numbers in NumPy is through ndarray objects. They are similar to Python lists but can have any number of dimensions, and they support fast math operations. Because an ndarray can have any number of dimensions, you can use it to represent any of the data shapes above: scalars, vectors, and matrices. Below is sample code for basic vector and matrix operations in Python:
#importing NumPy
import numpy as np
# scalar
s = np.array(5)
# to see the shape; a scalar has shape (), i.e. 0 dimensions
s.shape
# simple addition on scalar
a = s + 3
# vector
v = np.array([1,2,3])
# vector shape
v.shape
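# reshaping into a 1 x 3 row vector (returns a reshaped array; v itself keeps shape (3,))
v.reshape(1,3)
# adding a scalar to every element of a vector
np.array([1,2,3]) + 1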
# to access 1st element of vector
v[0]
# to access all elements
v[:]
#Matrix
m = np.array([[1,2,3], [4,5,6], [7,8,9]])
#Matrix addition
a = np.array([[1,1],[1,1]])
b = np.array([[3,3],[2,2]])
a + b
# Element-wise multiplication: * multiplies matching entries, it is not the matrix product
m = np.array([[1,2,3],[4,5,6]])
n = m * 0.5
m * n
array([[ 0.5, 2. , 4.5], [ 8. , 12.5, 18. ]])
# Matrix product: use the matmul function
a = np.array([[1,2,3,4],[5,6,7,8]])
a.shape
b = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
b.shape
c = np.matmul(a, b)
c
array([[ 70, 80, 90], [158, 184, 210]])
# if the inner dimensions do not match (here 4 and 2), matmul raises an error; the exact message depends on the NumPy version
np.matmul(a, m)
---------------------------------------------------------------------------
ValueError: shapes (2,4) and (2,3) not aligned: 4 (dim 1) != 2 (dim 0)
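One distinction in the code above is worth isolating: the `*` operator multiplies arrays entry by entry, while `np.matmul` computes the matrix product. The following is a minimal sketch of that difference using the same arrays. The helper name `safe_matmul` is hypothetical, and wrapping the call in `try`/`except` is just one possible way to handle mismatched shapes:

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6]])                           # shape (2, 3)
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])                     # shape (2, 4)
b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])  # shape (4, 3)

# element-wise product: both operands have shape (2, 3), and so does the result
elementwise = m * m

# matrix product: inner dimensions match, (2, 4) x (4, 3) -> (2, 3)
product = np.matmul(a, b)

def safe_matmul(x, y):
    """Return the matrix product, or None if the shapes are incompatible (hypothetical helper)."""
    try:
        return np.matmul(x, y)
    except ValueError as err:
        # NumPy raises ValueError when the inner dimensions differ
        print("cannot multiply:", err)
        return None

result = safe_matmul(a, m)   # (2, 4) x (2, 3): inner dimensions 4 and 2 differ
```

Checking `x.shape[1] == y.shape[0]` before multiplying would be an equally reasonable design; the exception-based version simply keeps the shape check inside NumPy.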
In this post I have started documenting vector and matrix mathematics. In subsequent posts I will go deeper into linear algebra, probability theory, calculus, and numerical optimization.