Why learning Linear Algebra is important for Machine Learning
Ramkumar Rani
Gen AI | Agentic AI | RAG | Agentic Automation | Data | Lifelong Learner
Very often, aspiring Machine Learning (ML) engineers ask why they need to study mathematical concepts such as linear algebra, calculus, probability, and statistics to become ML experts. In this post, I will focus on the necessity of linear algebra in machine learning. Linear algebra is a mathematical area that is immensely helpful for every engineering specialization, particularly Machine Learning.
Linear algebra is the branch of mathematics mainly concerned with linear equations and linear transformations. In Machine Learning, linear equations and linear transformations underpin many algorithms, such as least squares. In other words, linear algebra, which converts input vectors into outputs using linear transformations, is essential for understanding ML algorithms.
Broadly, the following concepts of linear algebra are applied across ML algorithms:
- Scalars, vectors, matrices, and tensors
- Operations on vectors and matrices
- Linear Dependence
- Linear Span
- Norms
- Matrix inverse & identity matrices
- Special types of matrices and vectors
- Singular Value Decomposition (SVD)
- Eigendecomposition
- Dimensionality reduction – Principal Component Analysis (PCA)
In this article, I will explore a handful of linear algebra concepts and show how they are helpful in machine learning and deep learning. To begin with, linear algebra organizes numbers into structures such as scalars, vectors, matrices, and tensors. These structures open new avenues through special operations such as matrix inversion and matrix multiplication. In particular, tensors play a major role in image processing and pattern recognition; for example, you need a 3-D tensor to process an RGB color image. More importantly, linear algebra helps us understand and visualize more complex patterns in data, beyond three dimensions.
In machine learning, linear algebra simplifies the complexity of data and presents it in a concise form. In particular, in deep learning (a specialized branch of ML), the values of each layer of neurons in a network can be represented as a vector.
A simpler set of definitions for these linear algebra structures (see the NumPy sketch after the list):
- Scalars: a scalar is a single number, e.g., 10.
- Vectors: a vector is a one-dimensional array of numbers; it can be a row vector or a column vector.
- Matrices: a matrix is a 2-D structure with rows and columns; a specific entry is identified by its row and column indices.
- Tensors: tensors are multidimensional arrays, typically with more than two dimensions. For example, an RGB image can be represented as a 3-D tensor.
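As a concrete illustration, here is a minimal NumPy sketch of these four structures; the specific shapes (including the 224×224 image size) are arbitrary examples, not from the original post:

```python
import numpy as np

scalar = 10                                   # a single number
vector = np.array([1, 2, 3])                  # 1-D array, shape (3,)
matrix = np.array([[1, 2], [3, 4]])           # 2-D array, shape (2, 2)
tensor = np.zeros((224, 224, 3))              # 3-D tensor, e.g. an RGB image
print(vector.ndim, matrix.ndim, tensor.ndim)  # 1 2 3
```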
Linear algebra supports various matrix operations such as transpose, addition, and multiplication. Consider the following matrix transpose:
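$$(A^{\top})_{ij} = A_{ji}, \qquad \text{e.g.}\quad \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}^{\top} = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{bmatrix}$$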
The transpose can be thought of as a mirror image of the matrix across its main diagonal. As you can see, this operation is helpful in many machine learning and deep learning tasks, especially in image recognition.
Another linear algebra area that plays a crucial role in machine learning and deep learning is matrix multiplication (the matrix product). We can multiply two matrices of different dimensions, as long as the number of columns of the first matrix equals the number of rows of the second. For example, consider the following matrix product:
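$$Z = XY \in \mathbb{R}^{i \times j}, \qquad Z_{mn} = \sum_{t=1}^{k} X_{mt}\, Y_{tn}, \qquad X \in \mathbb{R}^{i \times k},\; Y \in \mathbb{R}^{k \times j}$$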
In this case, the matrix X has i rows and k columns, and the matrix Y has k rows and j columns; when you multiply them, the operation produces a new matrix Z with i rows and j columns. Matrix multiplication is critical in neural networks, where inputs are passed to the next layer as vectors or matrices and multiplied by a weight matrix to produce the output.
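A minimal NumPy sketch of this shape rule (the dimensions here are arbitrary examples):

```python
import numpy as np

# X has i rows and k columns; Y has k rows and j columns.
i, k, j = 4, 3, 2
X = np.random.rand(i, k)
Y = np.random.rand(k, j)

Z = X @ Y         # matrix product: valid because X's columns match Y's rows
print(Z.shape)    # (4, 2) -> i rows, j columns
```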
Another linear algebra concept that comes in handy in deep learning is orthogonal matrices. Vectors x and y are orthogonal to each other if xᵀy = 0, which means x and y are at 90 degrees to each other. An orthogonal matrix is a square matrix whose rows and columns are mutually orthonormal vectors. Orthogonal matrices are useful in deep learning: they can be used to initialize weights to avoid vanishing or exploding gradients, a problem that arises when you multiply matrices across many steps of a deep network.
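In symbols, for vectors x and y and an orthogonal matrix Q:

$$x^{\top} y = 0 \quad \text{(orthogonal vectors)}, \qquad Q^{\top} Q = Q Q^{\top} = I \quad \text{(orthogonal matrix)}$$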
Calculating norms is another linear algebra topic that plays a major role in machine learning. Norms measure the length of vectors. A commonly used norm is the L2 norm, which measures the Euclidean length of a vector (applying it to the difference of two vectors gives the Euclidean distance between them). It is commonly denoted as:
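$$\|x\|_2 = \sqrt{\sum_{i} x_i^2}$$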
For example, the L2 norm is used in Ridge regression. The L1 norm is also quite popular in machine learning and is used in algorithms such as Lasso; it is defined as:
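$$\|x\|_1 = \sum_{i} |x_i|$$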
Ridge and Lasso are popular regularized regression algorithms in Machine Learning.
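For concreteness, their objectives are commonly written as follows (here w denotes the model weights and λ ≥ 0 the regularization strength, standard notation not used in the original post):

$$\text{Ridge:}\;\; \min_{w} \|y - Xw\|_2^2 + \lambda \|w\|_2^2 \qquad\qquad \text{Lasso:}\;\; \min_{w} \|y - Xw\|_2^2 + \lambda \|w\|_1$$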
With these linear algebra concepts in place, you can move on to eigendecomposition, which involves eigenvalues and eigenvectors. Eigendecomposition is a prerequisite for Principal Component Analysis (PCA), one of the most commonly used machine learning algorithms for dimensionality reduction. PCA is probably the oldest and best known of the techniques of multivariate analysis. The goal of PCA is to compute principal components; in essence, this involves identifying the eigenvalues and eigenvectors of the data's covariance matrix.
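A minimal NumPy sketch of this idea, computing principal components via eigendecomposition of the covariance matrix (the function name, sizes, and random data are illustrative, not from the original post):

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components via eigendecomposition."""
    X_centered = X - X.mean(axis=0)           # center each feature
    cov = np.cov(X_centered, rowvar=False)    # covariance matrix of features
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]         # sort by decreasing variance
    components = eigvecs[:, order[:n_components]]
    return X_centered @ components            # project onto components

X = np.random.default_rng(1).normal(size=(100, 5))
X_reduced = pca(X, n_components=2)
print(X_reduced.shape)  # (100, 2)
```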
Finally, I would like to emphasize how linear transformations play an important role in neural networks. Let us consider a multi-layer perceptron (MLP) with a single hidden layer. This network is defined as:
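$$\hat{y} = W_2\, g(W_1 x + b_1) + b_2$$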
Here, W1 and b1 are the weight matrix and bias vector of the first linear transformation applied to the input, and g is a nonlinear activation function applied element-wise. Correspondingly, W2 and b2 are the parameters of the second linear transformation in the network.
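A minimal NumPy sketch of this forward pass (the layer sizes and the ReLU activation are illustrative choices, not specified in the original post):

```python
import numpy as np

def relu(z):
    # element-wise nonlinear activation g
    return np.maximum(0.0, z)

# Hypothetical sizes: 4 input features, 8 hidden units, 1 output.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)

x = rng.normal(size=4)       # input vector
h = relu(W1 @ x + b1)        # first linear transformation + activation
y_hat = W2 @ h + b2          # second linear transformation
print(y_hat)
```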
From this short introduction, it should be clear that linear algebra plays an immense role in machine learning and deep learning. This article has highlighted some of the linear algebra concepts behind machine learning, and I am sure you will agree that linear algebra is one of the secret weapons in mastering machine learning.