Artificial Intelligence #13: An easy maths-based strategy to understand machine learning and deep learning

Ajit Jaokar

Published: July 20, 2021

Welcome to Artificial Intelligence #13

For this episode, I was originally going to post on a different theme, but I got quite a few comments on a post I made about maths on LinkedIn.

Because a few people found that post useful, I thought I would expand on my maths-based approach to teaching AI.

I use a similar approach when teaching #artificialintelligence at the #universityofoxford.

Previously, I discussed the significance of maths in learning AI.

So, to recap, there are four main things you need in order to understand machine learning and deep learning:

· Probability theory

· Statistics

· Linear Algebra

· Optimization

So, in this post, I am going to show you a simple approach to understanding machine learning and deep learning, based on maths that most of you already know (at roughly year 12 / A-level standard, if you took a maths or science-based degree).

Here is the chain of thought I use.

The idea is that you start with simple concepts and gradually add to them using familiar maths.

Given the limits of this article, I will illustrate only a small number of steps, but hopefully even these will be useful to you.

What is a function

Let’s start with functions

In mathematics, a function is a binary relation between two sets that associates each element of the first set to exactly one element of the second set.

Image source: https://xaktly.com/MathFunctions.html
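
To make the definition concrete, here is a minimal Python sketch that checks whether a relation, written as a list of (input, output) pairs, is a function. The name is_function and the two example relations are my own illustrations, and the first set is simply taken to be the inputs that appear in the pairs.

```python
def is_function(pairs):
    """Return True if each input in `pairs` is associated with exactly one output."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False  # the same input is paired with two different outputs
        seen[x] = y
    return True


# Every input has exactly one output, so this relation is a function.
# Two inputs (3 and -3) may share an output; that is allowed.
squares = [(1, 1), (2, 4), (3, 9), (-3, 9)]

# The input 4 is paired with both 2 and -2, so this relation is not a function.
square_roots = [(4, 2), (4, -2), (9, 3)]

print(is_function(squares))       # True
print(is_function(square_roots))  # False
```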

Our job is to find this function …

Whether you consider statistics, machine learning or deep learning – it’s the same problem. But as we see below, the approach varies.

Function approximation

In supervised machine learning, finding the missing function that maps one domain to another is called function approximation. Given a dataset of inputs and outputs, we assume that an unknown underlying function consistently maps inputs to outputs in the target domain and gave rise to the dataset. A function approximation problem asks us to select, from a well-defined class of functions, one that closely matches ("approximates") the target function in a task-specific way.
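
As a rough sketch of this idea, assume the "well-defined class" is the set of straight lines y = slope * x + intercept and that "closely matches" means the smallest squared error. The dataset, the hidden function y = 2x + 1 and the names fit_line and f_hat are all made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept, using the closed-form solution."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical dataset produced by the "unknown" function y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)


def f_hat(x):
    """The approximating function selected from the class of straight lines."""
    return slope * x + intercept


print(slope, intercept)  # 2.0 1.0
print(f_hat(10.0))       # 21.0
```

The learning step only ever sees the (x, y) pairs, never the formula that produced them, yet it recovers a function that can be applied to inputs that were not in the dataset.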

Examples of simple functions and complex functions

Let’s start with the simplest case. In the diagram below, the simplest case is a straight line, i.e. a linear relationship between x and y.

Now, in the graph below, there is no apparent functional relationship between x and y. However, from function approximation, we take it that there is still a function f that maps x to y, i.e. y = f(x).

(source unknown)

In contrast to the linear relationship, in the diagram below there is a function that separates the blue dots from the green dots (on the right), but it is a bit more complex than the first case because it is non-linear.
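
Here is a small sketch of the difference, using made-up points rather than the ones in the diagram: the blue points sit near the origin and the green points surround them. The rule names and the thresholds are my own assumptions.

```python
# Hypothetical data: blue points sit near the origin, green points surround them.
blue = [(0.0, 0.0), (0.5, 0.0), (0.0, -0.5), (-0.4, 0.3)]
green = [(2.0, 0.0), (-2.0, 0.0), (0.0, 2.0), (0.0, -2.0)]
points = [(p, "blue") for p in blue] + [(p, "green") for p in green]


def linear_rule(p):
    """One possible straight-line boundary: everything left of x = 1 is blue."""
    x, _ = p
    return "blue" if x < 1.0 else "green"


def circular_rule(p):
    """A non-linear boundary: everything inside the unit circle is blue."""
    x, y = p
    return "blue" if x * x + y * y < 1.0 else "green"


def accuracy(rule):
    """Fraction of the points that the rule labels correctly."""
    correct = sum(1 for p, label in points if rule(p) == label)
    return correct / len(points)


print(accuracy(linear_rule))    # 0.625
print(accuracy(circular_rule))  # 1.0
```

No straight line can do the job on this data, because the blue point at the origin lies in the middle of the four green points. A curved boundary separates the two colours without difficulty.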

Stochastic vs deterministic

Now, there is one more important complication in the quest to find this missing function

In data science, we deal with stochastic processes as opposed to deterministic functions. In fact, most relationships in real life are stochastic. In science, you do find deterministic functions (for example, Celsius to Fahrenheit conversion, which has only one answer for each input). In deterministic models, the output is fully determined by the parameter values and the initial conditions. Stochastic models possess some inherent randomness: the same set of parameter values and initial conditions will lead to an ensemble of different outputs. Stochastic models are considerably more complicated. Hence, real-life models are complex because they have to cater for random noise.
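
The contrast can be sketched in a few lines of Python. The Celsius to Fahrenheit conversion is the deterministic example from the text. The noisy_thermometer function and its noise level are hypothetical, added only to show what inherent randomness does to the output.

```python
import random


def celsius_to_fahrenheit(c):
    """Deterministic: the same input always gives the same output."""
    return c * 9.0 / 5.0 + 32.0


def noisy_thermometer(c, rng, noise_sd=0.5):
    """Stochastic: the true conversion plus random measurement noise (assumed Gaussian)."""
    return celsius_to_fahrenheit(c) + rng.gauss(0.0, noise_sd)


rng = random.Random(0)  # seeded only so that the run is repeatable

deterministic_outputs = [celsius_to_fahrenheit(100.0) for _ in range(5)]
stochastic_outputs = [noisy_thermometer(100.0, rng) for _ in range(5)]

print(deterministic_outputs)  # five identical values
print(stochastic_outputs)     # five different values scattered around the true one
```

The same input produces one answer in the first case and an ensemble of answers in the second, which is why a model of real-life data has to account for noise.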

Bias Variance tradeoff

Extending the idea of finding a function that maps inputs to outputs, we have another idea: a tradeoff between how much noise your function learns and the risk of missing real relationships in the data. This is a tradeoff between two errors, bias and variance, hence the bias / variance tradeoff. Variance is the error from sensitivity to small fluctuations in the training set. High variance may result from an algorithm modelling the random noise in the training data (overfitting). Bias is the error from erroneous assumptions in the learning algorithm. High bias can cause an algorithm to miss the relevant relations between features and target outputs (underfitting).

Image source: https://medium.com/greyatom/what-is-underfitting-and-overfitting-in-machine-learning-and-how-to-deal-with-it-6803a989c76

This also leads to two other well-known terms: overfitting and underfitting. Overfitting means the function has learnt the noise, so it has high variance (sensitivity to noise). Underfitting means the function has not learnt enough of the relationship between the inputs and the outputs, so it has high bias. Reducing one tends to increase the other, which is why this is a tradeoff.
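
The sketch below compares three ways of fitting the same noisy data. Everything in it is an assumption made for illustration: the true relationship y = 2x + 1, the noise level, the sample sizes and the three model names. It is not a standard benchmark.

```python
import random


def make_data(n, rng, noise_sd=1.0):
    """Sample n points from the assumed true relationship y = 2x + 1 plus noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def fit_constant(xs, ys):
    """High bias: ignore x and always predict the mean of y."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """A reasonable middle ground: least-squares straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """High variance: return the y of the nearest training x, noise included."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of the model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(42)
train_x, train_y = make_data(30, rng)
test_x, test_y = make_data(200, rng)

results = {}
for name, fit in [("constant", fit_constant), ("line", fit_line), ("memorizer", fit_memorizer)]:
    model = fit(train_x, train_y)
    results[name] = (mse(model, train_x, train_y), mse(model, test_x, test_y))
    print(name, "train error:", results[name][0], "test error:", results[name][1])
```

The constant model misses the relationship altogether, so its error is large on both sets (underfitting). The memorizer reproduces every training point exactly, noise included, so its training error is zero while its error on unseen data is not (overfitting). The straight line typically sits between the two extremes and does best on unseen data.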

Inference

There is of course no point in just learning a function. The whole point of learning is to infer / predict, i.e. to extrapolate the knowledge you have learnt into new areas. That brings us into the realm of statistical inference, but that is a whole complex subject for another time. Even now, you can see how connections can be built incrementally to learn new ideas from ones you already know.

Neural networks

So far, we have seen certain types of functions, i.e. linear or non-linear. These are handled by traditional statistics and machine learning methods.

But what if you have a relationship which is complex and hierarchical, e.g. in images, text or video (as opposed to traditional tabular data)?

To learn this hidden relationship, we need to provide examples to the neural network, which learns from them at a higher level of abstraction (from a corpus of text, it can detect relationships between words; from a set of images, it can detect what makes up an object, e.g. a cat has fur, whiskers etc.).

The caveat, of course, is that the data must reflect the problem (capture the features) and you need many examples, because now you are learning both the function and the structure of the data (and the structure may be hierarchical). The number of training examples needed increases the more features you have and / or the more hierarchical the relationship between the features, because every layer of the neural network learns one element and the subsequent layer builds on it. For example, the lowest layer learns pixels, the next layer learns edges, etc.

Image source: https://www.deeplearningbook.org/
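
To illustrate how one layer builds on the previous one, here is a toy two-layer network. It is only a sketch: the weights are picked by hand rather than learnt from data, and the task (exclusive-or on two binary inputs) is far simpler than images or text. The function names are my own.

```python
def relu(z):
    """A common non-linearity: pass positive values through, clip negatives to zero."""
    return max(0.0, z)


def hidden_layer(x1, x2):
    """First layer: two simple features computed from the raw inputs."""
    h1 = relu(x1 + x2)        # active when at least one input is on
    h2 = relu(x1 + x2 - 1.0)  # active only when both inputs are on
    return h1, h2


def output_layer(h1, h2):
    """Second layer: combines the first layer's features, not the raw inputs."""
    return h1 - 2.0 * h2


def network(x1, x2):
    """The whole network is a composition of the two layers."""
    return output_layer(*hidden_layer(x1, x2))


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), network(x1, x2))
# prints 0.0, 1.0, 1.0, 0.0 for the four inputs
```

The output is 1 only when exactly one input is on, which no single linear layer can compute. The second layer gets there by working with the features the first layer produced. In a real network, the weights of every layer are learnt from examples rather than written by hand.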

This brings us to the fact that ‘deep learning’ is best described as representation learning

In machine learning, feature learning or representation learning is a set of techniques that allows a system to automatically discover the representations needed for feature detection or classification from raw data. This replaces manual feature engineering and allows a machine to both learn the features and use them to perform a specific task.

PS I don’t know the source of this diagram

If you have followed me so far, congrats. While there is much more to go, you will have learnt from this flow alone far more than most people know. I enjoyed writing this because it’s also the approach I use for teaching.

If you come from a pure development background, don’t ignore the maths: as I said before, it is significant in learning AI.

And the good news is that there are only four main ideas you need to know to understand the maths of AI:

· Probability theory

· Statistics

· Linear Algebra

· Optimization
