Black and white boxes: explaining the maths of machine learning

Background - black box models

I am considering using this idea to explain the maths of machine learning to students.

We are used to calling machine learning and deep learning algorithms ‘black boxes’

However, to understand the maths behind machine learning and deep learning algorithms, we may need to consider the idea of ‘black and white boxes’ - as I explain below

Machine learning algorithms can be expressed as a hidden function between x and y, i.e. inputs and outputs.

In layman’s terms: Imagine you have a magic box. You can put something into this box (let's call it 'X'), and the box will give you something back (let's call it 'Y'). The magic box is doing something inside, but you can't see what it is. All you know is that whenever you put in a certain 'X', you'll get out a certain 'Y'.

So, saying "machine learning algorithms can be expressed as a hidden function between 'X' (inputs) and 'Y' (outputs)" is just a fancy way of saying: Machine learning is about figuring out the formula that transforms your inputs into the outputs you want, even if we can't see exactly how that formula works on the inside.
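
To make the ‘magic box’ idea concrete, here is a minimal sketch (assuming scikit-learn and synthetic data invented purely for illustration, not any particular dataset): we fit a model on pairs of X and Y and then ask it for predictions, without ever inspecting the mechanism inside.

```python
# A minimal sketch of the "black box" view: fit a model on (X, y) pairs
# and query it for new predictions without inspecting its internals.
# Assumes scikit-learn is installed; the data here is synthetic.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))                 # inputs 'X'
y = np.sin(X).ravel() + 0.1 * rng.normal(size=200)    # outputs 'Y' (hidden rule: sin + noise)

# The "magic box": we only specify its shape, not the rule it should learn
black_box = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000, random_state=0)
black_box.fit(X, y)

# We can use the box without knowing what it does inside
print(black_box.predict([[1.0], [2.0]]))
```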

This is all well and good - but why can we not figure out the mechanism of the black box?

Data driven approaches

Firstly, whatever the approach, the internal mechanism needs the parameters of the algorithm to be determined. In the simplest case of a straight line, there are two parameters (m and c) for the equation y = mx + c. In the case of deep learning and LLM models, the number of parameters runs into the millions or billions.
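
As a rough illustration (a sketch with made-up layer sizes, not a description of any specific model), you can count the parameters yourself: the straight line needs exactly two numbers, while a fully connected network needs (inputs + 1) × outputs per layer, which grows very quickly with width and depth.

```python
import numpy as np

# A straight line y = m*x + c: exactly two parameters, m and c
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.1, 4.9, 7.2, 9.0])
m, c = np.polyfit(x, y, deg=1)
print(f"line parameters: m={m:.2f}, c={c:.2f}")   # two numbers describe the whole model

# A fully connected network: each layer contributes weights + biases
def count_params(layer_sizes):
    # (inputs + 1 bias) * outputs, summed over consecutive layer pairs
    return sum((a + 1) * b for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))

print(count_params([1, 32, 32, 1]))              # a toy network: ~1,150 parameters
print(count_params([1024, 4096, 4096, 1024]))    # wider layers: ~25 million parameters
```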

Now, black box approaches are data driven. Hence, they can be shown to work (based on model evaluation metrics) even though their mechanism remains unknown (black box operations)
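
In practice, ‘works based on model evaluation metrics’ means something like the sketch below (again assuming scikit-learn and synthetic data): we hold out part of the data and judge the black box purely by how well its predictions match that held-out data.

```python
# Evaluating a black box: we score its predictions on unseen data,
# rather than reasoning about its internal mechanism.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=300)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# The model "works" if this number is small - no mechanism required
print("test MSE:", mean_squared_error(y_test, model.predict(X_test)))
```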

So, the next logical question is: what are the alternatives to a black box model?

Alternative to black box models

What is the alternative way of expressing a relationship between x and y?

That’s where the traditional / statistical approaches come in - what we can see as the ‘white box’, i.e. the transformation is not hidden but is rather explicitly known to some degree.

Given x and y, you could express a relationship between them as:

Linear Regression: y = mx + c

Statistical Correlation: Correlation measures how closely two variables are related. For example, if X increases and Y also tends to increase, they may have a positive correlation. This doesn't tell you exactly how X causes Y to change but indicates whether there's a relationship and how strong it is.
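
For example, the correlation coefficient can be computed directly (a sketch assuming scipy and made-up data); the coefficient and its p-value are the entire, fully visible output.

```python
# Correlation as a white-box summary: one interpretable number (and a p-value)
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(scale=0.5, size=100)   # Y tends to increase with X

r, p_value = pearsonr(x, y)
print(f"correlation r={r:.2f}, p-value={p_value:.3g}")  # strong positive correlation
```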

Rules-based Systems: Sometimes, the relationship between X and Y can be defined by a set of rules or logic. For instance, if X is "temperature," and Y is "state of water," then the rules could be simple: if X is below 0°C, Y is "ice"; if X is between 0°C and 100°C, Y is "liquid"; if X is above 100°C, Y is "steam".
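
The water example translates almost directly into code; a toy sketch (a real system would also handle pressure, boundary values and so on):

```python
# A rules-based "white box": the mechanism is simply the rules, written out.
def state_of_water(temperature_celsius: float) -> str:
    if temperature_celsius < 0:
        return "ice"
    elif temperature_celsius <= 100:
        return "liquid"
    else:
        return "steam"

print(state_of_water(-5))   # ice
print(state_of_water(25))   # liquid
print(state_of_water(120))  # steam
```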

Non-Linear Models: Sometimes, X and Y have a more complicated relationship that might involve curves, where increasing X doesn't always increase Y in a straightforward way. This can involve polynomial equations, logarithmic or exponential functions, etc.
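
As a sketch (assuming scipy and synthetic data generated from an exponential rule), fitting an explicit non-linear form still leaves you with a readable equation and a small number of coefficients:

```python
# Fitting an explicit non-linear form y = a * exp(b * x): still a white box,
# because the final model is a readable equation with two coefficients.
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    return a * np.exp(b * x)

rng = np.random.default_rng(3)
x = np.linspace(0, 2, 50)
y = 1.5 * np.exp(0.8 * x) + 0.05 * rng.normal(size=50)

(a, b), _ = curve_fit(model, x, y, p0=(1.0, 0.5))
print(f"fitted equation: y = {a:.2f} * exp({b:.2f} * x)")
```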

Decision Trees: These models use a tree-like graph or model of decisions and their possible consequences to express the relationship between inputs and outputs. Starting from a root, decision branches are created based on conditions or choices, leading to different outcomes or predictions.
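
Decision trees sit in an interesting middle ground: they are fitted from data, but the fitted rules can be printed and read. A small sketch (assuming scikit-learn; the temperature data below is made up to mirror the water example):

```python
# A decision tree learned from data, then printed as human-readable rules.
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data mirroring the water example: temperature -> state
temperatures = [[-10], [-2], [5], [40], [90], [105], [150]]
states = ["ice", "ice", "liquid", "liquid", "liquid", "steam", "steam"]

tree = DecisionTreeClassifier(random_state=0).fit(temperatures, states)
print(export_text(tree, feature_names=["temperature_c"]))
```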

Difference between the two approaches

The key difference between machine learning approaches and the traditional methods discussed above (like linear regression, statistical correlation, rules-based systems, non-linear models, and decision trees) lies in how they learn and adapt, their complexity, and their interpretability.

Learning and Adaptation: Machine Learning Approaches typically adjust their internal parameters based on the data. They're designed to learn complex patterns through a process of trial and error, using a large amount of data. This includes adapting to new data without being explicitly programmed to do so after the initial training. In contrast, statistical methods don't ‘learn’ from data in the same way.

Complexity: Machine Learning Approaches can be highly complex, especially with deep learning models, which can have millions of parameters. This allows them to capture very subtle and complicated patterns in data but at the cost of requiring a lot of computational resources. In contrast, traditional methods are generally simpler and more transparent. A linear regression model, for example, can be fully described by its slope and intercept. This simplicity can be an advantage when you need to explain your model's predictions clearly.

Flexibility and Application: Machine Learning Approaches are very flexible and can be applied to a wide range of complex tasks, such as image recognition, natural language processing, and predicting highly non-linear patterns. In contrast, while traditional algorithms have limitations in handling complex patterns as effectively as machine learning models, they are highly effective for simpler, well-defined problems. They are also useful when data is limited or when models need to be easily explained.

Implications - hidden functions and statistical tests

Thus, we have two options

  1. We can learn the function from data (black box) OR
  2. We can define the underlying mechanism as explicitly as we can

Now, once you see it in this way, hidden functions and statistical tests are two sides of the same coin.

Statistical tests are procedures used to make decisions or inferences about populations based on sample data. Statistical tests provide a framework to evaluate hypotheses, assess relationships between variables, and determine the significance of predictive features.

Thus, statistical tests provide the ‘white box’ mechanism instead of the data driven hidden function
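
As an illustration (a sketch assuming scipy and synthetic data), a test on a regression slope makes the white-box character explicit: we state a hypothesis about the mechanism (the slope is zero) and the test tells us whether the data justifies rejecting it.

```python
# A statistical test as a white-box procedure: we test an explicit hypothesis
# (here: "the slope relating x and y is zero") and read off a p-value.
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(4)
x = rng.normal(size=80)
y = 0.7 * x + rng.normal(scale=1.0, size=80)

result = linregress(x, y)
print(f"slope={result.slope:.2f}, p-value={result.pvalue:.3g}")
if result.pvalue < 0.05:
    print("reject the null hypothesis of no linear relationship")
```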

It’s not all (statistically) black and white

It’s not (statistically) black and white :) - pun intended

  1. Some algorithms are used in both statistics and machine learning - for example, linear regression
  2. Some machine learning algorithms are interpretable - for example, decision trees
  3. The comparison of statistical tests vs hidden functions is a simplification. It excludes some other cases (for example, rule-based systems)

Next steps?

If we extend the comparison of statistical tests vs hidden functions in machine learning, we need to list ML functions and statistical tests and see how statistical tests can be used with ML

I welcome your thoughts

If you are a non-developer and want to learn AI with me, please see Erdos Research Labs

You can meet me and our team at our Oxford AI summit

If you would like to study with me, see our courses

Low code AI course at the University of Oxford for non-developers

AI and digital twins

If you found this useful, you can sign up for my book

Image source: dall-e
