Black and white boxes: explaining the maths of machine learning
Background - black box models
I am considering using this idea to explain the maths of machine learning to students.
We are used to calling machine learning and deep learning algorithms ‘black boxes’.
However, to understand the maths behind these algorithms, we may need to consider the idea of ‘black and white boxes’, as I explain below.
Machine learning algorithms can be expressed as a hidden function between x and y, i.e. between inputs and outputs.
In layman’s terms: Imagine you have a magic box. You can put something into this box (let's call it 'X'), and the box will give you something back (let's call it 'Y'). The magic box is doing something inside, but you can't see what it is. All you know is that whenever you put in a certain 'X', you'll get out a certain 'Y'.
So, saying "machine learning algorithms can be expressed as a hidden function between 'X' (inputs) and 'Y' (outputs)" is just a fancy way of saying: Machine learning is about figuring out the formula that transforms your inputs into the outputs you want, even if we can't see exactly how that formula works on the inside.
This is all well and good, but why can we not figure out the mechanism of the black box?
Data driven approaches
Firstly, whatever the approach, the internal mechanism depends on the parameters of the algorithm, and these need to be determined. In the simplest case of a straight line, there are two parameters (m and c) in the equation y = mx + c. In the case of deep learning and LLMs, the number of parameters is in the millions or the billions.
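To make the idea of "determining the parameters" concrete, here is a minimal sketch in plain Python. It assumes a handful of made-up (x, y) pairs and uses ordinary least squares to recover m and c. The helper names fit_line and count_dense_parameters, the data, and the example layer sizes are all illustrative assumptions, not part of any particular library.

```python
def fit_line(xs, ys):
    """Closed-form least-squares estimates of slope m and intercept c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


def count_dense_parameters(layer_sizes):
    """Weights plus biases of a fully connected network with these layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# Illustrative data generated from y = 2x + 1 (an assumption for this sketch).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

m, c = fit_line(xs, ys)
line_parameters = count_dense_parameters([1, 1])               # one weight, one bias
small_net_parameters = count_dense_parameters([784, 512, 512, 10])

print("m =", m, "c =", c)
print("straight line parameters:", line_parameters)
print("small dense network parameters:", small_net_parameters)
```

Even this small hypothetical network has several hundred thousand parameters, compared with two for the straight line. That gap is why the fitted line can be read directly while the network cannot.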
Now, black box approaches are data driven: their parameters are estimated from data rather than specified by hand. Hence, they can work well (as judged by model evaluation metrics) while their mechanism remains unknown (black box operations).
So, the next logical question is: what are the alternatives to a black box model?
Alternative to black box models
What is the alternative way of expressing a relationship between x and y?
That’s where the traditional / statistical approaches come in. We can see these as the ‘white box’, i.e. the transformation is not hidden but is explicitly known to some degree.
Given x and y, you could express a relationship between them as:
Linear Regression: y = mx + c
Statistical Correlation: Correlation measures how closely two variables are related. For example, if X increases and Y also tends to increase, they may have a positive correlation. This doesn't tell you exactly how X causes Y to change but indicates whether there's a relationship and how strong it is.
Rules-based Systems: Sometimes, the relationship between X and Y can be defined by a set of rules or logic. For instance, if X is "temperature" and Y is "state of water", then the rules could be simple: if X is below 0°C, Y is "ice"; if X is between 0°C and 100°C, Y is "liquid"; if X is above 100°C, Y is "steam".
Non-Linear Models: Sometimes, X and Y have a more complicated relationship that might involve curves, where increasing X doesn't always increase Y in a straightforward way. This can involve polynomial equations, logarithmic or exponential functions, etc.
Decision Trees: These models use a tree-like graph of decisions and their possible consequences to express the relationship between inputs and outputs. Starting from a root, decision branches are created based on conditions or choices, leading to different outcomes or predictions for Y.
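Two of the ‘white box’ relationships listed above can be sketched in a few lines of plain Python. The function names and the sample data are assumptions made for illustration. Treating exactly 0°C and 100°C as "liquid" is also an assumption, since the rule in the text does not say which side the boundaries fall on.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient: strength of the linear relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def water_state(temperature_c):
    """Rules-based mapping from temperature (X) to state of water (Y).

    Boundary handling at exactly 0 and 100 degrees is an assumption.
    """
    if temperature_c < 0:
        return "ice"
    if temperature_c <= 100:
        return "liquid"
    return "steam"


# Made-up example data: hours studied (X) and test score (Y).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

r = pearson_correlation(hours, scores)
states = [water_state(t) for t in (-5, 20, 120)]

print("correlation:", round(r, 3))
print("states:", states)
```

In both cases every step from X to Y can be read off the code, which is what makes these approaches ‘white boxes’.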
Difference between the two approaches
The key difference between machine learning approaches and the traditional methods discussed above (like linear regression, statistical correlation, rules-based systems, non-linear models, and decision trees) lies in how they learn and adapt, their complexity, and their interpretability.
Learning and Adaptation: Machine Learning Approaches typically adjust their internal parameters based on the data. They're designed to learn complex patterns through a process of trial and error, using a large amount of data. This includes adapting to new data without being explicitly programmed to do so after the initial training. In contrast, statistical methods don't ‘learn’ from data in the same way.
Complexity: Machine Learning Approaches can be highly complex, especially with deep learning models, which can have millions of parameters. This allows them to capture very subtle and complicated patterns in data but at the cost of requiring a lot of computational resources. In contrast, traditional methods are generally simpler and more transparent. A linear regression model, for example, can be fully described by its slope and intercept. This simplicity can be an advantage when you need to explain your model's predictions clearly.
Flexibility and Application: Machine Learning Approaches are very flexible and can be applied to a wide range of complex tasks, such as image recognition, natural language processing, and predicting highly non-linear patterns. In contrast, while traditional algorithms cannot handle complex patterns as effectively as machine learning models, they are highly effective for simpler, well-defined problems. They are also useful when data is limited or when models need to be easily explained.
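The "trial and error" adjustment of parameters described above can be sketched with gradient descent on the straight line y = mx + c. This is a minimal illustration, not how any particular library trains its models. The function name train_line, the learning rate, the number of steps, and the data are all assumptions chosen for the sketch.

```python
def train_line(xs, ys, learning_rate=0.05, steps=2000):
    """Learn m and c by repeatedly nudging them to reduce mean squared error."""
    m, c = 0.0, 0.0  # start from an arbitrary guess
    n = len(xs)
    for _ in range(steps):
        # Error of the current guess on every data point.
        errors = [(m * x + c) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to m and c.
        grad_m = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_c = 2.0 / n * sum(errors)
        # Move each parameter a small step against its gradient.
        m -= learning_rate * grad_m
        c -= learning_rate * grad_c
    return m, c


# Illustrative data generated from y = 2x + 1 (an assumption for this sketch).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

learned_m, learned_c = train_line(xs, ys)
print("learned m:", round(learned_m, 4), "learned c:", round(learned_c, 4))
```

With two parameters the result is still easy to interpret. The same loop applied to millions of parameters works in the same way, but the learned values no longer have an obvious meaning, which is where the ‘black box’ comes from.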
Implications - hidden functions and statistical tests
Thus, we have two options: the data driven hidden function (the black box) and the explicitly known relationship (the white box).
Now, once you see it in this way, hidden functions and statistical tests are two sides of the same coin.
Statistical tests are procedures used to make decisions or inferences about populations based on sample data. Statistical tests provide a framework to evaluate hypotheses, assess relationships between variables, and determine the significance of predictive features.
Thus, statistical tests provide the ‘white box’ mechanism instead of the data driven hidden function.
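As one hedged illustration of a statistical test assessing a relationship between variables, the sketch below runs a permutation test on the correlation between X and Y. The permutation test is my choice of example and is not prescribed by the discussion above. The function names, the data, the number of permutations, and the seed are all assumptions.

```python
import math
import random


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided permutation test of the null hypothesis 'X and Y are unrelated'.

    Shuffling Y breaks any real link to X, so the shuffled correlations show
    what chance alone would produce.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_correlation(xs, ys))
    shuffled = list(ys)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_correlation(xs, shuffled)) >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Made-up, nearly linear data for illustration.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

p_value = permutation_p_value(xs, ys)
print("p-value:", p_value)
```

A small p-value says the observed relationship would be unlikely if X and Y were unrelated. Every step of that reasoning is visible, in contrast to a learned hidden function.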
It’s not all (statistically) black and white
It’s not (statistically) black and white :) - pun intended
Next steps
If we extend the comparison of statistical tests vs hidden functions in machine learning, the next step is to list ML functions and statistical tests and see how statistical tests can be used with ML.
I welcome your thoughts.
If you are a non-developer and want to learn AI with me, please see Erdos Research Labs
You can meet me and our team at our Oxford AI summit
If you would like to study with me, see our courses
If you found this useful, you can sign up for my book
Image source: dall-e
International vagabond and vagrant at sprachspiegel.com, Economist and translator - Fisheries Economics Advisor
7 months ago: You may want to add a grey box. https://link.springer.com/chapter/10.1007/978-1-4471-1558-8_2