Why Germany did not defeat Brazil in the final, or Data Science lessons from the World Cup
Gregory Piatetsky-Shapiro
Part-time philosopher, Retired, Data Scientist, KDD and KDnuggets Founder, was LinkedIn Top Voice on Data Science & Analytics. Currently helping Ukrainian refugees in MA.
This article is based on a KDnuggets blog jointly written with Dan Clark.
The 2018 World Cup is over, with France defeating Croatia 4-2 in the final. It was a great match to end a brilliant tournament, with the French deserving winners.
Before the tournament, KDnuggets (and many others) published predictions, which generally had Germany vs Brazil in the final.
Fig. 1: Expected World Cup 2018 Brackets, with Germany vs Brazil in the final, as predicted by KDnuggets before the tournament start.
We correctly predicted 13 of the 16 round-of-16 teams (81.25%), with only Poland, Germany and Egypt missing out and Japan, Sweden and hosts Russia taking their places.
At the quarter-final stage, 4 of the 8 teams were correctly predicted (50%). Only one of the four semi-finalists (France) was correct, and we were 0 for 2 in the final.
The other analysts also got it wrong. The FiveThirtyEight predictions had Brazil (19%), Spain (17%) and Germany (13%) all ahead of France (8%) as the winners. Gracenote’s predictions had the same three sides and even Argentina ahead of France. Predicting the World Cup is difficult.
Lessons
So why did everyone get it so wrong? Here are some Data Science lessons:
Human aspect
Human behaviour has a lot of randomness, so using data science to predict it is difficult and offers limited accuracy. One example from this World Cup is French goalkeeper Lloris's mistake that led to Croatia's second goal in the final. Something like this is impossible to predict; the same goes for own goals and other individual errors, which simply come down to human behaviour.
External factors
Sport in general involves many external factors that can affect results. In football (soccer), for example, the outcome may be affected by an unfair referee, adverse weather conditions, the climate, the players' personal lives and much more. These features are tricky to factor in, as they can be difficult to measure and collect.
Individual Events
Predicting the results of the entire tournament requires predicting all the separate matches, and randomness tends to aggregate. The knockout nature of the World Cup makes it harder to predict, as one defeat sends a team home.
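To see how randomness aggregates, here is a minimal sketch. The 70% per-match win probability is an illustrative assumption, not a fitted number: even a strong favourite has only a modest chance of surviving four straight knockout rounds.

```python
# Hypothetical favourite: assume a 70% chance of winning any single match.
p_match = 0.70
knockout_rounds = 4  # round of 16, quarter-final, semi-final, final

# Probabilities of independent matches multiply, so per-match risk compounds.
p_title = p_match ** knockout_rounds
print(f"Chance of winning all {knockout_rounds} knockout matches: {p_title:.1%}")
# → roughly 24%
```

Even at 70% per match, this hypothetical favourite wins the knockout stage only about a quarter of the time, which is why picking both finalists correctly is so hard.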
Group behaviour
Predicting sports that come down to individual performances, like chess, or that decompose into many repeated discrete events, like baseball, is easier than predicting fluid team events.
Data science has limited accuracy when predicting group behaviour. Because team composition changes all the time in soccer, we cannot draw many conclusions from a team's performance 4 years ago to predict the same team's performance today (and what that team did 20 years ago has very little relevance).
Uncertainty range
Every prediction has a range of uncertainty. For example, if we toss a fair coin 1000 times, then from the binomial distribution (or its normal approximation) we can predict that the number of heads will be between 469 and 531 with 95% confidence.
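That interval can be reproduced in a few lines using the normal approximation to the binomial, with mean np and standard deviation sqrt(np(1-p)):

```python
import math

n, p = 1000, 0.5                 # 1000 tosses of a fair coin
mean = n * p                     # expected number of heads: 500
sd = math.sqrt(n * p * (1 - p))  # standard deviation, about 15.8
z = 1.96                         # two-sided 95% normal quantile

lo, hi = mean - z * sd, mean + z * sd
print(f"95% interval for heads: [{lo:.0f}, {hi:.0f}]")
# → [469, 531]
```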
However, very few analysts make their predictions with enough rigor to determine and present confidence intervals. If you see a prediction about a very uncertain event where the range of uncertainty is not given, can you trust it?
Rules
With all that in mind, here are our three golden rules for knowing when to trust predictions:
- If there are mathematical laws (e.g. for games of chance like a fair coin or dice) or physical laws (for example in astronomy, where the positions of the planets can be predicted very precisely).
- If there is a lot of data on the same type of entity. Note that the Brazil team of 2010 isn’t going to be the same as the Brazil 2018 team.
- If the predictions include a range of uncertainty, which usually indicates careful work with solid statistical foundations. When only a single number is provided without a standard deviation, it is probably more for entertainment, and you shouldn't trust it.
Conclusion
This experience highlights how limited data science can be when predicting anything governed by human behaviour. It's abundantly clear that these predictions are more for entertainment, or at best a rough estimate, than an exact science.
Comments

Actuarial & Quantitative Analyst: Data science allows us to make decisions with confidence intervals, but doesn't guarantee 100% of future behaviour.

Actuarial & Quantitative Analyst: Let's check it.

Data Scientist / Machine Learning Engineer: Probably they analyzed the wrong data. As Albrecht Zimmermann says, chance is a defining factor in football scores, so the analysis must include data that may be related to the "chance generation".

I would add how strongly soccer is affected by chance in the first place, because it is so low-scoring. In basketball, with (more or less) 100 possessions per match, most of which lead to a basket, chance matters much less. And even basketball is *strongly* affected by chance (https://ceur-ws.org/Vol-1970/paper-09.pdf, if I may plug myself). In lower-scoring games such as American football, ice hockey, and soccer, it gets progressively worse.

Head, Center for AI/ML (formerly Center of Excellence in Analytics), Institute for Development and Research in Banking Technology: Brilliant analysis of the failed predictions!