Explainability and why ABC is so hard
Tobias Frenz
CEO Munich Re Singapore Branch, Head of Digital Solutions Life & Health, Asia Pacific Middle East & Africa at Munich Re (Group)
Explainable AI is a prominent topic nowadays and whilst we can comfortably explain our tree-based machine learning models it can get pretty sketchy once it comes to deep learning models. But if you think you are having a hard time to explain how your AI model works you might sympathize with mathematician Shinichi Mochizuki, who for over eight years is trying to explain his claimed proof of the ABC Conjecture to the brightest minds in the mathematics community. But nobody really gets it. It's a fascinating story (at least for those that like maths) that bears some resemblance to Andrew Wiles's arduous journey to proof Fermat's Last Theorem ("FLT") as I will outline later.
When we discuss explainability in AI we think of being able to rationalise the output of an AI system. The U.S. National Institute of Standards and Technology (NIST) proposed four principles that should be covered under explainable AI:
- Explanation: AI systems should deliver accompanying evidence or reasons for all outputs. By itself, this principle does not require that the evidence be correct, informative, or intelligible; it merely states that a system is capable of providing an explanation. The easiest example is a tree-based model, which every human can easily understand. This principle does also not impose any metric of quality on those explanations. However, the below Meaningful and Explanation Accuracy principles provide a framework for evaluating explanations.
- Meaningful: Systems should provide explanations that are understandable to individual users. Generally, this principle is fulfilled if a user can understand the explanation, and/or it is useful to complete a task. This principle does not imply that the explanation is one size fits all: multiple groups of users for a system may require different explanations. The Meaningful principle rather allows for explanations which are tailored to each of the user groups. In an insurance context, it would mean that an underwriter would be able to understand how a predictive underwriting module derived its decision that an applicant is a standard or substandard risk.
- Explanation Accuracy: The explanation should correctly reflect the system’s process for generating the output. Together, the Explanation and Meaningful principles only call for a system to produce explanations that are meaningful to a user community. These two principles do not require that a system delivers an explanation that correctly reflects a system’s process for generating its output. The Explanation Accuracy principle thus imposes accuracy on a system’s explanations. Explanation accuracy is a distinct concept from decision accuracy. For decision tasks, decision accuracy refers to whether the system’s judgment is correct or incorrect. Regardless of the system’s decision accuracy, the corresponding explanation may or may not accurately describe how the system came to its conclusion.
- Knowledge Limits: The system only operates under conditions for which it was designed or when the system reaches a sufficient level of confidence in its output. By identifying and declaring knowledge limits, this practice safeguards answers so that a judgment is not provided when it may be inappropriate to do so. This principle can increase trust in a system by preventing misleading, dangerous, or unjust decisions or outputs.
Explainability in mathematics is rather rigid in that a proof is subject to the peer review of mathematical experts in the respective field. A mathematical proof doesn't have to be "understandable" by lay mathematicians. Though, in the history of mathematics there are some proofs who made it to the "mainstream" and become widely known to the public. I will touch on Fermat's Last Theorem, The Four Color Problem and the ABC Conjecture to outline how these vary in terms of explainability.
Fermat's Last Theorem
Fermat's Last Theorem (FLT) simply states that: For any integer n > 2, the equation a^n + b^n = c^n has no positive integer solutions.
Around 1637 French lawyer and mathematician Pierre de Fermat, whilst studying Diophantes' "Arithmetica" wrote this theorem on the margin of the book, stating that he found a proof that is "too large to fit into the margin". Right, whether he really had a valid proof we'll never know, chances are that he just thought he had one. For centuries mathematicians cracked their heads in vain to prove this theorem. It was only in 1984 when German mathematician Gerhard Frey suggested that one way to go about it would be via the also unproven Taniyama–Shimura conjecture. Two years later Ken Ribet indeed proved that Frey's assumption was right, in that solving the Taniyama-Shimura conjecture for a certain class of elliptic curves would imply FLT. Boom! Instead of proving FLT's theorem directly, there was a bridge/detour to do so: English mathematician Andrew Wiles's did exactly that and famously gave an internationally celebrated lecture in June 1993 to his peers in Cambridge to release his proof. To his horror though, a fellow mathematician later found a flaw in his workings that took him over a year to fix in collaboration with his associate Richard Taylor. End good, all good.
The proof consist of two papers and you can download them here: Fermat's initial paper: Link Fermat-Taylors second paper to fix the error: Link.
The proof is 129 pages long and whilst obviously non-trivial, peers reviewers could well follow it's reasoning and thus the proof is internationally accepted. So, no "explainability" problem here.
Another well-known proof was more controversial though and till today is accepted by mathematicians with a "Yes, but...".
The Four Color Theorem
This theorem basically states that no more than four colors are required to color the regions of a map so that no two adjacent regions have the same color.
It was first proposed by South-African mathematician Francis Guthrie in 1852 when he noticed that it only takes four different colors to color the map of England. Simple as it sounds, the theorem was too hard to prove with traditional mathematical methods. In the '60-70ies, German mathematician Heinrich Heesch developed pioneering work using computer to search for a proof. This was an enabler for a new solution and the theorem was finally proved in 1976 by Kenneth Appel and Wolfgang Haken using computer-aid: In simple terms, they applied a "proof by contradiction", ie assuming that maps exist requiring more than five colors. This condensed the problem down to proving it for less than 2,000 possible map configurations. This was done using a computer and thus their proof was a hybrid of pure mathematics (of only 62 pages - download here) and computer aid with 400 pages of microfiche.
Some mathematicians had a really hard time accepting the computer-assisted proof, but it's fair to say that this proof is now accepted by most. In terms of explainability, there was no questioning of the mathematical part of the proof as it was well understood by experts. A number of mathematicians questioned the computer-generated output though, and this is akin to the explainability issue we have with AI models.
Fermat's and the Four Color Theorem's proofs were not without drama, but had a happy ending after all. This cannot be said of the ABC conjecture...
ABC Conjecture
The ABC conjecture unfortunately isn't easy as ABC. First proposed by Joseph Oesterlé (1988) and David Masser (1985), it is stated in terms of three positive integers, a, b and c that are relatively prime and satisfy a + b = c. If d denotes the product of the distinct prime factors of abc, the conjecture essentially states that d is usually not much smaller than c. In other words: if a and b are composed from large powers of primes, then c is usually not divisible by large powers of primes.
Don't try too hard to understand the conjecture. What is noteworthy is that a number of important conjectures or theorems in number theory could be followed from ABC. Like Fermat's Last Theorem. Proving this theorem would be a groundbreaking achievement.
In 2012, Japanese mathematician Shinichi Mochizuki released a claimed proof of the conjecture in a series of four papers amounting to over 500 pages. The papers are incredibly complex as based on a newly developed theory ("inter-universal Teichmüller theory") and even hard-core mathematicians seem to struggle to follow the logic. In terms of explainability the proof seems like a clear "fail". That obviously doesn't imply it's wrong.
But in 2018 two German mathematicians, Peter Scholze (a Fields medal winner) and Jakob Stix, announced that they have detected a serious, unfixable flaw in the proof. And to date, they have not been proven wrong and could not be convinced otherwise by Shinichi Mochizuki, who continues to make his case.
It's somewhat heartbreaking to see this drama unfold and I don't have the brains to take a side here. The vast majority of mathematicians don't accept the proof of Mochizuki. You wish for a happy ending here. Otherwise, you could think that Mochizuki would have been better off to do as Fermat did and just scribble a small note somewhere to claim his ground.
Whatever the outcome, this is a prime example of how important explainability is in life...
You can download the updated papers in pdf here: Paper 1 Paper 2 Paper 3 Paper 4.