Lean In or Lean Back...?
Tobias Frenz
CEO Munich Re Singapore Branch, Head of Digital Solutions Life & Health, Asia Pacific Middle East & Africa at Munich Re (Group)
Over the Chinese New Year break I came across an article from OpenAI on using LLMs to prove mathematical statements, a task that requires reasoning, creativity and deep domain expertise, none of which are core strengths of LLMs. That led me deep down the rabbit hole of formal languages for mathematics, theorem provers, LLMs and math co-pilots, which I wrote down here so I don't forget :).
Let me give you some context: in 2021 I wrote a LinkedIn article on "explainability" in AI (XAI) and mathematics (see here). With the advent of LLMs, explainability has reached a whole new dimension of complexity: LLMs are complex neural networks and their decision-making processes are often considered "black box" or opaque, meaning it can be difficult to understand how they arrive at a specific conclusion or output. But explainability is crucial in many applications, especially in sensitive areas such as healthcare, banking, re-/insurance or legal systems, where decisions made by AI models can have significant consequences. The lack of transparency about why a particular decision was made can be a barrier to the adoption of AI systems in these domains. Add to this the tendency to hallucinate and the lack of reproducibility, and the actuary and layman underwriter in me wonders about their application to medical underwriting, claims decisioning and pricing, where we assess life insurance applications worth hundreds of thousands or millions of dollars. A lot is at stake financially for insurers and reinsurers. We want precise and consistent results when it comes to underwriting, pricing and claims; otherwise we might as well just roll a die. That said, LLMs certainly have a bright future in insurance, as my colleague Lovell Hodge Ph.D. points out in his worthwhile three-part series on "LLMs in Insurance" (see here).
In fairness to LLMs and their jack-of-all-trades hype, they never claimed to be deterministic or particularly accurate, and where precision is needed, a domain-specific plugin may be a better alternative, such as Wolfram Alpha's ChatGPT plugin. Equally, we are likely to see in-house adoption of underwriting- and claims-specific LLMs and co-pilots, e.g. based on a reinsurer's underwriting manual (like Munich Re's MIRA) or claims manuals. Underwriting co-pilots will (for now) not fully automate underwriting but assist and augment the human underwriting process, which is the core idea behind Munich Re's "Augmented Underwriting with AI" solution (see my earlier post).
And this is akin to what is currently happening in the mathematical world, where efforts are made to (1) digitize/formalize mathematical proofs and (2) augment these with LLMs.
In this respect, I came across a video by Fields Medal recipient Terence Tao on "Machine assisted proofs". In it, Tao mentions how his proof of the Polynomial Freiman-Ruzsa conjecture (PFR) was formalized using LEAN, a functional programming language and interactive theorem prover originally developed at Microsoft Research.
What's the point of digitizing/formalizing what has already been proven, you wonder... As mentioned in my aforementioned LinkedIn article "Explainability and why ABC is so hard", mathematical proofs can be long and tremendously complex, and thus a major challenge for peer reviewers to truly understand, referee or confirm. For instance, over 220,000 (informal) mathematics papers were submitted to arXiv alone in the past 5 years, and many may not have undergone a rigorous peer review. LEAN (like other theorem provers) tries to eliminate this bottleneck by digitizing mathematics (formalizing an informal proof) and then enabling computers to automatically verify these mathematical theorems. Unlike informal proofs that rely on natural language and general reasoning, formal proofs follow a strict set of rules and logical symbols defined by a formal language. Proving a formal statement then involves a sequence of proof steps using tactics, which transform the statement into easier ones until it is fully proven.
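To give a flavour of what a tactic proof looks like, here is a toy example in Lean 4. It is of course nothing like the PFR formalization, and the theorem names are made up for illustration, but it shows the pattern: a formal statement, followed by tactics that close the goal.

```lean
-- A toy formal statement and tactic proof in Lean 4 (illustrative only).
-- `rfl` closes goals that hold by definitional computation:
theorem add_zero_example (n : Nat) : n + 0 = n := by
  rfl

-- For goals that need an existing lemma, a tactic can supply it directly.
-- Here `Nat.add_comm` from Lean's library proves commutativity of addition:
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```

If any tactic step is wrong, Lean rejects the proof; there is no way to "hand-wave" past a gap, which is exactly what makes machine verification possible.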
From a computer science perspective, a formal proof is like a computer program (e.g. in C or Python), but its correctness can be established with theorem provers. The syntax of these formal languages is difficult to master, as it is quite different from common programming languages like C or Python. That is one reason why mathematicians prefer to stick to paper and pen, or rather LaTeX, as I did for my thesis.
To make this more tangible, the figure below shows the PFR theorem (written in LaTeX) and its proof, followed by its implementation in the formal language LEAN.
The above is somewhat straightforward to understand for readers with a mathematics background, but the following formalization in LEAN is gibberish to an untrained eye.
The PFR conjecture and its proof build on multiple other mathematical definitions, lemmas and theorems, as shown in this blueprint, each programmed in LEAN. To a programmer this might look like a software architecture blueprint.
Knowing how powerful LLMs have become at code generation (trained on large code libraries, e.g. from GitHub), the link between theorem provers and LLMs becomes clearer. The broad idea is the following:
To build such a massive LEAN math database, one can think of converting mathematical papers (e.g. from arXiv) and scanned maths books into LaTeX, and then formalizing these LaTeX documents into LEAN. This sounds like a simple step but hides tremendous complexity (in a similar way as we fail over and again to accurately OCR handwritten doctors' notes). For example, LaTeX documents may contain ambiguities or implicit assumptions that are clear to mathematicians but are never made explicit, or they may skip steps that the author considered obvious.
A tangible example is the formalization of Fermat's Last Theorem: there is no automatic translator that converts such a complex, high-level proof from its informal style into LEAN. To do so, a 5-year, 1-million-Euro project has recently been approved, underlining the immense complexity.
To close the loop on OpenAI's involvement in mathematical proofs: AI research has shown that a machine can beat humans at complex games like chess or Go (most famously DeepMind's AlphaGo and AlphaZero), using a combination of Monte Carlo Tree Search for decision-making, deep neural networks for position evaluation, and reinforcement learning from self-play to iteratively improve performance. Proving a math theorem can be seen as similar, in that the objective is not to win a game but to prove a mathematical statement, and there are interim steps (an action space) on the way to the final proof (each of which can be checked by an automated theorem prover like LEAN). But there are two distinct challenges, and I cite here from their above article, as they put it succinctly:
(i) Infinite action space: not only does formal mathematics have an extremely large search space (like Go for example), it also has an infinite action space. At each step of a proof search, the model must choose not from a well-behaved finite set of actions, but a complex and infinite set of tactics, involving exogenous mathematical terms that have to be generated (e.g., generating a mathematical statement to be used as a witness, an object used in steps such as "there exists an x s.t. …", or a cut, the introduction and the chaining of a lemma in the middle of a proof).
(ii) Lack of self-play: conversely to 2-player games, a prover is not playing against an opponent but against a set of statements to prove. When faced with a statement that is just too hard, there is no obvious reframing that will let the prover generate intermediary easier statements to tackle first. This asymmetry prevents naive application of the self-play algorithms that were successful with 2-player games.
OpenAI subsequently released a paper titled "Improving mathematical reasoning with process supervision" that refines the approach with a different reward function in its reinforcement learning: one that rewards correct intermediate reasoning steps (process supervision) instead of only rewarding when the final proof is found (outcome supervision), which seems more intuitive given how math proofs are actually developed.
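The difference between the two reward schemes can be sketched in a few lines of Python. This is a schematic contrast only, not OpenAI's actual implementation: in practice the process reward comes from a learned reward model, not a rule, and the function names here are hypothetical.

```python
# Schematic contrast between outcome and process supervision.
# Hypothetical helper names; real process rewards come from a learned model.

def outcome_reward(step_labels, final_answer_correct):
    """Outcome supervision: reward only if the final answer/proof is correct."""
    return 1.0 if final_answer_correct else 0.0

def process_reward(step_labels):
    """Process supervision: credit each correct intermediate step.
    step_labels is a list of booleans, one per reasoning step."""
    if not step_labels:
        return 0.0
    return sum(1.0 for ok in step_labels if ok) / len(step_labels)

# A solution whose reasoning is mostly sound but whose last step fails
# gets zero under outcome supervision, but partial credit under process
# supervision, so the learning signal is much less sparse.
labels = [True, True, True, False]
print(outcome_reward(labels, final_answer_correct=False))  # 0.0
print(process_reward(labels))                              # 0.75
```

The practical point is the density of the signal: an almost-correct proof attempt still teaches the model something under process supervision, which mirrors how mathematicians build proofs step by step.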
With this approach they could solve a significantly higher number of Math Olympiad problems, as shown in the graph below.
The gist is that we are still a long way from solving complex math problems with AI, but the advent of LLMs has given the field new impetus, in a similar way as we foresee LLMs playing an important role in transforming the underwriting and claims operations of insurers, perhaps ultimately reaching autonomy level 5.
So, the answer to the title of this article is to "LEAN into LLMs" :)