How Google is Expanding Reasoning Capabilities of Language Models
Michael Spencer
A.I. Writer, researcher and curator - full-time Newsletter publication manager.
What is Google AI's Minerva?
If you enjoy articles about A.I. at the intersection of breaking news, join AiSupremacy here. I cannot continue to write without community support (follow the link below). For the price of a cup of coffee, join 79 other paying subscribers.
https://aisupremacy.substack.com/subscribe
Solving Quantitative Reasoning Problems with Language Models
Hey Guys,
This is a summary of Google's AI blog, about a recent paper that caught my interest.
If you think I write too much, I invite you to follow me on Substack's iOS app instead of by email; the Android app is coming soon. This will give you more control and agency over when to engage with the articles, and likely a better reading experience.
At the end of June 2022, Ethan Dyer and Guy Gur-Ari, Research Scientists on Google Research's Blueshift Team, released a new research paper, summarized on the Google AI blog.
So why is this a big deal?
Language models have demonstrated remarkable performance on a variety of natural language tasks. Indeed, a general lesson from many works, including BERT, GPT-3, Gopher, and PaLM, has been that neural networks trained on diverse data at large scale in an unsupervised way can perform well on a variety of tasks.
As Meta AI, Microsoft Research, and Google AI zero in on human-level task performance, quantitative reasoning is one area in which language models still fall far short.
The Beauty of Maths: Quantitative Reasoning
Solving mathematical and scientific questions requires a combination of skills, including correctly parsing a question with natural language and mathematical notation, recalling relevant formulas and constants, and generating step-by-step solutions involving numerical calculations and symbolic manipulation.
Due to these challenges, it is often believed that solving quantitative reasoning problems using machine learning will require significant advancements in model architecture and training techniques, granting models access to external tools such as Python interpreters, or possibly a more profound paradigm shift.
So let’s talk about their paper:
What is Minerva?
In “Solving Quantitative Reasoning Problems With Language Models” (to be released soon on the arXiv), they present Minerva, a language model capable of solving mathematical and scientific questions using step-by-step reasoning. They show that by focusing on collecting training data that is relevant for quantitative reasoning problems, training models at scale, and employing best-in-class inference techniques, they achieve significant performance gains on a variety of difficult quantitative reasoning tasks.
What does Minerva do?
Minerva solves such problems by generating solutions that include numerical calculations and symbolic manipulation, without relying on external tools such as a calculator. The model parses and answers mathematical questions using a mix of natural language and mathematical notation. Minerva combines several techniques, including few-shot prompting, chain-of-thought or scratchpad prompting, and majority voting, to achieve state-of-the-art performance on STEM reasoning tasks. You can explore Minerva's output with Google's interactive sample explorer.
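To make the majority-voting step concrete, here is a minimal sketch of the idea: sample several chain-of-thought solutions for the same question, extract each final answer, and return the most common one. The `sample_solution` callable and the “Final Answer:” convention are hypothetical stand-ins for illustration, not Minerva's actual code.

```python
import re
from collections import Counter
from typing import Callable, Optional

def extract_final_answer(solution: str) -> Optional[str]:
    # Hypothetical convention: each sampled solution ends with
    # a line like "Final Answer: <answer>".
    match = re.search(r"Final Answer:\s*(.+)", solution)
    return match.group(1).strip() if match else None

def majority_vote(question: str,
                  sample_solution: Callable[[str], str],
                  k: int = 16) -> Optional[str]:
    """Sample k step-by-step solutions and return the most common
    final answer (the majority-voting inference technique)."""
    answers = []
    for _ in range(k):
        solution = sample_solution(question)  # one sampled chain of thought
        answer = extract_final_answer(solution)
        if answer is not None:
            answers.append(answer)
    return Counter(answers).most_common(1)[0][0] if answers else None

if __name__ == "__main__":
    import random
    # Toy stand-in for a language model, just to show the control flow:
    # noisy, but right more often than wrong, so voting recovers "4".
    def fake_model(question: str) -> str:
        ans = random.choice(["4", "4", "4", "5"])
        return f"Step 1: add the numbers.\nFinal Answer: {ans}"
    print(majority_vote("What is 2 + 2?", fake_model))
```

The intuition behind the voting step is that independently sampled reasoning paths are unlikely to agree on the same wrong answer, so taking the most common final answer filters out one-off mistakes.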
In recent times, maths and code have been getting more attention in AI research.
A Model Built for Multi-step Quantitative Reasoning
To promote quantitative reasoning, Minerva builds on the Pathways Language Model (PaLM), with further training on a 118GB dataset of scientific papers from the arXiv preprint server and web pages that contain mathematical expressions using LaTeX, MathJax, or other mathematical typesetting formats.
Standard text cleaning procedures often remove symbols and formatting that are essential to the semantic meaning of mathematical expressions. By maintaining this information in the training data, the model learns to converse using standard mathematical notation.
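As a rough illustration of why this matters, the hypothetical `naive_clean` below mimics an aggressive web-text cleaning pass that strips markup-like characters; applied to a LaTeX expression, it destroys the formula, which is exactly the information Minerva's dataset preserves. This is an invented example, not Google's actual pipeline.

```python
import re

raw = r"The kinetic energy is $E = \frac{1}{2} m v^2$."

def naive_clean(text: str) -> str:
    # Mimics aggressive cleaning: drop backslashes, dollar signs,
    # braces, carets, and underscores often treated as markup noise.
    return re.sub(r"[\\${}^_]", "", text)

print(naive_clean(raw))
# -> "The kinetic energy is E = frac12 m v2."  (formula meaning lost)
print(raw)  # Minerva-style data keeps the LaTeX intact
```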
So this gets pretty interesting.
They then evaluated Minerva on OCWCourses, a collection of college- and graduate-level problems covering a variety of STEM topics, such as solid state chemistry, astronomy, differential equations, and special relativity, which the team collected from MIT OpenCourseWare.
What caught my attention about this paper is that in all cases, Minerva obtains state-of-the-art results, sometimes by a wide margin.
What Minerva Gets Wrong
Minerva still makes its fair share of mistakes.
About half are calculation mistakes, and the other half are reasoning errors, where the solution steps do not follow a logical chain of thought.
It is also possible for the model to arrive at a correct final answer but with faulty reasoning. They call such cases “false positives”, as they erroneously count toward a model’s overall performance score.
Minerva doesn't understand maths per se, though.
Limitations
The team's approach to quantitative reasoning is not grounded in formal mathematics. Minerva parses questions and generates answers using a mix of natural language and LaTeX mathematical expressions, with no explicit underlying mathematical structure.
Future Directions
While machine learning models have become impressive tools in many scientific disciplines, they are often narrowly scoped to solve specific tasks. Google Research hopes that general models capable of solving quantitative reasoning problems will help push the frontiers of science and education.
Research by OpenAI and Google AI in particular is driving language models in new directions. Researchers worldwide are showing significant improvements in the quality and frequency of their work as they build on and tweak these language models. I track papers on Synced and MarkTechPost, among other blog summary websites.
This material summarizes a recent article from Google AI's blog. There are so many good AI papers these days that it's getting hard to keep up. Increasingly, I see my newsletter AiSupremacy as yet another way for busy professionals to do this.
What do you think about the research and direction language models are heading?
Thanks for reading!