Can GPT-3 Solve Chemical Equations?
Ivan Reznikov
PhD, Principal Data Scientist || O'Reilly Book Author || TEDx/PyCon/GITEX Speaker || University Lecturer || LangChain, Large Language Models (LLMs) and Generative AI || 30K+ followers
Recently I've been testing ChatGPT with problems that theoretically can be considered solvable by a language model. I've already published a post on solving pub quiz problems (0/5 solved) and an article explaining an opening in chess.
I've already seen experiments on ChatGPT passing coding and other IT-related tests. Let's see how far we can get with the language model solving a simple chemical equation.
We'll ask ChatGPT to predict the reaction's outcome, classify it and balance the coefficients.
Task 1. Predicting reaction products.
The reaction we'll test is simple sodium combustion. As one can see, ChatGPT correctly answers that "the reaction of sodium and oxygen will produce sodium oxide (Na2O)". It loses some grip when asked to be more specific, replying with generic text, but the result is not too bad.
Task 2. Balancing a chemical equation.
OK. The explanation of how to balance is correct, but the balanced equation itself is wrong. On the other hand, if the model believes there are 2 Na atoms on both sides, then by its own count it balanced the equation correctly. Let's check whether we can correct ChatGPT on the number of sodium atoms and get it to rebalance the equation.
Now we know that we have 2 atoms of sodium on the left and 4 atoms of sodium on the right. Will it make a difference now?
Nope. ChatGPT is quite sure the equation is balanced. Unfortunately, it is not.
But from an ML perspective, this can be counted as a sign that the model is not overfitting in this case. It is unlikely that it saw this incorrect equation in its training data, so the coefficients were most likely generated rather than memorized. Interesting!
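The atom count that ChatGPT keeps getting wrong is easy to verify mechanically. Below is a minimal sketch in Python. It assumes the equation under discussion was 2Na + O2 → 2Na2O, which is my reading of the counts quoted above (2 sodium atoms on the left, 4 on the right), not a transcript of the model's output. The helper names `count_atoms` and `is_balanced` are made up for illustration, and the parser only handles simple formulas without parentheses.

```python
import re
from collections import Counter


def count_atoms(side):
    """Count atoms on one side of an equation.

    `side` is a list of (coefficient, formula) pairs, e.g. [(2, "Na2O")].
    Only simple formulas without parentheses are supported.
    """
    total = Counter()
    for coeff, formula in side:
        # Each match is an element symbol followed by an optional subscript.
        for symbol, digits in re.findall(r"([A-Z][a-z]*)(\d*)", formula):
            total[symbol] += coeff * int(digits or 1)
    return total


def is_balanced(left, right):
    """An equation is balanced when both sides contain the same atoms."""
    return count_atoms(left) == count_atoms(right)


# Assumed version of the unbalanced equation: 2Na + O2 -> 2Na2O
unbalanced = ([(2, "Na"), (1, "O2")], [(2, "Na2O")])
# Correctly balanced equation: 4Na + O2 -> 2Na2O
balanced = ([(4, "Na"), (1, "O2")], [(2, "Na2O")])

print(count_atoms(unbalanced[0])["Na"], count_atoms(unbalanced[1])["Na"])
print(is_balanced(*unbalanced), is_balanced(*balanced))
```

With four sodium atoms on the left instead of two, both sides hold 4 Na and 2 O, which is the standard balanced form of this reaction.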
Task 3. Reaction classification.
Surprisingly, the chat model dropped the ball, acting as if it didn't know which reaction we were talking about. Let's put it back on track.
Despite the doubts, the reaction was classified correctly, and the explanation is good.
This is a surprise, but the model scored 2 correct answers out of 3. One may start to think it does know a bit of chemistry. Does it, though?
Task 4. Bonus. Balancing an equation with artificial elements.
To a chemist, it doesn't matter which elements are being balanced. Let's make up some elements to confuse the model. No chance it has seen our reaction before!
Nope. It can't balance the equation if it does not consist of valid elements.
Nope. It can't classify an invalid reaction either, although the answer is quite obvious to any chemist.
As I mentioned in my previous article, "ChatGPT: how easily one can get confused", the uncanny valley can also be applied to machine learning models.
Seeing how impressively the model behaves, one can get the false feeling that it answers questions correctly. But this is not what ChatGPT was designed for. It answers questions in a human-like way, that's all. Don't fall into the trap, as there are situations where the errors will be difficult to detect. It's important to carefully review and validate the output of language models to ensure its accuracy, because the temptation to use ChatGPT or its alternatives for purposes they were not designed for may be too high.
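To illustrate why the element names are irrelevant, here is a minimal Python sketch of a brute-force balancer. It treats element symbols as opaque labels, so invented symbols work the same way as real ones. The symbols `Xx` and `Zz` are hypothetical placeholders, not the made-up elements from my prompt, and `atoms` and `balance` are illustrative names. The search over small integer coefficients is just the simplest approach that fits in a few lines; it is not how a serious tool would do it.

```python
import re
from collections import Counter
from itertools import product


def atoms(formula):
    """Count atoms in a simple formula (no parentheses), e.g. 'Na2O'."""
    counts = Counter()
    for symbol, digits in re.findall(r"([A-Z][a-z]*)(\d*)", formula):
        counts[symbol] += int(digits or 1)
    return counts


def balance(reactants, products, max_coeff=10):
    """Try small integer coefficients until every element balances.

    Returns a tuple of coefficients (reactants first, then products),
    or None if nothing up to max_coeff works.
    """
    species = reactants + products
    counts = [atoms(f) for f in species]
    elements = sorted(set().union(*counts))
    n_left = len(reactants)
    for coeffs in product(range(1, max_coeff + 1), repeat=len(species)):
        if all(
            sum(c * cnt[e] for c, cnt in zip(coeffs[:n_left], counts[:n_left]))
            == sum(c * cnt[e] for c, cnt in zip(coeffs[n_left:], counts[n_left:]))
            for e in elements
        ):
            return coeffs
    return None


# Real elements: Na + O2 -> Na2O
print(balance(["Na", "O2"], ["Na2O"]))
# Hypothetical elements with the same structure: Xx + Zz2 -> Xx2Zz
print(balance(["Xx", "Zz2"], ["Xx2Zz"]))
```

Both calls return the same coefficients, because the balancing procedure only looks at how many of each label appear on each side, never at whether the label is a real element.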
I'm including one more screenshot that was generated in an attempt to pick a topic for this article. The text sounds impressively good, though the facts and numbers are completely made up.
Till next time!
Senior Bioinformatics Scientist at Fraunhofer Institute for Translational Medicine and Pharmacology (ITMP)
1 year ago: It will get better, but at the moment a simple redox reaction cannot be recognized: Predict output HSCC(SH)CSH + C(#N)C1=C(C(=O)C(=C(C1=O)Cl)Cl)C#N.C(#N)C1=C(C(=O)C(=C(C1=O)Cl)Cl)C#N ... HSCC(SH)CSH + C(#N)C1=C(C(=O)C(=C(C1=O)Cl)Cl)C#N.C(#N)C1=C(C(=O)C(=C(C1=O)Cl)Cl)C#N → HSCC(SH)CSH.C(#N)C1=C(C(=O)C(=C(C1=O)Cl)Cl)C#N.C(#N)C1=C(C(=O)C(=C(C1=O)Cl)Cl)C#N