ChatGPT is getting better fast – are you still ahead of the AI in version 4?
Claudio Lupi
Partner @ Heads! International | Management Consulting, Executive Search
When ChatGPT became known to the general public about four months ago, I was somewhat disappointed by how many of the answers were simply wrong. ChatGPT-3.5 failed to solve problems above the level of fourth or fifth grade of primary school. Asking for factual information often led to “AI hallucinations”, that is, pure invention stated as fact.
With the release of ChatGPT-4 in March, the AI has made a surprisingly big step forward. Hallucinations have become much rarer, and the AI usually provides sound and correct answers to technical questions, such as:
Question: "How high must a bank's tier 1 capital be to prevent a bank run?" Answer: "There is no specific tier 1 capital level that can completely guarantee the prevention of a bank run, as bank runs are often driven by panic and a lack of confidence in the financial system rather than solely by a bank's capital adequacy. However, maintaining a strong capital buffer can help a bank weather financial shocks and maintain depositor confidence (...) Maintaining a strong tier 1 capital ratio is just one aspect of preventing a bank run; banks also need to manage their liquidity and demonstrate sound risk management practices to maintain depositor trust." (I have shortened the answer; it contained further details such as regulatory minimum requirements.)
Question: "How does inflation affect the cost/income ratio in motor insurance?" Answer (summarized): "...higher damage costs, higher operating costs, possibly reduced investment income due to inflation, premiums have to be raised, otherwise the cost/income ratio deteriorates."
Of course, one can argue that these are "only" questions about factual knowledge. How does ChatGPT-4 cope if it has to derive the answer? Here, too, the new version does surprisingly well. ChatGPT-4 usually solves common brainteasers quite reliably. At the end of the text you will find examples of fairly complex questions which ChatGPT-4 answers correctly... would you have known the answer?
OpenAI, the company behind ChatGPT, had the AI take 34 academic and job-specific exams, including the US Uniform Bar Exam (a test for lawyers, the basis of the license to practice law), SATs (a US standardised test for university applications) and other tests ranging from medicine and physics to wine studies. Simply put, the previous version, ChatGPT-3.5, scored on average around the middle of all test takers, while the current version, ChatGPT-4, has made a big jump in many tests, scoring on average in about the top 25% of all test takers. The most significant jump was in the bar exam, from the bottom 10% to the top 10% of participants. [GPT-4 Technical Report 2023b]
This means that, in many subject-specific exams at university level, an AI now performs as well as or better than most humans who have prepared specifically for these exams through higher education.
This does not mean that in the next two to three years many lawyers, accountants, MDs, managers or other "knowledge workers" will be replaced by an AI. However, the development of AI will continue at a rapid pace. It is foreseeable that in a few years' time, AI will be able to solve many (if not most or all) professional problems as well as or better than most professionals.
As we progress to that new situation, many questions will need to be dealt with: How will we integrate AI into the various workstreams, and what will the effects on productivity be? Will AI help to reduce the shortage of skilled workers or even lead to unemployment among highly educated specialists? Historically, technological progress has never led to less employment, and despite digitalization (in its largely pre-AI form), employment is currently at record levels in many countries. However, this does not prove that it will stay that way.
AI with broad, human-like problem-solving capabilities is a fundamental change, bringing opportunities as well as complex challenges for society as a whole.
What is your view? Comments welcome.
Examples of questions that ChatGPT-4 solves without problems (solutions below):
Solutions by ChatGPT-4, delivered within seconds: