How Gemini and GPT-4 completely messed up a standard task that Claude 3 handled easily.
Balaji Viswanathan Ph.D.
CEO, Brahmasumm. Building document AI at scale -- organizing, searching and summarizing enterprise data.
I wanted to try the best-in-class LLMs on understanding a moderately complex table. This is a fairly standard handwritten invoice. It doesn't have nested tables or anything monstrous.
Let's see what each of them did. Gemini Advanced very confidently took on the task and made a complete mess of it. The number of items, their names, and their prices are all wrong.
ChatGPT 4 tried a few times, put in a lot of effort, and after 3 retries simply gave up.
The same task in Claude 3, launched just a few hours ago.
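For anyone who wants to reproduce this kind of test, here is a minimal sketch of sending an invoice image to Claude 3 for table extraction. It assumes the Anthropic Python SDK; the file name, model string, and prompt wording are illustrative, not from the post.

```python
# Sketch: ask a vision LLM to extract an invoice table as JSON.
# Assumptions: Anthropic Python SDK, a local file "invoice.jpg",
# and an illustrative model name -- adjust all three to your setup.
import base64


def build_table_prompt(image_b64: str, media_type: str = "image/jpeg") -> list:
    """Build a messages payload asking for the invoice line items as JSON."""
    return [{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64",
                        "media_type": media_type,
                        "data": image_b64}},
            {"type": "text",
             "text": "Extract every line item from this invoice as JSON with "
                     "fields: name, quantity, unit_price, total."},
        ],
    }]


# Usage (needs ANTHROPIC_API_KEY set; commented out so the sketch is self-contained):
# import anthropic
# with open("invoice.jpg", "rb") as f:
#     img = base64.b64encode(f.read()).decode()
# client = anthropic.Anthropic()
# resp = client.messages.create(model="claude-3-opus-20240229",
#                               max_tokens=1024,
#                               messages=build_table_prompt(img))
# print(resp.content[0].text)
```

Asking for a fixed JSON schema, rather than free-form prose, also makes it easy to diff each model's output against the ground-truth invoice.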
Product Manager - Innovation Lead, EY
1 year ago: Arghya, Praveen FYI please
Sr Technical/Delivery Manager (.NET) - Digital Innovations & Transformation | Technologist | Cloud Computing | Automation | Data & AI Practitioner
1 year ago: Hi Dr. Balaji Viswanathan, thanks for sharing this. I have been struggling for the last few weeks on a similar use case. I will check on this now.
Building Real-World AI use-cases
1 year ago: Well, for the last 3 months I have been trying to find/train LLMs to generate insights from tabular data with fewer than 100 rows, and it's extremely difficult. I find LLMs cannot comprehend the multi-modality of the data, and they make stupid mistakes like taking the average of averages to summarize the data. Will give Claude 3 a try to see if it can comprehend it.
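The "average of averages" pitfall mentioned in the comment above is easy to show with a toy example (the numbers here are made up for illustration): group means weight every group equally, regardless of how many rows each group has.

```python
# Toy data: two groups of unequal size.
group_a = [10, 10, 10, 10]   # 4 rows, mean 10
group_b = [100, 100]         # 2 rows, mean 100

# Naive "average of averages" -- treats both groups as equally large.
mean_of_means = (sum(group_a) / len(group_a) + sum(group_b) / len(group_b)) / 2

# Correct row-weighted average over all rows.
true_mean = sum(group_a + group_b) / len(group_a + group_b)

print(mean_of_means)  # 55.0
print(true_mean)      # 40.0
```

The naive summary overstates the overall mean because the small group counts as much as the large one; an LLM summarizing a table row-group by row-group can make exactly this mistake.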