I Stumped All AI Models with My First-Grader's Homework

I was sitting at the dining table helping my daughter with her homework when I hit a question with multiple possible interpretations. Curious, I tested it on several AI models, and they all failed spectacularly.

Tell me what you think: what's the right answer? See below for how the AI models fared.

Claude 3.7 Sonnet

OpenAI o1 Pro


OpenAI o3-mini-high


OpenAI 4.5


OpenAI o1


OpenAI 4o


Gemini Flash 2.0



Francesco Cipollone

Reduce risk - focus on vulnerabilities that matter - Contextual ASPM - CEO & Founder - Phoenix security - Runner - Application Security Cloud Security | 40 under 40 | CSA UK Board | CSCP Podcast Host

1 week

Caleb

Mark Conklin

Principal Engineer at ARM

1 week

This isn't really a 1st grader test. It is an IQ test question, from the looks of it. It does seem very difficult from a spatial reasoning perspective. If you know, you know; if you don't, I believe that AI would struggle with this question.

Max Solonski

I build effective cybersecurity programs, exceptional teams, and rational processes

2 weeks

Can we please stop confusing ourselves by testing non-deterministic generative technologies with deterministic tasks? It does not prove that AI is stupid. It proves that we are.

Christopher M. Babie

Protecting the technology to electrify & decarbonize the planet @ GE Vernova

2 weeks

“C” - as you rotate the figure you would get all other representations (A,B,D) except for C

Ajay Arora

Entrepreneur | Investor

2 weeks

Given the question is so broad, I would default to as simple an explanation as possible, especially given the context that this was asked of a first-grader (context is king, imo), not that other answers aren’t correct as well. In my interpretation, I would say A is the “correct” answer because all the rest have one square in the second row from the bottom while A has three. Happy to be wrong; would love to hear why.


More articles by Caleb Sima
