Is GPT4 getting dumber...

The news has been making the rounds this week, so I wanted to validate it myself. I saw that the output and the testing approach were uploaded to the GitHub location below.

Four types of validation were done:

  1. Validating the math-solving capability of GPT
  2. Answering "sensitive" questions
  3. Code generation
  4. Visual reasoning

A disclaimer: this article is based on my own personal research and observation. I am in no way establishing whether GPT4 is degrading or not. What I want to highlight is a potential gap in the validation approach. I saw it in #1 and #3. In this article, I focus on #1, the validation of GPT's math-solving capability. Over time, I intend to dig deeper into the other three. Below is a recording of what I observed about the validation approach for #1.

My validation approach is uploaded at



