Is GPT4 getting dumber...

Rajib Deb

A technology leader specializing in data, AI and analytics architecture

Published: July 23, 2023

The news has been making the rounds this week, so I wanted to validate it myself. The output and the testing approach were uploaded to the GitHub location below.

Four types of validation were done:

1. Validating the math-solving capability of GPT
2. Answering "sensitive" questions
3. Code generation
4. Visual reasoning

A disclaimer: this article is based on my own personal research and observations. I am in no way establishing whether GPT4 is degrading or not. What I want to highlight is a potential gap in the validation approach. I saw it in #1 and #3; in this article I focus on #1, the validation of GPT's math-solving capability. Over time, I intend to dig deeper into the other three. Below is a recording of what I observed about the validation approach for #1.

My validation approach is uploaded at
