Overthinking can trip up models too, not only people: where CoT doesn't help.

Researchers from Princeton University and New York University investigated three cases where chain-of-thought (CoT, or step-by-step thinking) prompting can lead to worse outcomes in both humans and models:

- Implicit learning tasks (learning patterns without explicitly thinking about them)

- Visual tasks (recognizing images or objects at a glance)

- Learning with exceptions (where some rules don't always apply)

The researchers draw on insights from human psychology to predict when CoT helps or hinders model performance. They used two criteria to identify tasks where CoT might reduce performance:

- Does verbal thinking lower human performance?

- Do these limitations also apply to AI models?

And here are the key findings:


  • Implicit learning tasks:

- Researchers used finite-state grammars (FSGs) to generate artificial "words" and built 4,400 classification tasks in which each test word either follows the grammar or is a slightly altered version that breaks it.

- After seeing example words, the model had to identify which new words matched the pattern.

Result: CoT negatively impacted model performance.
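To make the setup concrete, here is a minimal Python sketch of how such a task could be generated. The grammar `TOY_FSG`, the alphabet, and the helper names are all hypothetical and chosen for illustration; they are not the grammars or code used in the paper.

```python
import random

# Hypothetical toy grammar (NOT one of the paper's FSGs). Each state maps to
# a list of (symbol, next_state) transitions; next_state None ends the word.
TOY_FSG = {
    "S0": [("M", "S1"), ("V", "S2")],
    "S1": [("T", "S1"), ("V", "S2"), ("R", None)],
    "S2": [("X", "S1"), ("R", None)],
}
ALPHABET = "MVTXR"


def generate(rng, max_len=8):
    """Random walk through the grammar; retry if the walk runs too long."""
    while True:
        state, out = "S0", []
        while state is not None and len(out) < max_len:
            symbol, state = rng.choice(TOY_FSG[state])
            out.append(symbol)
        if state is None:  # keep only walks that reached an end transition
            return "".join(out)


def accepts(word):
    """Return True if the toy grammar can produce `word`."""
    state = "S0"
    for ch in word:
        if state is None:
            return False  # extra symbols after the end transition
        transitions = dict(TOY_FSG[state])
        if ch not in transitions:
            return False
        state = transitions[ch]
    return state is None


def corrupt(word, rng):
    """Swap one symbol at random until the result violates the grammar."""
    while True:
        i = rng.randrange(len(word))
        replacement = rng.choice([c for c in ALPHABET if c != word[i]])
        candidate = word[:i] + replacement + word[i + 1:]
        if not accepts(candidate):
            return candidate


def make_problem(rng):
    """Return (word, label): label is True when the word follows the grammar."""
    word = generate(rng)
    if rng.random() < 0.5:
        return word, True
    return corrupt(word, rng), False


rng = random.Random(0)
problems = [make_problem(rng) for _ in range(20)]
```

Each `(word, label)` pair is one classification problem: a model would see a few grammatical examples and then be asked whether a new word fits the pattern, with or without step-by-step reasoning.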


  • Facial recognition test:

- The model views a person's face and must pick the same person from five options.

- Researchers generated 500 synthetic problems using 2,500 unique faces. Each problem had one target face plus four others with similar features.

Result: Verbal reasoning misses fine visual details; simpler prompts work better for nuanced visual tasks.


  • Learning patterns with exceptions:

- Each vehicle was described by several features: one that matched the label most of the time (80%), three that were unrelated to the label, and a unique color that identified the vehicle exactly.

- The model had to label all vehicles correctly.

Result: CoT prompting slowed learning, with models taking up to 4x as long to get every label right.
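The structure of this task can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the paper's data: the set size of 10, the labels "A" and "B", and all field and function names are hypothetical.

```python
import random


def make_vehicles(n=10, rule_accuracy=0.8, seed=0):
    """Build n toy vehicles (hypothetical layout, not the paper's dataset)."""
    rng = random.Random(seed)
    n_exceptions = n - round(n * rule_accuracy)
    exception_ids = set(rng.sample(range(n), n_exceptions))
    vehicles = []
    for i in range(n):
        label = rng.choice(["A", "B"])
        flipped = "B" if label == "A" else "A"
        vehicles.append({
            "color": f"color_{i}",  # unique per vehicle
            # Matches the label except for the chosen exceptions.
            "predictive": flipped if i in exception_ids else label,
            # Three features that carry no information about the label.
            "unrelated": [rng.randint(0, 1) for _ in range(3)],
            "label": label,
        })
    return vehicles


def rule_only_accuracy(vehicles):
    """Accuracy of always trusting the mostly-right feature."""
    hits = sum(v["predictive"] == v["label"] for v in vehicles)
    return hits / len(vehicles)


def color_lookup_accuracy(vehicles):
    """Accuracy of memorizing each vehicle by its unique color."""
    table = {v["color"]: v["label"] for v in vehicles}
    hits = sum(table[v["color"]] == v["label"] for v in vehicles)
    return hits / len(vehicles)


vehicles = make_vehicles()
print(rule_only_accuracy(vehicles))     # 0.8
print(color_lookup_accuracy(vehicles))  # 1.0
```

The sketch shows why the task is about exceptions: a learner that sticks to the general rule tops out at 80%, and only by tracking individual vehicles can it label all of them correctly.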


However, in some tasks where thinking out loud hurts humans, the same limitation doesn't carry over to models, because of key differences in how humans and models process information.

These tasks are:

- Explaining logical inconsistencies

- Spatial intuitions

- Aggregating features for decision-making


Original paper: https://arxiv.org/pdf/2410.21333
