Overthinking can trip up not only people: there are tasks where Chain-of-Thought doesn't help AI models either.
Researchers from Princeton University and New York University investigated 3 cases where Chain-of-Thought (CoT, or step-by-step thinking) prompting can lead to worse outcomes in both humans and models:
- Implicit learning tasks (learning patterns without explicitly thinking about them)
- Visual tasks (recognizing images or objects at a glance)
- Learning with exceptions (where some rules don't always apply)
The researchers drew on insights from human psychology to predict when CoT aids or hinders model performance, applying two criteria to identify tasks where CoT might reduce it:
- Does verbal thinking lower human performance on the task?
- Do the constraints that cause this also apply to AI models?
And here are the key findings:
Implicit learning:
- Researchers used finite-state grammars (FSGs) to create artificial "words," building 4,400 tasks with words that either matched the grammar or deviated slightly from it.
- After seeing examples, the model had to identify which words matched the pattern.
Result: CoT prompting hurt model performance.
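To make the setup concrete, here is a minimal sketch of a Reber-style finite-state grammar, the kind of generator used in artificial-grammar-learning experiments. The transition table and letters are illustrative assumptions, not the paper's actual grammars.

```python
import random

# Illustrative finite-state grammar: each state maps to possible
# (letter, next_state) transitions. NOT the grammar from the paper.
FSG = {
    "S0": [("M", "S1"), ("V", "S2")],
    "S1": [("T", "S1"), ("V", "S3")],
    "S2": [("X", "S2"), ("R", "S3")],
    "S3": [("M", "END"), ("X", "END")],
}

def generate_word(max_len: int = 10) -> str:
    """Walk the grammar from S0 to END, emitting one letter per transition."""
    state, letters = "S0", []
    while state != "END" and len(letters) < max_len:
        letter, state = random.choice(FSG[state])
        letters.append(letter)
    return "".join(letters)

def perturb(word: str) -> str:
    """Create a near-miss by swapping one letter for another grammar letter.
    (A simplification: rarely, the result may still be grammatical.)"""
    i = random.randrange(len(word))
    alt = random.choice([c for c in "MTVXR" if c != word[i]])
    return word[:i] + alt + word[i + 1:]

if __name__ == "__main__":
    grammatical = generate_word()
    print("grammatical:", grammatical)
    print("near-miss:  ", perturb(grammatical))
```

The key property is that grammatical and near-miss words look almost identical, so the pattern is easier to absorb implicitly from examples than to articulate as explicit rules.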
Facial recognition:
- The model sees a target face and must pick the matching one from five options.
- Researchers generated 500 synthetic problems using 2,500 unique faces; each problem had one target face plus four distractors with similar features.
Result: Verbalizing the reasoning misses fine visual details; simpler, direct prompts work better for nuanced visual tasks.
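A hedged sketch of how the two prompting styles might be compared on this task. The prompt wording is illustrative, and `query_vlm` is a hypothetical wrapper around whatever vision-language model API you use; neither comes from the paper.

```python
import re

DIRECT_PROMPT = (
    "Here is a target face and five candidates labeled A-E. "
    "Answer with only the letter of the candidate that matches the target."
)

COT_PROMPT = (
    "Here is a target face and five candidates labeled A-E. "
    "First describe the target's features (eyes, nose, jawline) in words, "
    "compare each candidate against that description step by step, "
    "then give the letter of the matching candidate."
)

def extract_choice(response: str) -> str | None:
    """Take the last standalone A-E letter as the model's answer."""
    letters = re.findall(r"\b([A-E])\b", response)
    return letters[-1] if letters else None

def evaluate(query_vlm, problems):
    """Score both prompting styles on the same face-matching problems.

    `problems` is a list of (images, correct_letter) pairs; `query_vlm`
    is a hypothetical function taking (prompt, images) -> response text.
    """
    scores = {"direct": 0, "cot": 0}
    for images, answer in problems:
        if extract_choice(query_vlm(DIRECT_PROMPT, images)) == answer:
            scores["direct"] += 1
        if extract_choice(query_vlm(COT_PROMPT, images)) == answer:
            scores["cot"] += 1
    return scores
```

The intuition from the human literature (verbal overshadowing) is that forcing the match through a verbal description, as the CoT prompt does, discards exactly the fine-grained visual detail the task depends on.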
Learning with exceptions:
- The model had to classify vehicles by their features: one feature matched the label 80% of the time, three were irrelevant, and each vehicle had a unique color that identified it exactly.
- The model had to learn to label every vehicle correctly, including the exceptions.
Result: With CoT prompting, models took up to 4x longer to learn the task.
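A minimal sketch of this data setup, under stated assumptions: the feature names, label set, and counts below are illustrative, not the paper's exact stimuli.

```python
import random

def make_vehicles(n: int = 10, reliability: float = 0.8) -> list[dict]:
    """Generate vehicles where one feature predicts the label 80% of the
    time, three features are irrelevant, and a unique color identifies
    each vehicle, so a learner must memorize the exceptions by color."""
    vehicles = []
    for i in range(n):
        label = random.choice(["car", "truck"])
        # The predictive feature agrees with the label with probability
        # `reliability`; the remaining ~20% of vehicles are the exceptions.
        if random.random() < reliability:
            predictive = label
        else:
            predictive = "truck" if label == "car" else "car"
        vehicles.append({
            "predictive_feature": predictive,
            "irrelevant_features": [random.randint(0, 1) for _ in range(3)],
            "color": f"color_{i}",   # unique per vehicle
            "label": label,
        })
    return vehicles

if __name__ == "__main__":
    for v in make_vehicles(5):
        print(v)
```

Because the predictive feature is only 80% reliable, a learner that reasons its way to a general rule keeps misclassifying the exceptions; memorizing each vehicle by its unique color is the faster route, which may be why step-by-step reasoning slowed learning here.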
However, in some cases, human limitations don’t apply to models because of key differences in how they process information.
These tasks are:
- Explaining logical inconsistencies
- Using spatial intuitions
- Aggregating features for decision-making
Original paper: https://arxiv.org/pdf/2410.21333