FOD#44: How Far Are We?


In today's edition:

  • What does research say about how far we are from AGI?
  • News from the usual subjects: Anthropic, Cohere, OpenAI, etc.
  • The freshest AI&ML research papers from the week of Mar 4 — Mar 10


Though the question “How far are we from achieving human-level intelligence in machines (or AGI, or ASI)?” predates the term “artificial intelligence” itself, it saw a significant resurgence on Twitter last week, prompted by the Musk vs. OpenAI lawsuit (Musk accuses OpenAI of abandoning open-source principles and prioritizing profit over safety, hindering the safe development of AGI). Far more interesting, though, were the two papers and an article published last week that tackle this question. Today, we will discuss “How Far Are We from Intelligent Visual Deductive Reasoning?”, “Design2Code: How Far Are We From Automating Front-End Engineering?”, and Stephen Wolfram’s article “Can AI Solve Science?” These works offer fascinating explorations of the differences between human and artificial intelligence.

Intelligent Visual Deductive Reasoning

In “How Far Are We from Intelligent Visual Deductive Reasoning?”, researchers from Apple evaluate Vision-Language Models (VLMs), such as GPT-4V, on vision-based deductive reasoning, a complex but less studied area, using Raven’s Progressive Matrices (RPMs)*.

*Raven’s Progressive Matrices are a nonverbal intelligence test measuring abstract reasoning, using patterns to assess cognitive functioning without language.

What caught my attention was the finding that AI systems like VLMs struggle with tasks requiring abstract pattern recognition and deduction. The paper notes, “VLMs struggle to solve these tasks mainly because they are unable to perceive and comprehend multiple, confounding abstract patterns in RPM examples.” This inability to deal with abstract concepts marks a fundamental difference between computational processing and human cognitive abilities. Being a sophisticated pattern recognizer doesn’t equate to sentience.

Another intriguing point was the models’ overconfidence. The observation that “all the tested models never express any level of uncertainty” highlights the importance of doubt and uncertainty in human cognition, suggesting a nuanced aspect of intelligence that current AI lacks.

Automating Front-End Engineers

In “Design2Code: How Far Are We From Automating Front-End Engineering?”, researchers from Stanford University, Georgia Institute of Technology, Microsoft, and Google DeepMind have developed a benchmark for Design2Code, aiming to evaluate how well multimodal LLMs convert visual designs into code. Here, the prospect of replacing humans comes closer. Despite some limitations, the paper reports considerable advances in using generative AI to convert designs into front-end code. It’s remarkable that “annotators think GPT-4V generated webpages can replace the original reference webpages in 49% of cases in terms of visual appearance and content; and in 64% of cases, GPT-4V generated webpages are considered better.” This finding challenges traditional notions of artistic and creative value, questioning whether creativity is uniquely human or can be algorithmically reproduced, or even surpassed.
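If you are curious what this design-to-code setup looks like in practice, here is a minimal, illustrative sketch of prompting a vision-capable model to turn a screenshot into HTML. It is not the paper’s evaluation harness; the model name, prompt, and file name are assumptions you would adapt to your own setup.

```python
# Minimal sketch: ask a vision-capable model to reproduce a design screenshot as HTML.
# Illustrative only — not the Design2Code benchmark harness. Assumes the OpenAI Python
# SDK (v1.x) and an OPENAI_API_KEY set in the environment; adjust the model name as needed.
import base64
from openai import OpenAI

client = OpenAI()

# Load the design screenshot and encode it as a base64 data URL.
with open("design_screenshot.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # any multimodal model that accepts image input
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": (
                        "Reproduce this webpage as a single self-contained HTML file "
                        "with inline CSS. Match the layout, text, and colors as closely "
                        "as possible."
                    ),
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ],
        }
    ],
    max_tokens=4096,
)

# The generated HTML, which would then be rendered and compared against the reference page.
print(response.choices[0].message.content)
```

In the benchmark, the rendered result is compared against the original reference webpage, which is where the annotator judgments quoted above come from.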

However, significant limitations persist. VLMs struggle with “recalling visual elements from the input webpages and generating correct layout designs,” posing questions about understanding and interpretation.

So the important question is not actually how far we are from AGI (whatever that means), but how we can embrace human-AI collaboration most effectively.

AI Solving Science

In that sense, Stephen Wolfram’s blog post “Can AI Solve Science?” serves as an excellent example. Right at the start, he plainly states that AI cannot solve all scientific questions; there is, however, significant value in AI assisting scientific progress. He discusses how LLMs can serve as a new kind of linguistic interface to computational capabilities, providing a high-level “autocomplete” for scientific work. As he usually does, he emphasizes the transformative potential of representing the world computationally and suggests that pockets of computational reducibility* can be found by AI as well.

*A pocket of computational reducibility — a fascinating concept introduced by Wolfram — is a situation or problem within a complex system where, despite the system’s overall unpredictability, predictable patterns or simplified behaviors emerge, allowing for easier understanding or calculation.

Wolfram argues that AI can significantly aid scientific discovery by providing new tools for analysis and exploration, but its ability to completely “solve” science is limited by fundamental principles such as computational irreducibility. The future of AI in science lies in its integration with human creativity and understanding, leveraging its strengths to uncover new knowledge within the constraints of what is computationally possible.

We might be able to survive without front-end developers (no offense intended), but scientists remain indispensable!


Enjoyed This Story?

This article was originally sent via email to the subscribers of our Turing Post newsletter. Subscribe for free and be the first to read the latest stories.


Read further on our website:

  • News from the usual subjects: Anthropic, Cohere, OpenAI, etc.
  • The freshest AI&ML research papers from the week of Mar 4 — Mar 10




