Observations on "Managing extreme AI risks amid rapid progress"

The paper “Managing extreme AI risks amid rapid progress” by Bengio et al. is the most important AI paper of 2024 (so far), IMO. The main questions it raises and addresses are: how serious are AI safety risks, and how should we govern them?

Its assessment is summed best in these lines: “[W]e must take seriously the possibility that highly powerful generalist AI systems that outperform human abilities across many critical domains will be developed within this decade or the next.”

The paper makes the clearest case I have seen for being extremely cautious with potentially breakout AI technology (“RAPID PROGRESS, HIGH STAKES”):

1. AI IS GETTING BETTER FASTER, a trend that is likely to accelerate due to continuing innovations in hardware and algorithms, punctuated by breakthroughs. What adds fuel to the fire is that “progress in AI also enables faster AI progress.” So, there’s a direct feedback loop. While they don’t say it in so many words, this all amounts to suggesting that AI improvement appears to have increasing returns to scale.

2. THERE IS A POINT OF NO RETURN: What the authors are most concerned about is the point at which AI systems get smart enough to both anticipate and scuttle human efforts to control them: “Once autonomous AI systems pursue undesirable goals, we may be unable to keep them in check.”

3. THERE IS MISALIGNMENT BETWEEN INTENDED OUTCOMES AND TRAINING GOALS. A root cause of why (and how) AI systems could go rogue is the misalignment, whether intentional (by malicious actors) or unintentional (by well-meaning AI developers), between the reward signals used to train AI systems and the intended outcomes.

To prevent the worst outcomes of AI safety risks from materializing, one of the paper’s central suggestions is to “REORIENT TECHNICAL R&D”, that is, to reorient research toward aligning AI systems better with societal goals. Given the dire straits (in the authors’ view) of AI safety research relative to the potentially extraordinary risks, the paper makes several recommendations to close the gap, including improving oversight and honesty, increasing robustness and resilience, and pursuing inclusive AI development to mitigate biases and better represent the diversity of social values. Perhaps the strongest recommendation the authors make is for much more directed R&D on evaluating dangerous capabilities and AI alignment: “Improved evaluation tools decrease the chance of missing dangerous capabilities…”. This is coupled with a “call on major tech companies and public funders to allocate one-third of their AI R&D budget… toward...ensuring AI safety and ethical use.”

The paper ends with a discussion of “GOVERNANCE MEASURES”, noting that current “governance plans fall critically short in view of the rapid progress in AI capabilities”. While directionally right, this part of the paper is the least certain; it is vague and a bit inconsistent in places:

1. “We need fast-acting, tech-savvy institutions for AI oversight…national institutions need strong technical expertise and the authority to act swiftly.” This is an important need, but the paper is silent on its feasibility. Given what we know from the challenges and failures in climate change governance at national and international levels—an issue that the paper surprisingly doesn’t bring up in any context—I am NOT at all hopeful that the institutional requirements will be met on the timeline the authors outline (“this decade or the next”). And if the governance/institutional part is infeasible in a timely fashion, then the rest of the measures the authors suggest are also on shaky ground.

2. The paper floats the idea of risk-adjusted governance: “If AI advances rapidly, strict requirements automatically take effect, but if progress slows, the requirements relax accordingly.” While an interesting idea, the paper does not build a convincing argument for why it will work. It might work in a highly transparent, coordinated, and responsive (global) ecosystem, where it is clear who the actors are and the means to control them are precise and effective. But none of these criteria are met in the case of AI progress. I believe that the practicality and effectiveness of a risk-adjusted governance approach for managing extreme AI risk is bound to be limited, given the highly heterogeneous set of international commercial, state, and non-state actors and objectives involved.

So, what then? As I summarized in an earlier post, HOW SHOULD WE REGULATE AI?:

“Only relying on regulation will be too much or too little, definitely too late, and highly heterogeneous to have any meaningful impacts… effectively moving these companies and actors toward social good will mean persistent and effective shareholder, customer, stakeholder, and regulatory oversight, ALL COMBINED. That’s when real check can be effective and lasting.”


The paper raises many important questions. But there aren’t any clear and complete answers just yet. There certainly aren’t any simple ones.
