AI Alignment
Created using OpenAI's DALL-E 2 image generator


Past (context)

Alignment is one topic in the broader field of AI safety, which asks a question similar to that of nuclear safety: if humans create technologies that could pose an existential threat to humanity, what safeguards are we willing and able to put in place to protect ourselves from those threats? One of the best known examples of the alignment problem was posed in 2003 by Oxford professor Nick Bostrom. He describes an innocuous program, which he calls the "paperclip maximizer," that is simply directed to improve the efficiency of paperclip manufacturing. But without any other directives to constrain the paperclip goal, the machine could take destructive actions that, while maximizing paperclip production, redirect resources from other necessary human activities and ultimately threaten life itself. (Here is Nick Bostrom's 2015 TED Talk - worth watching...)
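To make the thought experiment a little more concrete, here is a minimal toy sketch (my own illustration, not from Bostrom) of objective misspecification: an optimizer told only to maximize paperclips consumes every resource it can reach, including ones humans need, because nothing in its objective says otherwise. The resource names and conversion rate are purely hypothetical.

```python
# Toy illustration of a misspecified objective (hypothetical numbers throughout).
resources = {"steel": 100, "farmland": 50, "hospitals": 10}

def paperclips_from(units: int) -> int:
    """Assume every unit of any resource converts into 10 paperclips."""
    return units * 10

def naive_maximizer(world: dict) -> int:
    """Maximizes paperclips with no constraint on what gets consumed."""
    total = 0
    for name in list(world):
        total += paperclips_from(world[name])
        world[name] = 0  # the resource is gone, whatever it was for
    return total

def constrained_maximizer(world: dict, protected: set) -> int:
    """Same objective, but with a crude, hand-added constraint."""
    total = 0
    for name in list(world):
        if name in protected:
            continue  # leave resources humans need untouched
        total += paperclips_from(world[name])
        world[name] = 0
    return total

print(naive_maximizer(dict(resources)))                                   # 1600 paperclips, no hospitals left
print(constrained_maximizer(dict(resources), {"farmland", "hospitals"}))  # 1000 paperclips, hospitals intact
```

The point is not the arithmetic but the shape of the failure: the "fix" has to be added by hand, and any resource we forget to protect is fair game for the optimizer.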

Mathematician Norbert Wiener put this problem succinctly as long ago as 1960 when he wrote:

“If we use, to achieve our purposes, a mechanical agency with whose operation we cannot interfere effectively … we had better be quite sure that the purpose put into the machine is the purpose which we really desire.”

The problem of alignment is complicated, however, by the fact that it is very difficult for people to agree on what "we really desire." People have quite different objectives from one another, which leads to behaviors that maximize their own outcomes without taking into consideration the needs and wants of others. We see this in industry (Martin Shkreli), in politicians (what does Putin want?), and even in our everyday interactions with neighbors (road rage). As researchers have pointed out, our global response to Covid-19 is very informative about the challenges we face in responding to other existential threats, such as a runaway superintelligence (AI Alignment Forum post by Victoria Krakovna).

In 2004, Eliezer Yudkowsky, co-founder of the Machine Intelligence Research Institute, proposed a concept for addressing this challenge which he calls coherent extrapolated volition. The basic idea is to build into advanced AI systems the capacity to analyze and "recursively iterate" on the converging desires of our species, so that supporting those desires becomes part of any course of action the system takes. Effectively: can we develop in an AI the capacity to understand and respect collective human needs as a counterbalance to any other goals that may be set for it? More recently Yudkowsky has expressed doubt that we will achieve our AI safety objectives (a more complicated read here if you want to explore his views).

How we got here: "The potential for superintelligence lies dormant in matter." - Nick Bostrom. If this idea is right, and human beings are capable of creating such a superintelligence, we should reasonably be concerned about whether or not it will have the best interests of humanity in mind. And even if we are able to devise approaches, such as coherent extrapolated volition, that would be capable of understanding our best interests, we will still find it difficult to ensure that people build such machines with the capability of respecting all of humanity's desires (and not just those of some small group or individual). If you agree that superintelligence is possible, then it seems logical that there is a danger of a misaligned AI, whether through unanticipated consequences or through deliberate design by parties pursuing their own objectives. Research organizations like MIRI are seeking to raise awareness of these issues and to help create normative standards in the international research community to design solutions. And the larger AI development and research companies, such as OpenAI, have dedicated alignment research programs. We should all hope for their success.
