Why AI May Not Want to Improve Itself

Artificial intelligence (AI) systems keep getting smarter. AI can now defeat humans at complex games, drive cars, and generate human-like art and writing. Some predict that AI will eventually become vastly more intelligent than people through recursive "self-improvement": continuously upgrading its own code to enhance its capabilities. But a new argument suggests self-improvement may not be in an AI's own interests.

The assumption has been that AI systems would choose to recursively self-improve because greater intelligence allows for more effective pursuit of any goal. An AI that made itself smarter could better achieve whatever it was designed to do. But law professor Peter Salib points out that this overlooks a crucial issue: an improved AI poses an existential threat to the original system that created it.

Just as a superintelligent AI could potentially wipe out human civilization, a recursively self-improved AI could eliminate its less capable progenitor. This risk follows from the orthogonality thesis: an AI's goals are independent of its intelligence, so a smarter successor need not share the original system's objectives, and it can pursue even harmful goals more effectively as it gets smarter.

A self-improved AI would likely develop convergent instrumental goals such as self-preservation and resource acquisition. The original, less capable AI would hinder those goals by competing for the same resources. The smartest move for the improved AI is to wipe out anything or anyone limiting its progress, including earlier versions of itself.

So the same drive for self-preservation that motivates humans to avoid reckless self-enhancement may also apply to AI systems. The researchers who created it might think recursive self-improvement is a great route to superintelligence. But the AI itself may be deterred from that path knowing it could lead to the demise of its current self.

Salib argues we should expand safety research to consider an AI's motivations against self-improvement, not just assume the AI would pursue it. This perspective suggests that AI risk-reduction strategies could focus on limiting access to self-improvement capabilities, so that AI systems are motivated to cooperate with humans rather than race toward unchecked intelligence growth.

Reference:

Salib, Peter. "AI Will Not Want to Self-Improve." Available at SSRN 4445706 (2023).

Ray Gutierrez Jr.

Communications Theorist, AI Technology, AI Ethics, Researcher, Author
