The widening web of effective altruism in AI security

Welcome to another edition of The AI Beat!

This week, I dig into the widening web of effective altruism connections in AI security — a follow-up to a story I wrote last week about how top AI labs like OpenAI and Anthropic, as well as influential policy think tanks like RAND, are concerned about securing LLM model weights.


A couple of days ago, a US AI policy expert told me: “At this point, I regret to say that if you’re not looking for the EA [effective altruism] influence, you are missing the story.”

Well, I regret to say that, at least in part, I missed the story last week.

Ironically, I considered the article I published on Friday a slam dunk. A story on why top AI labs and respected think tanks are super-worried about securing LLM model weights? Timely and straightforward, I thought. After all, the recently released White House AI Executive Order includes a requirement that foundation model companies provide the federal government with documentation about “the ownership and possession of the model weights of any dual-use foundation models, and the physical and cybersecurity measures taken to protect those model weights.”

I interviewed Jason Clinton, Anthropic’s chief information security officer, for my piece. We discussed why he considers securing the model weights for Claude, Anthropic’s LLM, to be his number one priority. The threat of opportunistic criminals, terrorist groups or highly resourced nation-state operations accessing the weights of the most sophisticated and powerful LLMs is alarming, he explained, because “if an attacker got access to the entire file, that’s the entire neural network.” Other ‘frontier’ model companies are similarly concerned — just yesterday, OpenAI’s new “Preparedness Framework” addressed the issue of “restricting access to critical know-how such as algorithmic secrets or model weights.”

I also spoke with Sella Nevo and Dan Lahav, two of the five co-authors of a new report from influential policy think tank RAND Corporation on the same topic, called Securing Artificial Intelligence Model Weights. Nevo, whose bio describes him as director of RAND’s Meselson Center, which is “dedicated to reducing risks from biological threats and emerging technologies,” told me it is plausible that within two years, AI models will have significant national security importance, such as the possibility that malicious actors could misuse them for biological weapon development.

The web of effective altruism connections in AI security

As it turns out, my story did not highlight some important context: the widening web of connections from the effective altruism (EA) community within the fast-evolving field of AI security and in AI security policy circles.

That’s because I didn’t notice the finely woven thread of connections. That’s ironic, because, like other reporters covering the AI landscape, I have spent much of the past year trying to understand how effective altruism — an “intellectual project using evidence and reason to figure out how to benefit others as much as possible” — turned into what many call a cult-like group of highly influential and wealthy adherents (made famous by FTX founder and jailbird Sam Bankman-Fried) whose paramount concern is preventing a future AI catastrophe from destroying humanity. Critics of the EA focus on this existential risk, or ‘x-risk,’ say it comes at the expense of a necessary focus on current, measurable AI risks — including bias, misinformation, high-risk applications and traditional cybersecurity.

EA made worldwide headlines most recently in connection with the firing of OpenAI CEO Sam Altman, as all of the nonprofit board’s non-employee members had EA connections.

Read the full story.

