What would AI agents need to do to establish resilient rogue populations? Are there decisive barriers (e.g. KYC)? How likely are rogue AI populations to reach a large scale? How hard would it be to shut them down? We are sharing some analysis of these questions: https://lnkd.in/gwHnDKVm

In the hypothesized "rogue replication" threat model, AI agents earn revenue, buy compute, and earn more revenue until they have established a large population. These AI agents are rogue (not directed by humans) and might represent a hazardous new type of threat actor.

We did not find any *decisive* barriers to large-scale rogue replication. To start with, if rogue AI agents secured 5% of the current Business Email Compromise (BEC) scam market, they would earn hundreds of millions of USD per year. Rogue AI agents are not legitimate legal entities, which could pose a barrier to purchasing GPUs; however, it likely wouldn’t be hard to bypass basic KYC with shell companies, or they might buy retail gaming GPUs (which we estimate account for ~10% of current inference compute).

To avoid being shut down by authorities, rogue AI agents might set up a decentralized network of stealth compute clusters. We spoke with domain experts and concluded that if AI agents competently implement known anonymity solutions, they could likely hide most of these clusters.

Although we didn't find decisive barriers to rogue replication, it's unclear whether replicating rogue agents will be an important threat actor. This threat model rests on many conjuncts (model weight proliferation, large-scale compute acquisition, resilience to shutdown, etc.). In particular, rogue AI populations might struggle to grow due to competition with human-directed AI. We notice a counterbalancing dynamic: if models are broadly proliferated, they face fierce competition, and if not, rogue AI agents are less likely to emerge at all.

METR is not currently prioritizing the specific rogue replication threat model described here; however, we think some of the capabilities involved are important to evaluate (e.g. autonomy and adaptation) and could amplify risks from AI agents.
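As a rough illustration of the revenue claim above, here is a minimal back-of-envelope sketch in Python. The market-size range is an assumed illustrative figure (publicly reported BEC losses are on the order of a few billion USD per year, and unreported losses add more); it is not a number taken from the linked report. The 5% share is the figure quoted in the post.

```python
# Back-of-envelope sketch of the revenue claim above.
# Assumption: the annual BEC scam market is a few billion USD per year
# (the range below is illustrative, not from the linked report).
bec_market_low_usd, bec_market_high_usd = 3e9, 6e9
rogue_agent_share = 0.05  # 5% share, as quoted in the post

low = bec_market_low_usd * rogue_agent_share
high = bec_market_high_usd * rogue_agent_share
print(f"Estimated annual revenue: ${low / 1e6:.0f}M - ${high / 1e6:.0f}M")
# With these assumptions: roughly $150M-$300M/year,
# i.e. on the order of hundreds of millions of USD.
```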
About us
METR works on assessing whether cutting-edge AI systems could pose catastrophic risks to civilization.
- Website: https://metr.org/
- Industry: Non-profit organizations
- Company size: 11-50 employees
- Headquarters: Berkeley, CA
- Type: Nonprofit
- Founded: 2022
Locations
- Primary: Berkeley, CA, US

Posts
-
Here's more on the major support for our work through The Audacious Project. This funding will enable METR and RAND to develop methods to measure the capabilities of AI systems, perform third-party evaluations, and help decision makers use empirical testing for risk management.
Introducing Project Canary. AI is advancing rapidly, with experts predicting various forms of AI-enabled prosperity and catastrophe. Project Canary is advancing the empirical science of testing for specific risks, to ensure the benefits of AI can be realized. Like a canary in a coal mine, Project Canary will develop methods to alert society to potential AI dangers in time for effective action. This will let society make informed decisions about the direction of this transformative technology. https://lnkd.in/e5i9cD9u #AudaciousProject
-
We’re honored to be part of the collaborative funding initiative The Audacious Project’s 2024 cohort. $17M in new funding will support and expand our work empirically assessing the risks of frontier AI systems. More here: https://lnkd.in/gj-dVx2S
-
Vivaria, METR's open source evaluations and agent-elicitation tool, now supports GPQA Diamond, GAIA, HumanEval, PicoCTF, and other evals out-of-the-box to compare models and agents.
Get started: vivaria.metr.org
Example eval suites: https://lnkd.in/gsAWbtJq
Vivaria
vivaria.metr.org
-
METR provided a public comment on the U.S. AI Safety Institute’s valuable draft document “Managing Misuse Risk for Dual-Use Foundation Models.” In our comment, we offer several key recommendations:
1. Discuss additional non-misuse risks of AI, including loss of control.
2. Provide security recommendations to prevent model theft by advanced adversaries.
3. Expand guidance on model capability evaluations to include mid-training assessments, full capability elicitation, and third-party evaluations.
4. Provide additional guidelines for testing safeguard robustness, including automated methods.
5. Offer actionable suggestions for managing risks from deployment modalities, such as fine-tuning APIs and model weight access.
6. Include more detailed technical suggestions for implementing safeguards.
7. Suggest releasing AI safety frameworks with public commitments to evaluate and manage risks from dual-use foundation models.
Read our full comment: https://lnkd.in/gPYRGcPR
Regulations.gov
regulations.gov
-
We were glad to help Magic develop their AGI Readiness Policy. We believe that when AI developers publish publicly legible thresholds at which they would currently be unprepared to safely develop their models, it's a valuable step towards making safety concrete.
AGI Readiness Policy — Magic
magic.dev
-
From September 3rd to 9th, we ran OpenAI's o1-preview on our suite of ML R&D/SWE/general agency tasks. Four days of scaffolding iteration took it from well below GPT-4o to on par with the highest-scoring public model (3.5 Sonnet). We expect substantial performance gains from more elicitation/finetuning.

The o1-preview agent made nontrivial progress on 2 of 7 challenging AI R&D tasks (intended for skilled research engineers to take ~8h). For example, it was able to create an agent scaffold that allowed GPT-3.5 to solve coding problems in Rust, and to fine-tune GPT-2 for question answering.

For more discussion of the challenges of eliciting o1-preview's capabilities, qualitative impressions of the model, and reasons why we believe our evaluations likely underestimate the model’s capabilities, read our full report here: https://lnkd.in/gk44SPDh
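For readers unfamiliar with the term, an "agent scaffold" is roughly a control loop wrapped around a language model: it gives the model a task, executes the actions the model proposes, and feeds the results back as new context. The sketch below is a generic, heavily simplified illustration of that pattern under stated assumptions; it is not the scaffold o1-preview produced nor the one METR uses, and `call_model` is a hypothetical stand-in for whatever chat-completion API you use.

```python
import subprocess

def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a chat-completion API call."""
    raise NotImplementedError("wire this up to your model provider of choice")

def solve_with_feedback(task: str, max_iters: int = 5) -> str:
    """Generic agent-scaffold loop: propose code, run it, feed errors back."""
    context = f"Task: {task}\nWrite a Python script that solves it. Output only code."
    code = ""
    for _ in range(max_iters):
        code = call_model(context)
        # Run the proposed script and capture its output.
        result = subprocess.run(
            ["python", "-c", code], capture_output=True, text=True, timeout=60
        )
        if result.returncode == 0:
            return code  # the script ran without errors
        # Append the error output so the model can revise its attempt.
        context += f"\n\nYour code failed with:\n{result.stderr}\nPlease fix it."
    return code
```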
-
Beth Barnes, METR's founder and head of research, has been included on TIME's list of the 100 most influential people in AI for 2024.
TIME100 AI 2024: Beth Barnes
time.com
-
We’ve published a new document, Common Elements of Frontier AI Safety Policies, that describes the emerging practice for AI developer policies that address the Seoul Frontier AI Safety Commitments. We identify several components shared among Anthropic’s Responsible Scaling Policy, OpenAI’s Preparedness Framework, and Google DeepMind’s Frontier Safety Framework, with relevant excerpts from each policy. Full document here: https://lnkd.in/g9Gxu483
Common Elements of Frontier AI Safety Policies
metr.org
-
We are looking for ML research engineers/scientists to help drive AI R&D evals forward. Over the last few months, we’ve increased our focus on developing evaluations for automated AI research and development, because we think this capability could be extraordinarily destabilizing if realized. Learn more here: https://lnkd.in/ghNtrEzp #hiring #AI #ML #MLEngineer #MLResearcher
ML Engineers Needed for New AI R&D Evals Project
metr.org