An update on OpenAI's safety & security practices


Disclaimer: Views shared are personal opinions; they do not reflect the policies, views, or opinions of the company itself.
        


The Safety and Security Committee will become an independent Board oversight committee, chaired by Zico Kolter, to oversee critical safety and security measures related to model development and deployment.

We established the Safety and Security Committee to make recommendations on critical safety and security decisions as OpenAI models reach each new level of capability and continue solving hard problems for people around the world. As one of its initial mandates, the Safety and Security Committee conducted a 90-day review of safety and security-related processes and safeguards and made recommendations to the full Board.

Following the full Board's review, we are now sharing the Safety and Security Committee’s recommendations across five key areas, which we are adopting. These include enhancements we have made to build on our governance, safety, and security practices.

  1. Establishing independent governance for safety & security
  2. Enhancing security measures
  3. Being transparent about our work
  4. Collaborating with external organizations
  5. Unifying our safety frameworks for model development and monitoring

The Safety and Security Committee reviewed the safety and security criteria that OpenAI used to assess OpenAI o1’s fitness for launch as well as the results of safety evaluations of OpenAI o1. We’re committed to continuously improving our approach to releasing highly capable and safe models, and value the crucial role the Safety and Security Committee will play in shaping OpenAI's future.

1. Establishing Independent Governance for Safety & Security

The Safety and Security Committee will become an independent Board oversight committee focused on safety and security, to be chaired by Zico Kolter, Director of the Machine Learning Department within the School of Computer Science at Carnegie Mellon University, and including Adam D’Angelo, Quora co-founder and CEO, retired US Army General Paul Nakasone, and Nicole Seligman, former EVP and General Counsel of Sony Corporation. It will oversee, among other things, the safety and security processes guiding OpenAI’s model development and deployment.

The Safety and Security Committee will be briefed by company leadership on safety evaluations for major model releases, and will, along with the full Board, exercise oversight over model launches, including having the authority to delay a release until safety concerns are addressed. As part of its work, the Safety and Security Committee and the Board reviewed the safety assessment of the o1 release and will continue to receive regular reports on technical assessments for current and future models, as well as reports of ongoing post-release monitoring. The Safety and Security Committee will also benefit from regular engagement with representatives from OpenAI’s safety and security teams. Periodic briefings on safety and security matters will also be provided to the full Board.

2. Enhancing Security Measures

Cybersecurity is a critical component of AI safety, and we’ve been a leader in defining the security measures that are needed for the protection of advanced AI. We will continue to take a risk-based approach to our security measures, and evolve our approach as the threat model and the risk profiles of our models change. We are pursuing expanded internal information segmentation, additional staffing to deepen around-the-clock security operations teams, and continued investment in ongoing initiatives to enhance the security of OpenAI’s research and product infrastructure.

As part of the Safety and Security Committee’s review, we identified additional opportunities for industry collaboration and information sharing to advance the security of the AI industry. For example, we are carefully evaluating the development of an Information Sharing and Analysis Center (“ISAC”) for the AI industry, to enable the sharing of threat intelligence and cybersecurity information among entities within the AI sector to enhance our collective resilience against cyber threats.

3. Being Transparent About Our Work

We will look for more ways to share and explain our safety work. We have long published system cards which outline the capabilities and risks of our models. The GPT-4o system card and o1-preview system card provided comprehensive details about the safety work carried out prior to releasing each model, including the results of external red teaming and scientific frontier risk evaluations within our Preparedness Framework, and an overview of the mitigations we built in to address key risk areas.

4. Collaborating with External Organizations

We will explore more opportunities for independent testing of our systems and will lead the push for industry-wide safety standards. For example, we’re already developing new collaborations with third-party safety organizations and non-governmental labs for independent model safety assessments. We are also working with government agencies to advance the science of AI safety. This includes working with Los Alamos National Labs—one of the United States’ leading national laboratories—to study how AI can be used safely by scientists in laboratory settings to advance bioscientific research. Additionally, we recently reached agreements with the U.S. and U.K. AI Safety Institutes to collaborate on the research of emerging AI safety risks and standards for trustworthy AI.

5. Unifying Our Safety Frameworks for Model Development and Monitoring

Ensuring the safety and security of our models involves the work of many teams across the organization. As we’ve grown and our work has become more complex, we are building upon our model launch processes and practices to establish an integrated safety and security framework with clearly defined success criteria for model launches. This framework will be based on risk assessments and approved by the Safety and Security Committee. As models get more capable, this framework will adapt to manage increased complexity and risks. To begin executing on the framework, we reorganized research, safety, and policy teams to strengthen collaboration and ensure tighter connections across the company.

This is the exact update provided by OpenAI in its blog post. Here is the source link
