Frontier risk and preparedness

To mitigate these risks as AI models continue to advance, we are forming a new team called Preparedness. Led by Aleksander Madry, the Preparedness team will closely connect capability assessment, evaluations, and internal red teaming for cutting-edge models, from the models we create in the near future to those with AGI-level capabilities. The team will assist in tracking, evaluating, predicting, and safeguarding against catastrophic risks across various categories including:

- Individualized persuasion
- Cybersecurity
- Chemical, biological, radiological, and nuclear (CBRN) threats
- Autonomous replication and adaptation (ARA)

The Preparedness team’s mission also involves developing and maintaining a Risk-Informed Development Policy (RDP). Our RDP will outline our approach to developing rigorous capability evaluations and monitoring for cutting-edge models, establishing a range of protective measures, and creating a governance structure for accountability and oversight throughout the development process. The RDP is designed to complement and expand our existing risk mitigation efforts, which contribute to the safety and alignment of new, highly capable systems, both pre- and post-deployment.
