AISI is bringing together AI companies and researchers for an
invite-only conference to accelerate the design and
implementation of frontier AI safety frameworks. This post
shares the call for submissions that we sent to conference
attendees.
—
Sep 19, 2024
At the
AI Seoul Summit, 16 global AI companies
committed
to develop and publish safety frameworks for frontier AI
systems before the French AI Action Summit in February 2025.
These frameworks aim to define thresholds above which frontier
AI systems would pose intolerable risks to society, and to
spell out how the signatory companies intend to identify and
mitigate risks to keep them below those thresholds, in a
transparent and accountable manner. Committing to produce such
frameworks is a laudable step.
However, this was only the first step. The science of AI and
AI safety is young. To advance this science, we are organizing
a conference, as a "Road to France" event, that will
convene experts from signatory companies and from research
organisations to discuss the most pressing challenges in the
design and implementation of frontier AI safety frameworks.
The conference will take place on 21-22 November in the San
Francisco Bay Area and is co-organized by the UK’s AI
Safety Institute and the Centre for the Governance of AI. It
is an invite-only conference for approximately 100 attendees.
In advance of the conference, we are asking attendees to
provide submissions discussing AI safety frameworks. To
provide transparency on the conference’s agenda, we are
sharing the call for submissions below. Closer to the
conference, we also intend to publish a lightly edited set of
conference proceedings, including most submissions.
Our Call for Submissions
We welcome submissions from all conference invitees on the
following topics concerning the
Frontier AI Safety Commitments:
How safety frameworks can be developed and improved:
Existing and draft safety frameworks:
Signatories of the commitments are welcome to present their
current or draft safety frameworks (or parts thereof) to
solicit feedback and discussion.
Improving existing safety frameworks: How
can existing safety frameworks be strengthened? How can we
adapt best practices from other industries?
Building on safety frameworks (Commitment
V): How will safety frameworks need to change over time as
AI systems' capabilities improve? How do they need to
change when AI systems become capable of posing intolerable
levels of risk?
Support for AI safety frameworks: What are
common challenges for companies that are yet to produce a
frontier AI system and/or a safety framework? What kinds of
resources would they find helpful? How can governments,
academia, companies, civil society, and other third
parties support them better?
Approaches to achieving Outcome 1: “Organisations effectively identify, assess and manage risks
when developing and deploying their frontier AI models and
systems.”
Model evaluation (Commitment I): What is
the role of evaluations in safety frameworks? What does
current best practice look like? In what ways might current
evaluations fail to identify risks? Where is the science of
evaluation lacking and where should it go from here?
Threat modelling and risk assessment
(Commitment I): How should frontier AI developers assess
risks from their systems? How should evaluation results feed
into risk assessment? What additional factors, beyond
evaluation results, should play a role in risk assessments?
How large are the different risks currently?
Thresholds for intolerable risk (Commitment
II): How are thresholds in safety frameworks defined today
and how could they be improved? How can the adequacy of
these thresholds be validated and assessed? Should
thresholds focus exclusively on model capabilities or also
take other factors into account? Who should set thresholds
and why?
Model deployment risk measures (Commitments
III-IV): What techniques are being used to reduce the risks
from models during development and around deployment? How
can these measures be improved? Are there new technologies
that need to be developed? How can the adequacy and
robustness of safeguards be assessed?
Cybersecurity measures (Commitments
III-IV): What measures can be put in place to secure model
weights? What is the risk of model exfiltration? Which
models warrant what level of security measures? How
effective and costly are different security measures?
Approaches to achieving Outcome 2: “Organisations are accountable for safely developing and
deploying their frontier AI models and systems.”
Internal governance and risk management processes to
deliver safety frameworks
(Commitment VI): What internal governance and risk
management processes are currently being implemented by
organisations? Where can they be improved? What can be
learned from practices in other industries?
Approaches to achieving Outcome 3: “Organisations’ approaches to frontier AI safety are
appropriately transparent to external actors, including
governments.”
External accountability and transparency (Commitments VII-VIII): What information should be shared
publicly? What should be shared with select groups? What
should be shared with governments only? What procedures and
infrastructure could be put in place to reduce security and
confidentiality concerns?
Third-party scrutiny (Commitment VIII):
What external model access is currently being provided for
frontier systems? How can we further foster the development
of the third-party evaluation and auditing industry? What
technology and institutions need to be developed to
facilitate more robust external scrutiny? What opportunities
do external actors have to provide input into the development
of safety frameworks?