An AI Safety Institute is a government body set up to evaluate advanced AI systems for risks to national security and public safety, and to build the state's own expertise on frontier AI. The first were announced around the AI Safety Summit held at Bletchley Park in November 2023, and several major governments now run one or something equivalent, linked through an international network that shares methods and findings.

The core idea is straightforward and overdue. For years the only organisations that understood frontier models deeply were the companies building them. An AI Safety Institute is an attempt to put independent technical capability inside government, so that public authorities can assess these systems rather than take a developer's word for it.

What they actually do

Their work clusters into a few areas.

  • Testing frontier models, sometimes before release, for dangerous capabilities in domains like cyber, biology, and autonomy, using the methods behind capability evaluations.
  • Developing the science of evaluation itself, since how to measure these risks well is an unsolved research question, not a settled procedure.
  • Advising government, feeding technical assessments into policy so that decisions are informed by people who have actually examined the systems.
  • Coordinating internationally, so that a model tested in one country does not have to be re-tested from scratch everywhere, and so that standards begin to converge.

This is real institutional progress, and the Foundation welcomes it. Building state capacity to understand frontier AI is a precondition for governing it. You cannot regulate what you cannot evaluate, and until recently governments largely could not.

The power they mostly lack

Here is the catch. Most AI Safety Institutes can test, advise, and publish. Few can compel. Their access to models often depends on the cooperation of the labs, their findings usually inform rather than bind, and in most cases they cannot order a dangerous model held back. They are, by and large, bodies that can see the risk and recommend a response, not require one.

That gap between assessment and authority is the crucial limitation. An institute that discovers a serious hazard but can do no more than advise is only as effective as the government's willingness to act on that advice, in the face of commercial and competitive pressure to let development continue. It is the difference between a smoke detector and a fire brigade, and much of the current architecture consists of smoke detectors.

Testing without the power to act on the results is a warning system, not a safeguard.

What they could become

The value of AI Safety Institutes is that they build the thing binding governance would need: independent technical capacity, shared standards, and an international channel between governments. In that sense they are the embryo of the kind of body the Foundation argues for. Our piece on an international monitoring agency describes where this could lead, and the IPCC model shows how shared scientific assessment can underwrite international policy.

What has to change is the authority. Assessment needs to connect to enforcement, whether through domestic law that makes an institute's sign-off a condition of deployment, or through an international framework that gives verified findings real consequences. Give these institutes the power to match their expertise, and the scaffolding of serious governance is already partly built. That transition is the subject of our plan.

Common questions

What is an AI Safety Institute?

An AI Safety Institute is a government body established to evaluate advanced AI systems for risks to national security and public safety and to build the state's own technical expertise on frontier AI. The first were announced around the AI Safety Summit held at Bletchley Park in November 2023, and several governments now operate one or an equivalent, linked through an international network that shares evaluation methods and findings.

What do AI Safety Institutes do?

They test frontier models, sometimes before release, for dangerous capabilities in areas like cyber, biology, and autonomy; they develop the science of how to evaluate such risks, which is still an open research problem; they advise government by feeding technical assessments into policy; and they coordinate internationally so that testing and standards begin to converge across countries. In short, they give public authorities independent capability to assess systems rather than relying on developers' own claims.

What powers do AI Safety Institutes have?

Most can test, advise, and publish, but few can compel. Their access to models often depends on the labs' cooperation, their findings typically inform rather than bind, and in most cases they cannot order a dangerous model to be withheld. They are largely bodies that can identify a risk and recommend a response, not require one, which means their effectiveness depends heavily on whether governments choose to act on their advice.

Why does the gap between assessment and authority matter?

Because an institute that uncovers a serious hazard but can only advise is only as effective as the government's willingness to act against commercial and competitive pressure to keep development going. Testing without the power to act on the results functions as a warning system rather than a safeguard. Connecting assessment to real enforcement, through domestic law or an international framework that gives verified findings consequences, is what would turn these institutes from smoke detectors into something closer to a fire brigade.