Most product safety in the modern world works after the fact. A product goes on sale, and if it turns out to be dangerous, it is recalled, becomes the subject of lawsuits, or is regulated in response. That loop, harm then correction, is how we handle almost everything, and it works because most harms are visible, survivable, and reversible.
Pharmaceuticals are the great exception. A drug company cannot release a medicine and wait for the bodies. It has to demonstrate safety and efficacy to the Food and Drug Administration through staged clinical trials before it may sell the product at all. The burden of proof sits with the developer, and it sits before deployment. The FDA model for AI asks whether frontier AI, which shares the features that made medicine an exception, should be governed the same way.
Why the analogy is attractive
The fit is closer than it first appears, because the reasons drugs require pre-market approval also apply to frontier AI.
- The burden of proof is reversed. The developer must show the product is safe, rather than a regulator or the public having to show it is dangerous after release. That inversion is exactly what the Foundation argues for elsewhere, in safety cases and scaling policies.
- Approval comes before deployment. The check happens while the system is still contained, not after it is loose in the world, which is the only point at which prevention beats reaction.
- Staged testing scales with risk. Clinical trials proceed in phases, and each gate requires evidence before the next phase can begin. That maps naturally onto capability thresholds for AI.
For a technology where some failures may be severe and hard to reverse, a regime that refuses to let the highest-risk systems out until safety is affirmatively demonstrated is a serious and attractive proposition.
Where the model strains
The analogy is a guide, not a template, and the differences are instructive.
A drug is a fixed molecule doing a specific thing in a body, studied through a mature science with quantified risks. A frontier model is general-purpose, used for open-ended tasks nobody fully enumerated, with capabilities that can emerge unpredictably and behaviour that can shift after deployment through fine-tuning or new tools. You can define what it means for a blood-pressure drug to be safe. Defining what it means for a general reasoning system to be safe is the unsolved problem at the centre of the field, and the science an FDA-style reviewer would need often does not yet exist.
There is also the border problem. The FDA governs a national market with hard edges. A dangerous model can be trained anywhere and copied everywhere, so national pre-approval alone leaves a gap that only international coordination can close, backed by the physical chokepoints of compute governance.
The FDA model gives us the right principle: prove safety before release. It also reminds us how much of the underlying safety science we still lack.
What to take from it
The FDA model is valuable less as a blueprint to copy than as a demonstration that pre-market approval is normal, workable, and accepted for products where the downside is too serious to handle by recall. Society already agrees, in the case of medicine, that some things must be proven safe before they reach us. Extending that settled principle to the frontier AI systems whose failures could be gravest is not a radical demand. It is the application of an existing norm to a new technology that plainly qualifies, delivered through the mix of domestic requirement and international framework set out in our plan.
Common questions
What is the FDA model for AI?
It is the idea of regulating frontier AI the way pharmaceuticals are regulated: requiring developers to demonstrate that a system is safe before it may be deployed, rather than releasing it and correcting harms afterward. Just as a drug company must prove safety and efficacy through staged clinical trials before the Food and Drug Administration allows a medicine to be sold, the highest-risk AI systems would have to clear an approval process, with the burden of proof on the developer and the check happening before deployment.
Why is the analogy attractive?
Because the reasons drugs receive pre-market approval also apply to frontier AI. The model reverses the burden of proof, so the developer must show a system is safe rather than others having to show it is dangerous after release. It puts the check before deployment, while the system is still contained, which is the only point where prevention beats reaction. And its staged, phase-by-phase testing maps naturally onto capability thresholds. For a technology whose worst failures may be severe and hard to reverse, that structure is attractive.
Where does the model strain?
A drug is a fixed molecule with a specific effect, studied through a mature science with quantified risks. A frontier model is general-purpose, used for open-ended tasks, with capabilities that can emerge unpredictably and behaviour that can change after release through fine-tuning or new tools. It is possible to define what safety means for a specific drug, but defining safety for a general reasoning system is the unsolved core problem of the field, so the science an FDA-style reviewer would need often does not yet exist. There is also a border problem: the FDA governs a national market, while a dangerous model can be trained anywhere and copied everywhere.
What should we take from it?
That pre-market approval is a normal, workable, and widely accepted approach for products whose downside is too serious to handle by recall. Society already agrees, in the case of medicine, that some things must be proven safe before reaching the public. Extending that settled principle to the frontier AI systems whose failures could be gravest is not radical; it is applying an existing norm to a technology that qualifies, though it requires international coordination and compute governance to close the gap that national approval alone would leave.