The International Atomic Energy Agency (IAEA) was established in 1957 with a mandate that sounded simple: promote peaceful uses of nuclear energy while ensuring it was not diverted to weapons programs. The verification system it built over the following decades had no real precedent in international law: a functioning system for monitoring the most dangerous technology on earth, operated by an international body and accepted by the most powerful states in the world.
Most serious proposals for international AI governance now draw on the IAEA model. The question is not whether the nuclear verification architecture can be copied wholesale into the AI context. It cannot. The question is which elements transfer, which need to be redesigned, and which problems have no nuclear equivalent at all.
What verification actually means
Verification is not inspection in the ordinary sense. It is a system for creating confidence. The goal is not primarily to catch violations but to make violations sufficiently costly and visible that they are not attempted in the first place. This distinction matters because it sets the right standard for what success looks like: a verification system that never discovers violations may be succeeding, not failing.
The IAEA approach rests on four components that work together. The first is state declarations. Non-nuclear-weapon states that join the Treaty on the Non-Proliferation of Nuclear Weapons (NPT) are required to declare all nuclear material and facilities to the IAEA. These declarations become the baseline against which all subsequent monitoring is measured. A country that fails to declare a facility has not simply failed an inspection; it has violated the treaty's foundational requirement before the first inspector arrives.
The second is safeguards agreements. Each NPT signatory signs a bilateral safeguards agreement with the IAEA specifying the agency's inspection rights, the information the state must provide, and the consequences of non-compliance. The Additional Protocol, introduced in 1997 after the discovery of Iraq's undeclared nuclear program in the early 1990s, significantly expanded these rights. It allows the IAEA to seek access to locations that were never declared when information from any source gives it reason to do so.
The third is routine inspection. IAEA inspectors conduct scheduled visits to declared nuclear facilities. Inspectors use tamper-indicating seals, remote cameras, and material samples to monitor facilities between visits. A discrepancy between the declared inventory and what the inspector finds is a significant finding requiring explanation.
The fourth component, and the one most relevant to AI governance, is environmental sampling. Since the mid-1990s, the IAEA has used environmental monitoring — air, water, and soil samples — to detect evidence of undeclared nuclear activities. Trace amounts of specific isotopes in environmental samples can indicate nuclear processing that was never declared, regardless of whether inspectors were ever admitted to the relevant facility. This is indirect physical evidence of an activity, derived from the activity's unavoidable physical traces.
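The inventory check at the heart of routine inspection can be sketched in a few lines. This is a minimal illustration, assuming the standard material-balance identity (book inventory is beginning inventory plus receipts minus shipments, and the gap between that figure and what the inspector measures is the material unaccounted for). The function names, the quantities, and the measurement-uncertainty allowance are hypothetical and are not IAEA values.

```python
def material_unaccounted_for(beginning_kg, receipts_kg, shipments_kg, measured_ending_kg):
    """Book ending inventory minus the ending inventory the inspector measures."""
    book_ending_kg = beginning_kg + receipts_kg - shipments_kg
    return book_ending_kg - measured_ending_kg


def needs_explanation(muf_kg, measurement_uncertainty_kg):
    """A gap larger than the measurement uncertainty is a finding, not noise."""
    return abs(muf_kg) > measurement_uncertainty_kg


# Hypothetical facility: declared records imply 1050.0 kg, inspector measures 1042.5 kg.
muf = material_unaccounted_for(
    beginning_kg=1000.0, receipts_kg=250.0, shipments_kg=200.0, measured_ending_kg=1042.5
)
flagged = needs_explanation(muf, measurement_uncertainty_kg=2.0)
print(muf, flagged)
```

A flagged gap is not proof of diversion. It is the point at which the state owes the inspector an explanation.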
The limits of the nuclear model
The nuclear verification regime has real accomplishments. Far fewer countries have nuclear weapons than the most pessimistic forecasts of the 1960s predicted. The NPT has constrained proliferation significantly, if imperfectly. But the cases where it failed are as instructive as the cases where it succeeded.
Iraq's nuclear weapons program was advancing in the 1980s and early 1990s while Iraq was an NPT signatory and declared no weapons program. The program was revealed after the 1991 Gulf War, through the intrusive UN-mandated inspections that followed, not by routine IAEA monitoring. The IAEA inspectors had visited Iraq's declared facilities and found nothing irregular, because Iraq's weapons program was entirely separated from its declared civilian program. North Korea withdrew from the NPT in 2003, roughly a decade after inspectors first found discrepancies in its declarations that it never resolved; the country now has nuclear weapons. Iran's undeclared enrichment program came to light in 2002 through intelligence and a public disclosure by a dissident group, not through routine monitoring.
What these cases share is not that verification failed in a technical sense but that determined state actors made a deliberate political decision to absorb the costs of treaty violation. Verification changes the cost-benefit calculation; it does not eliminate it. A country that decides the weapons program is worth the political and economic consequences of discovery will conduct that program regardless of verification mechanisms. The IAEA's job is to make the discovery more likely and the political consequences more severe, not to make violation impossible.
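The cost-benefit logic above can be made concrete with a toy calculation. This is a minimal sketch, assuming a state compares the benefit it assigns to a weapons program against the expected cost of being caught (probability of detection times the penalty). The function names and every number are hypothetical, chosen only to show the shape of the argument.

```python
def expected_cost_of_violation(p_detect, penalty):
    """Expected cost a violator faces: chance of discovery times its consequences."""
    return p_detect * penalty


def is_deterred(benefit, p_detect, penalty):
    """A state is deterred only if the expected cost exceeds the benefit it perceives."""
    return benefit < expected_cost_of_violation(p_detect, penalty)


# Weak verification: discovery is unlikely, so the violation looks worthwhile.
weak_regime = is_deterred(benefit=100, p_detect=0.2, penalty=150)

# Stronger verification raises the detection probability and flips the decision.
strong_regime = is_deterred(benefit=100, p_detect=0.8, penalty=150)

# A determined state values the program above any penalty, even with certain detection.
determined_state = is_deterred(benefit=500, p_detect=1.0, penalty=150)

print(weak_regime, strong_regime, determined_state)
```

In this toy model the verification body controls only `p_detect`, and indirectly `penalty`. It has no control over how highly a state values the program, which is why the third case is not deterred.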
What transfers to AI governance
Several elements of the nuclear verification model translate to AI governance with meaningful adaptation.
The declaration requirement transfers directly. A frontier AI governance regime could require states to declare large-scale AI development programs above specified computational thresholds. The declaration creates a legal baseline: subsequent discrepancies between what was declared and what is observed are treaty violations, not merely policy disagreements. This shifts the burden of proof. Without declarations, a monitoring body must demonstrate that undeclared activity occurred. With declarations, a state must explain discrepancies between its declarations and the observed evidence.
The safeguards agreement model also transfers. A bilateral AI governance agreement between a monitoring body and a signatory state would specify the monitoring body's access rights, the information the state must provide, and the consequences of non-compliance. This architecture is well-understood in international law and would not require significant legal innovation.
Environmental sampling has an AI analogue in compute monitoring. Semiconductor manufacturing is geographically concentrated and requires specialized equipment. Advanced AI chips leave supply chain traces: they are manufactured in a small number of facilities, exported under customs documentation, and installed in identifiable data center configurations that have specific power signatures. None of these is as unambiguous as an isotope signature, but together they create indirect physical evidence of AI development activity that does not depend entirely on state self-reporting.
The Additional Protocol innovation is directly applicable. The expansion of inspection rights following Iraq's program demonstrated that verification regimes can be reformed after failures without abandoning the treaty framework. AI governance treaties will encounter early compliance failures. Building in a mechanism to expand monitoring rights in response to detected violations, rather than treating violations as grounds for sanctions alone, is a lesson that took the nuclear regime decades to learn.
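The power-signature element of compute monitoring described above lends itself to a back-of-envelope check. The sketch below assumes a monitoring body can observe a facility's power draw and compare it with the number of accelerators the state declared. The 700 W per accelerator, the 1.5 overhead factor for servers, networking, and cooling, and the 1.25 tolerance are all illustrative assumptions, not values taken from any existing regime.

```python
def max_accelerators_from_power(facility_power_mw, watts_per_accelerator=700.0,
                                overhead_factor=1.5):
    """Rough count of accelerators the observed power draw could run at full load."""
    facility_watts = facility_power_mw * 1_000_000
    watts_per_installed_chip = watts_per_accelerator * overhead_factor
    return int(facility_watts // watts_per_installed_chip)


def declaration_is_plausible(declared_accelerators, facility_power_mw, tolerance=1.25):
    """True if the power draw is roughly what the declared chip count would explain."""
    supported = max_accelerators_from_power(facility_power_mw)
    return supported <= declared_accelerators * tolerance


# Hypothetical 105 MW facility: the draw could power about 100,000 accelerators.
supported = max_accelerators_from_power(105.0)
consistent = declaration_is_plausible(declared_accelerators=90_000, facility_power_mw=105.0)
suspicious = declaration_is_plausible(declared_accelerators=20_000, facility_power_mw=105.0)
print(supported, consistent, suspicious)
```

A mismatch here is far weaker evidence than an isotope signature, since a facility may draw power for unrelated workloads. Its value is as one signal among several that prompts a request for explanation.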
What is genuinely different about AI
Nuclear material is physically distinct and relatively rare. Weapons-grade uranium and plutonium can be detected through their isotopic signatures in ways that are extremely difficult to falsify. A uranium enrichment facility requires specific industrial equipment that leaves environmental traces. The gap between civilian nuclear applications and weapons-relevant nuclear activity, while bridgeable by a determined state, is measurable and verifiable in ways that support monitoring.
AI development has no equivalent distinction. The same hardware, software infrastructure, and engineering talent that produces consumer AI applications can produce frontier AI systems with potential for civilizational impact. A large compute cluster running commercial AI services and a large compute cluster conducting frontier AI research look identical from the outside. The relevant distinction is in what they are trained to do and at what scale, not in any physical property of the facility or the equipment.
This difference is real and important. It means AI governance verification will need to rely on a combination of hardware monitoring at the chip manufacturing level, compute cluster reporting, energy consumption data, and software audit rights, rather than isotope signatures. It also means the relevant threshold question — above what scale of AI development does a monitoring obligation apply — is considerably harder to specify than the nuclear equivalent, and will require regular technical revision as compute costs change.
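To see why the threshold is hard to pin down, it helps to write it as a parameter. The sketch below assumes the widely used rule of thumb that training a dense model costs about 6 floating-point operations per parameter per training token. The threshold values and the model sizes are hypothetical examples, not figures proposed in this article.

```python
def training_flop(parameters, tokens):
    """Rule-of-thumb training compute: about 6 FLOP per parameter per token."""
    return 6 * parameters * tokens


def must_be_declared(parameters, tokens, threshold_flop):
    """A run triggers the monitoring obligation once it reaches the threshold."""
    return training_flop(parameters, tokens) >= threshold_flop


THRESHOLD_FLOP = 10**26  # hypothetical treaty threshold

# Hypothetical mid-size run: 70 billion parameters, 15 trillion tokens (6.3e24 FLOP).
mid_run = must_be_declared(70 * 10**9, 15 * 10**12, THRESHOLD_FLOP)

# Hypothetical frontier run: 1 trillion parameters, 20 trillion tokens (1.2e26 FLOP).
frontier_run = must_be_declared(10**12, 20 * 10**12, THRESHOLD_FLOP)

# The same mid-size run is captured if the threshold is revised down to 1e24 FLOP.
mid_run_after_revision = must_be_declared(70 * 10**9, 15 * 10**12, 10**24)

print(mid_run, frontier_run, mid_run_after_revision)
```

Because the obligation depends entirely on `threshold_flop`, any change in what a given amount of compute can achieve shifts which programs fall inside the regime.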
These are design problems, not reasons to conclude that verification is impossible. The IAEA of 1957 looked very different from the IAEA of 1997, after the Additional Protocol reforms that followed Iraq's program. The monitoring methods used in the 1960s bore little resemblance to the environmental sampling capabilities of the 1990s. Verification regimes are built incrementally, through failures as much as through successes, and they improve over time if the political will to maintain and reform them is sustained.
"The goal of arms control is not certainty. It is raising the cost of violation high enough that most actors decide it is not worth it, most of the time."
Naoto Nakada, Founder · Nakada Foundation to Save Humanity
The question for AI governance is not whether the nuclear verification model can be transplanted. It cannot. The question is whether a verification system adequate to the AI governance challenge can be designed using the same underlying logic: declarations, safeguards agreements, inspection rights, and technical monitoring of physical evidence. The IAEA's nearly seven decades of history suggest the answer is yes, provided the political will to build and sustain such a system exists. That political will, not technical design, is the binding constraint.
Common questions
The IAEA is the International Atomic Energy Agency, established in 1957. It verifies compliance with nuclear non-proliferation commitments through state declarations of all nuclear material and facilities, bilateral safeguards agreements specifying inspection rights, routine inspections of declared facilities using seals and sampling, and environmental monitoring using isotope analysis to detect undeclared activity. The Additional Protocol, adopted after Iraq's undeclared program was discovered, expanded inspection rights to cover locations not in initial declarations when there is reason to suspect undeclared activity.
Nuclear verification has worked, but only partially. Far fewer countries have nuclear weapons than analysts predicted in the 1960s, when some forecasts expected 30 or more nuclear-armed states by 1990. The NPT and IAEA verification regime played a significant role in that outcome. Where the system failed, as in Iraq before 1991 and North Korea, state actors made a deliberate decision to absorb the political costs of non-compliance in exchange for weapons capability. Verification changes the cost-benefit calculation; it does not eliminate it.
Most serious proposals for AI verification focus on three elements: compute monitoring (requiring chip manufacturers and exporters to report sales above specified thresholds), training run reporting (requiring notification to a monitoring body before or during large training runs), and facility inspection (the right to visit large compute clusters to verify that reported activity matches actual activity). No single element is equivalent to isotope detection, but together they create a system of declared obligations with external verification.
No AI verification regime exists yet, for several reasons. The first binding AI treaty was signed in 2024 and addresses present-day AI harms rather than frontier AI development. The major AI-developing nations have not yet negotiated a framework that includes verification obligations. There is genuine technical uncertainty about the right monitoring thresholds and methods. The nuclear monitoring regime took decades of sustained political effort and several failures before reaching its current form. AI governance is at the beginning of that process.