A responsible scaling policy, or RSP, is a framework a frontier AI lab publishes to govern its own development. The structure is conditional: define levels of dangerous capability, and commit in advance to specific safeguards that switch on when a model reaches each level. The best-known versions come from the leading labs, and the format has spread. You will also see the underlying logic called if-then commitments.

The pattern looks like this. Capability tiers are named, running roughly from present systems up to models that could meaningfully assist with a bioweapon or a serious cyberattack, or that show dangerous autonomy. Each tier carries required protections: hardened security to stop model weights being stolen, deployment restrictions, red-teaming, and, at the higher tiers, commitments not to train or release a model until specified safeguards are demonstrably in place.
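
For readers who think in code, the if-then structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tier names, the safeguard labels, and the `gate` function are hypothetical and are not taken from any lab's published policy.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    # Hypothetical tier names; real policies define their own levels.
    BASELINE = 1
    ELEVATED = 2
    CRITICAL = 3


# Hypothetical mapping from each tier to the safeguards it requires.
REQUIRED = {
    Tier.BASELINE: {"red_teaming"},
    Tier.ELEVATED: {
        "red_teaming",
        "deployment_restrictions",
        "hardened_weight_security",
    },
    Tier.CRITICAL: {
        "red_teaming",
        "deployment_restrictions",
        "hardened_weight_security",
        "pre_release_safeguard_review",
    },
}


@dataclass(frozen=True)
class Decision:
    may_proceed: bool
    missing: frozenset


def gate(tier: Tier, in_place: set) -> Decision:
    """If a model reaches `tier`, then every safeguard required there must be in place."""
    missing = frozenset(REQUIRED[tier] - set(in_place))
    return Decision(may_proceed=not missing, missing=missing)


# Example: a model assessed at the ELEVATED tier with only red-teaming done.
decision = gate(Tier.ELEVATED, {"red_teaming"})
print(decision.may_proceed, sorted(decision.missing))
```

Both inputs to the check, the assessed tier and the set of safeguards judged to be in place, come from whoever runs it. The sketch captures the conditional logic only, not the question of who supplies those inputs.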

Why this is a real improvement

Compared with a general promise to take safety seriously, an RSP is a step forward, and it is fair to say so.

It is concrete. It names capabilities, thresholds, and responses in advance, which turns a vague commitment into something you can hold an organisation to. It is anticipatory, forcing a lab to decide what it will do about a dangerous capability before it has one, rather than in the rush afterwards. And it connects capability to consequence through dangerous capability evaluations, giving the thresholds an operational meaning. Some RSPs go as far as committing to pause development if safeguards cannot keep pace with capability, which is a meaningful thing to put in writing.

The problem sitting underneath

The difficulty is structural, and it is the same one that shadows every form of industry self-governance. The lab writes the policy, defines the thresholds, runs the evaluations, judges whether a threshold was crossed, and decides whether its safeguards are adequate. It can also revise the policy. When commercial pressure to ship collides with a commitment the lab authored and administers itself, there is no independent party with the authority to hold the line.

History does not favour the optimistic reading of that arrangement. The record of voluntary corporate safety commitments under competitive pressure, examined in our piece on voluntary commitments, is a record of standards that soften exactly when they start to bite. An if-then commitment is only as strong as the then, and the then is enforced by the same party that benefits from weakening it.

A rule you set for yourself, measure yourself, and can amend yourself is a statement of intent, not a constraint.

Two further gaps matter. RSPs rely on evaluations to detect when a threshold is reached, so everything that limits evaluations (sandbagging, unknown capabilities, the difficulty of proving a model is safe) also limits the policy that rests on them. And an RSP binds only the lab that adopts it. A competitor that declines, or a state programme outside this culture entirely, is untouched. That is the coordination problem, and no single company can solve it alone.

How the Foundation reads them

Responsible scaling policies are worth having and worth strengthening, but they are not a substitute for external governance. The right way to see them is as a draft of what enforceable rules could contain: the threshold logic, the if-then structure, and the link to evaluations are all reusable. What has to change is who holds the pen and who enforces the pause. Move the thresholds and the triggers from voluntary policy into binding law, backed by an independent body and by compute governance, and the good ideas in RSPs become commitments a lab cannot quietly revise when they become inconvenient. That transition is what our plan is for.

Common questions

What is a responsible scaling policy?

A responsible scaling policy, or RSP, is a framework an AI lab publishes to govern its own development using an if-then structure: it defines levels of dangerous capability and commits in advance to specific safeguards that activate when a model reaches each level. Safeguards range from stronger security and deployment restrictions to commitments not to train or release a model until certain protections are in place. The approach is also described as if-then commitments.

Why are responsible scaling policies considered an improvement?

Because they are more concrete and anticipatory than a general promise to take safety seriously. They name specific capabilities, thresholds, and responses in advance, which makes them something an organisation can be held to, and they force a lab to decide how it will handle a dangerous capability before it has one. They also tie thresholds to dangerous capability evaluations, and some commit to pausing development if safeguards cannot keep up with capability.

What is the main weakness of responsible scaling policies?

They are self-regulation. The same lab writes the policy, sets the thresholds, runs the evaluations, judges whether a threshold was crossed, decides whether its safeguards suffice, and can revise the policy. When commercial pressure to ship conflicts with a self-authored commitment, no independent party has the authority to enforce it. The history of voluntary corporate safety commitments under competitive pressure suggests such standards tend to soften just when they would otherwise bite.

How could responsible scaling policies be made stronger?

By moving their best ideas (the capability thresholds, the if-then structure, and the link to evaluations) from voluntary policy into binding law enforced by an independent body, and by backing them with compute governance. That keeps the useful structure while changing who holds the pen and who enforces the pause, so that a lab cannot quietly revise its commitments when they become inconvenient, and so that the rules bind competitors and state programmes rather than only the lab that volunteered.