Most AI safety discussions focus on the scenario in which AI systems pursue goals that are obviously bad — goals that lead directly to human harm. Value lock-in is a subtler and in some ways more disturbing scenario: the permanent encoding of goals that seem good at the time of encoding, but foreclose the moral progress that would eventually have improved them.

The concept is straightforward. A superintelligent AI system, or an AI-empowered entity, gains sufficient control over the world's resources and systems to permanently enforce whatever values it has been given or has adopted. From that point forward, those values shape all outcomes, with no possibility of revision. The lock-in may happen deliberately (an actor encodes their values intentionally) or accidentally (a system maximizes its objective so thoroughly that it pre-empts any possibility of alternatives). Either way, the future becomes fixed.

The historical argument against any lock-in

The most powerful argument against value lock-in is not philosophical but historical. Look back two centuries and examine what the people of that time considered obvious moral truths. Slavery was legally and socially accepted in much of the world, including in nations with sophisticated philosophical traditions and explicit commitments to human dignity. In many legal systems, married women had little or no legal identity independent of their husbands. Practices now recognized as torture were standard elements of criminal justice systems. Children had few enforceable rights against their parents or employers.

The people of those eras were not uniquely malicious. Many were thoughtful, morally serious people who reasoned carefully about ethics, and they still arrived at positions that are catastrophically wrong by contemporary standards. They were also certain their values were sound, and it is exactly this kind of certainty that would have made locking in those values seem reasonable to them.

The pattern continues. A century ago, beliefs in racial hierarchy that are now recognized as both factually false and morally grotesque were mainstream in Western societies and often written into law. Fifty years ago, homosexuality was classified as a mental illness in major medical reference works. The moral errors of previous generations are obvious in hindsight. It would be remarkable if our own era had finally achieved moral perfection and had no comparable errors awaiting future correction.

The key implication

If current values contain errors — and the historical pattern suggests they do — then permanently encoding current values is permanently encoding those errors. The future generations who would have discovered and corrected those errors will have no mechanism to do so. The lock-in does not just preserve the good in our current values. It preserves everything in them, errors included, forever.

Why even benevolent lock-in is dangerous

A common response to value lock-in concerns is: "What if the values being locked in are genuinely good?" This objection misses the problem. The judgment that a set of values is "genuinely good" must itself be made using the values under consideration. There is no external standpoint from which to evaluate whether current values are good enough to justify permanent encoding. Historically, the beliefs held with the most confident moral certainty, the same certainty that makes lock-in seem acceptable, have often been the ones later moral progress found most in need of revision.

Nick Bostrom's astronomical waste argument, originally framed around the opportunity cost of delayed technological development, shows why the stakes are so high. The future accessible to humanity, across cosmic timescales, involves an almost incomprehensible number of possible lives and experiences. Even a small systematic deviation from optimal values, applied across this scale, represents a vastly larger loss than anything that can happen within a human lifetime. Lock-in to values that are 99% correct by some ideal standard is still a catastrophic outcome when measured against the scale of what the future might contain.

The relationship to democratic oversight

Value lock-in is one of the central reasons why democratic and international oversight of superintelligent AI matters so much. Any single entity — a company, a government, even a well-intentioned foundation — whose values are encoded permanently into a superintelligent system has produced lock-in, regardless of how good those values appear at the time.

The alternative is not to identify the right values and encode those instead. No one can do this reliably, and the historical track record suggests that high confidence about having found the right values is itself a warning sign. The alternative is to preserve the conditions under which moral progress can continue: pluralism, diversity of perspectives, mechanisms for peaceful revision of norms, and no single entity with the power to permanently impose its values on all others.

This is the structural argument for democratic governance of superintelligent AI, and it goes beyond the instrumental argument that democratic governance produces better outcomes. Democratic oversight preserves the conditions for continued moral progress, regardless of which specific values are held at any moment. The Foundation's governance proposals are built around this insight: the goal is not to ensure the right values are in control, but to ensure that no single set of values, however well-intentioned, gains permanent, irreversible control.

Common questions

What is value lock-in?

The scenario in which a superintelligent AI system, or an AI-empowered entity, permanently encodes a fixed set of values into the future, such that those values shape all outcomes from that point forward with no possibility of revision. Lock-in can happen deliberately (an actor encodes their values) or accidentally (a system maximizes its objective so thoroughly that it pre-empts alternatives). The danger is not only that the encoded values might be bad. Encoding any fixed set of values also forecloses the moral progress that has historically improved human values.

Why is value lock-in dangerous even if the values being locked in seem good?

Because the assessment that values are good must be made using those very values. There is no external standpoint from which to verify that current values are good enough to justify permanent encoding. History shows that the values each era held with confident moral certainty contained serious errors, which later moral progress corrected. Our current values very likely contain comparable errors that are not yet recognized. Permanent encoding locks in those errors with no mechanism for future correction.

Is value lock-in the same as totalitarianism?

Related but distinct. Totalitarianism is political and social control by a single authority; it is historically reversible and has been reversed many times. Value lock-in specifically refers to an AI-enabled permanent encoding of values in a way that cannot be reversed even by future generations with different values. A totalitarian regime can be overthrown; a sufficiently capable AI system that has locked in a set of values may be impossible to reverse because it can prevent and pre-empt any challenge. The irreversibility is what distinguishes lock-in from ordinary political domination.

What prevents value lock-in?

Preventing value lock-in requires ensuring that no single entity gains sufficient control over superintelligent AI to permanently encode its values, regardless of how good those values appear. This means international governance frameworks that distribute oversight across multiple parties, accountability mechanisms that preserve the ability to challenge and revise AI-enforced norms, and structural protections against any single actor achieving the kind of control that would enable lock-in. Democratic and international oversight of superintelligent AI is, at its core, a mechanism for preventing value lock-in.