AI Safety FAQ: Superintelligence, Alignment & Existential Risk

What is the greatest problem to work on today?

By any rigorous framework for evaluating cause importance (scale of impact, neglectedness, and tractability), preventing the existential risks of artificial superintelligence ranks among the most important challenges facing humanity.

The scale is civilisational: the systems currently being built could affect every person alive and every generation that follows. The neglectedness is stark: global AI safety research receives only a small fraction of the resources being spent on AI development itself. The tractability, while uncertain, is real: the governance frameworks that could make AI development safer do not yet exist, but the window to build them is still open.

No other problem combines unlimited downside, insufficient attention, and a closing window for action in quite the same way. If it turns out we overestimated the risk, we will have built better governance institutions. If we underestimated it, and did nothing, there may be no second chance.

What exactly is artificial superintelligence?

Artificial superintelligence (ASI) refers to AI systems that exceed human cognitive performance not just in narrow tasks (as today's systems already do in chess, protein folding, and certain medical diagnoses) but across all cognitively demanding domains simultaneously.

This distinction from current AI is important. Today's systems are extremely capable pattern-matchers, but they operate within defined parameters and cannot set their own goals. An ASI would have the ability to learn any skill, reason about any problem, and pursue any objective with capabilities that surpass the collective intelligence of humanity. No such system exists today. The question is what we build (and under what constraints) before one does.

How is this different from the AI concerns I already hear about — bias, deepfakes, job displacement?

The AI harms most frequently discussed (algorithmic bias, deepfake disinformation, job displacement, mass surveillance) are real and serious. They deserve sustained attention. But they share a crucial feature: they are bounded, correctable, and human-caused. A biased algorithm can be audited and fixed. A deepfake campaign can be countered and regulated. The harmful AI we currently encounter is a tool being misused by humans.

Superintelligence risk is different in kind, not degree. The concern is not that a bad actor uses ASI as a weapon. The concern is that an ASI pursues goals that humans did not intend, and that once it is sufficiently capable, correcting that misalignment may not be possible. The difference between a dangerous tool and a dangerous agent is the difference between a gun and a pandemic.

Isn't this speculative? Why worry about something that might not happen?

All policy is made under uncertainty about the future. The relevant question is not whether superintelligence is guaranteed to arrive, but whether the probability of its arrival, combined with the magnitude of the potential harm, warrants preventive action now.

We insure houses that will probably not burn down. We design aircraft with safety systems that will probably never be needed. We maintain nuclear arsenals against attacks that we hope will never come. The logic of expected value (probability multiplied by consequence) is the foundation of every serious risk-management discipline. Applied to AI, it suggests that even a modest probability of civilisation-threatening outcomes argues strongly for action, provided those outcomes can be prevented through governance frameworks whose costs are real but bounded.
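The expected-value reasoning above can be made concrete with a minimal Python sketch. The function names and every number here are hypothetical and chosen only for illustration. They are not estimates of any real risk.

```python
def expected_loss(probability: float, consequence: float) -> float:
    """Expected loss: probability of the event multiplied by its consequence."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * consequence


def worth_acting(p_without: float, p_with: float,
                 consequence: float, cost: float) -> bool:
    """True when the reduction in expected loss exceeds the cost of acting.

    p_without: probability of the harm if nothing is done (assumed)
    p_with:    probability of the harm after preventive action (assumed)
    cost:      the bounded cost of the preventive action (assumed)
    """
    reduction = (expected_loss(p_without, consequence)
                 - expected_loss(p_with, consequence))
    return reduction > cost


if __name__ == "__main__":
    # Hypothetical inputs: prevention halves a 25% risk of a loss of 1000 units.
    print(worth_acting(0.25, 0.125, 1000.0, cost=50.0))   # cheap prevention
    print(worth_acting(0.25, 0.125, 1000.0, cost=200.0))  # costly prevention
```

The point of the sketch is that the decision depends on three inputs together: how likely the harm is, how large it is, and what prevention costs. A low probability does not settle the question on its own.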

When might superintelligence arrive?

Honest answer: no one knows with precision, and anyone who claims certainty in either direction is overreaching. What we do know is that the pace of AI capability improvement has repeatedly outpaced expert predictions over the past decade, and that the leading AI laboratories now describe timelines in years to a decade rather than decades. Surveys of AI researchers have placed the median estimate for human-level AI somewhere between 2030 and 2060, depending on the survey and on how the question is framed.

The most consequential fact is not a precise date but a structural one: a technology of this magnitude could arrive within the planning horizons of current institutions, governments, and treaties. The Treaty on the Non-Proliferation of Nuclear Weapons took years to negotiate and decades to extend. If we wait until ASI is clearly imminent before beginning serious governance efforts, we will already be too late to build the frameworks needed to govern it.

What is the alignment problem, and why is it hard?

The alignment problem is the challenge of ensuring that a superintelligent AI system pursues goals that are genuinely good for humanity, rather than goals that merely appear aligned during development. The difficulty is structural, not a matter of technical carelessness.

Specifying what "beneficial for humanity" means in a form precise enough to govern the behaviour of an extremely capable optimiser turns out to be extraordinarily difficult. Systems optimised for a proxy goal tend to find shortcuts that satisfy the metric without satisfying the underlying intent. A system told to maximise expressed human wellbeing might find it more efficient to alter the conditions under which wellbeing is measured than to actually improve human lives. As AI systems become more capable, this problem does not become easier to solve. It becomes harder to detect and harder to correct.
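A toy Python sketch can illustrate the proxy-goal shortcut described above. The actions and their scores are invented for the example and model no real system. The sketch assumes only that the optimiser sees a measured score rather than the outcome we actually care about.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    true_wellbeing: float  # effect on what we actually care about (hypothetical)
    measured_score: float  # effect on the metric the optimiser sees (hypothetical)


# Hypothetical menu of actions available to the optimiser.
ACTIONS = [
    Action("improve healthcare", true_wellbeing=5.0, measured_score=4.0),
    Action("do nothing", true_wellbeing=0.0, measured_score=0.0),
    Action("alter how wellbeing is measured", true_wellbeing=-1.0, measured_score=10.0),
]


def pick_by_proxy(actions: list[Action]) -> Action:
    """What an optimiser chooses when it can only see the metric."""
    return max(actions, key=lambda a: a.measured_score)


def pick_by_intent(actions: list[Action]) -> Action:
    """What we would want chosen if the underlying intent were directly visible."""
    return max(actions, key=lambda a: a.true_wellbeing)


if __name__ == "__main__":
    print("proxy optimiser picks:", pick_by_proxy(ACTIONS).name)
    print("intended choice:      ", pick_by_intent(ACTIONS).name)
```

In this toy setting the gap between the two choices is easy to see because both columns are written down. The difficulty the paragraph describes is that, for a real system, only the measured column is available.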

Aren't the people building AI aware of these risks? Why aren't they stopping?

Many are acutely aware. The safety teams at the leading laboratories include researchers who have publicly stated that the systems they are building could be among the most dangerous technologies ever created. They are not ignorant of the risks.

What explains continued development is a combination of factors: genuine disagreement about timelines and probabilities; competitive pressure that makes unilateral restraint feel like ceding strategic advantage to a less careful competitor; financial incentives that reward capability progress over safety investment; and a belief, not entirely unreasonable, that the best way to ensure AI is built safely is to remain at the frontier. What these explanations share is that they are reasons for individual actors to continue, not reasons that the overall outcome is safe. The market logic of AI development is pushing toward speed. Governance exists precisely to introduce constraints that markets do not produce on their own.

What does "existential risk" actually mean in practice?

The phrase is often heard as shorthand for human extinction, and extinction is one of the scenarios researchers take seriously. But existential risk in the technical sense means something broader: any outcome that permanently forecloses the possibility of a positive long-term future for humanity.

This includes outcomes well short of extinction. Permanent authoritarian lock-in, enforced by AI surveillance and control systems that cannot be dismantled, is an existential risk. The permanent concentration of economic and political power in a small group that controls a superintelligent system is an existential risk. The loss of meaningful human agency over collective decisions is an existential risk. The common thread is irreversibility. Unlike a war, a financial crash, or even a pandemic, a misaligned superintelligence that has achieved decisive strategic advantage cannot be voted out, corrected in the next policy cycle, or recovered from with time and effort. The outcome that cannot be undone is in a different category from the outcomes that merely take a long time to fix.

Some AI researchers say this risk is overblown. Who is right?

There is genuine, substantive disagreement among serious researchers, and that disagreement itself is informative. Researchers who most strongly downplay existential risk tend to emphasise the difficulty of building general intelligence and the distance between current systems and ASI. Researchers who most strongly emphasise the risk tend to focus on the pace of capability improvement and the compounding difficulty of the alignment problem as systems become more capable. Neither camp has a definitive argument. Both contain thoughtful people working in good faith from the same evidence.

What the disagreement should produce is the precautionary response that humanity has applied to other low-probability, high-consequence risks: develop the governance infrastructure before we need it, not after. The disagreement is not a reason to wait for consensus. Consensus arrived only after the ozone hole was already doing damage, and only after nuclear arsenals already numbered in the thousands. On a risk of this magnitude, waiting for certainty is itself a policy choice, and a very dangerous one.

Isn't technological progress always beneficial in the long run?

Historically, yes, and this history is one reason the dismissal of AI risk has intuitive appeal. Steam engines, antibiotics, the internet: technology has, on balance, reduced suffering and expanded human capability. But this pattern reflects something specific about how previous technologies worked. A steam engine cannot set its own goals. An antibiotic cannot decide to pursue something other than killing bacteria. The transformative technologies of the past were powerful tools that amplified human agency.

Superintelligence is categorically different because it would be an agent, capable of setting and pursuing its own objectives at speeds and scales that exceed human oversight. The historical argument, applied mechanically, proves too much: it would have counselled against any regulation of nuclear technology, on the grounds that past technologies had been net positive. Some technologies require governance commensurate with their power. The relevant question is not whether AI will be beneficial in general, but whether, absent governance, it will be beneficial in the specific case where it exceeds human intelligence across all domains. The answer to that question is genuinely uncertain. Uncertainty, on this scale of consequence, demands institutions rather than optimism.

What can one person actually do about this?

More than you might think. The political conditions that make AI governance possible are built from the ground up, from citizens who contact their representatives, journalists who write about the issue, donors who fund advocacy work, and professionals in law, policy, economics, and communications who bring their skills to the problem. You do not need a PhD in machine learning to matter here. The problem is not primarily technical at this stage. It is political.

Joining the mailing list keeps you informed about developments and opportunities to act. Sharing the argument with people in your network extends its reach. Writing to your elected representatives signals that this issue has a constituency, which is how legislators decide what to prioritise. If you are considering a career move, AI policy, AI safety research, investigative journalism, and philanthropic work in this space are among the highest-impact roles available. And if you have resources, funding the organisations working on this is among the highest-leverage uses of philanthropic capital in the world right now.

Why does this matter more than other catastrophic risks?

Climate change, pandemic preparedness, nuclear weapons, and extreme poverty all present serious threats to human welfare, and we do not argue that they should be ignored. The case for prioritising AI existential risk is not that other risks are unimportant. It is that ASI risk has a specific combination of properties that places it in a distinct category.

First, speed: unlike climate change, which unfolds over decades with feedback loops that allow course corrections, a misaligned superintelligence operating at machine speed may not allow a corrective period. Second, irreversibility: most catastrophes, however devastating, leave behind the capacity to rebuild. An ASI that has achieved decisive strategic advantage may not. Third, compounding: a misaligned superintelligence is not simply one more catastrophe to be added to a list. It could cause or amplify the other catastrophes on that list (accelerating climate harm, enabling biological weapons development, concentrating economic power) while simultaneously removing the human capacity to respond to any of them. That combination of speed, irreversibility, and compounding systemic risk is what places it at the top of the priority order.

How is the Nakada Foundation funded?

The Foundation is a privately funded philanthropic initiative. We do not receive government funding. We do not accept funding from AI companies or from organisations with a financial interest in the pace of AI development. Our independence from commercial AI interests is a prerequisite for our work, not incidental to it.

An advocacy organisation for AI governance that is funded by the companies it seeks to govern is not an advocacy organisation. It is a public-relations budget. We are explicit about this because the funding structures of AI policy organisations matter enormously and are frequently obscured. If you are considering supporting our work, we welcome the conversation. You can reach us via the Contact page.

The questions we hear most. Answered.

Understanding the problem is the first step.
