The Problem with Defining an "AGI Ban" by Outcome: A Lawyer's Take
[Crossposted from LessWrong]
I keep seeing proposals framed as “international bans on AGI,” where AGI is defined extremely broadly, often something like “whatever AI companies develop that could lead to human extinction.” As a lawyer, I can’t overstate how badly this type of definition fails to accomplish its purpose. To enable a successful ban on AGI, regulation has to operate ex ante: before the harm materialises.
If the prohibited category is defined only by reference to the ultimate outcome (in this case, human extinction), then by construction the rule cannot trigger until after the damage is done. At that point the law is meaningless; extinction leaves no survivors to enforce it.
That’s why the definitional work must focus not on the outcome itself, but on the specific features that make the outcome possible: the capabilities, thresholds, or risk factors that causally lead to the catastrophic result. Other high-stakes tech domains already use this model. Under the European General Data Protection Regulation, companies can be fined simply for failing to implement adequate security measures, regardless of whether a breach has actually occurred. Under European product liability law, a manufacturer is liable for a defective product even if they exercised all possible care to prevent the defect. And even under U.S. export-control law, supplying restricted software without a licence is an offence regardless of intent.
The same logic applies here: to actually ban AGI, or AI leading to human extinction, we need to ban the precursors that make extinction-plausible systems possible, not "the possibility of extinction" itself.
The luxury of "defining the thing" ex post
TsviTB, in What could a policy banning AGI look like?, poses a fair question: “Is it actually a problem to have fuzzy definitions for AGI, when the legal system uses fuzzy definitions all the time?”
In ordinary law, the fuzziness is tolerable because society can absorb the error. If a court gets it wrong on whether a death was murder or manslaughter, the consequences are tragic but not civilisation-ending. And crucially, many of these offences carry criminal penalties (actual prison time), which creates a strong incentive not to dance around the line.
But an AGI ban is likely to sit under product safety law, at least initially (more on this next), where penalties are usually monetary fines. That leaves the door wide open for companies to “Goodhart” their way into development: ticking compliance boxes while still building systems that edge toward the prohibited zone.
This is a painfully common dynamic in corporate law. For example, multinationals routinely practice tax avoidance right up to the legal line of tax evasion. They can prove with audited books that what they’re doing is legal, even if it clearly undermines the spirit of the law. They still end up paying far less tax than they would under a normal corporate structure, using arrangements SMEs can’t afford to set up.
In practice, they achieve the very outcome the law was designed to prevent, but they do it legally. They don’t get away with all of it, but they get away with maybe 80%.
We don't have this luxury: we cannot afford an AGI ban that is “80% avoided.” Whether the framework sits under civil or criminal law, it will only work if it sets a robust, precise threshold and attaches penalties strong enough to change incentives, not just fines companies can write off as a cost of doing business.
Actually defining the thing we want to ban
If an agreement like this is to work, the first item on the agenda must be to define what counts as the thing you want to ban.
Why do I think an AGI ban will default to a product safety framework rather than a criminal law framework? Because that’s the path the EU AI Act has already taken. It sets the first precedent for “banning” AI: as outlined in Article 5, certain systems are prohibited from being put on the market or deployed when they pose irreversible societal harms (e.g. manipulative systems, exploitative targeting, biometric categorisation, social scoring).
But notice the structure:
It doesn’t ban development of these systems. Technically, a company can still build them internally.
It creates civil liability, mainly monetary fines. That’s deterrence, not prevention.
Enforcement is ex post: the ban only bites once the system has been deployed and the harm has become measurable.
This is exactly the failure mode I worry about. If an “AGI ban” is drafted the same way, it will look tough on paper but in practice it will be little more than a product-safety regulation: companies treating fines as a cost of doing business, and governments tolerating it for the sake of innovation.
That’s why the definitional work matters so much. If the ban is going to be enforceable, we can’t define AGI in terms of its final outcome (extinction) or leave it to vague product-safety language.
If the AI Safety field fails to define the thing we want to ban, the task will be left up to policymakers, who will reach for the "most measurable" available proxy (likely compute thresholds), and the entire ecosystem will Goodhart against that.
We need a crisp proxy for what makes a system cross the danger line: capability evaluations, autonomy thresholds, demonstrable ability to replicate, deceive, or seize resources. Something specific enough that strict liability can attach before catastrophe, not after.
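To make "crisp proxy" concrete, here is a minimal sketch (purely illustrative Python) of what a machine-checkable precursor gate could look like before a training run is authorised. Every name and number in it is a hypothetical placeholder introduced for illustration, not a proposed legal threshold; fixing the real values ex ante is exactly the definitional work this post argues for.

```python
# Illustrative sketch only: hypothetical precursor thresholds a statute might
# reference, expressed as a pre-training gate. Names and numbers are placeholders.
from dataclasses import dataclass

@dataclass
class TrainingRunDeclaration:
    training_flops: float          # declared compute budget for the run
    autonomy_eval_score: float     # score on a standardised autonomy evaluation (0-1)
    can_self_replicate: bool       # result of a pre-registered replication evaluation
    can_acquire_resources: bool    # result of a pre-registered resource-acquisition evaluation

# Hypothetical bright lines; a real treaty would fix these values ex ante.
TRAINING_FLOPS_LIMIT = 1e25
AUTONOMY_SCORE_LIMIT = 0.5

def crosses_prohibited_threshold(run: TrainingRunDeclaration) -> bool:
    """Return True if the declared run falls inside the prohibited category."""
    return (
        run.training_flops >= TRAINING_FLOPS_LIMIT
        or run.autonomy_eval_score >= AUTONOMY_SCORE_LIMIT
        or run.can_self_replicate
        or run.can_acquire_resources
    )

if __name__ == "__main__":
    declared = TrainingRunDeclaration(5e24, 0.2, False, False)
    print("Prohibited:", crosses_prohibited_threshold(declared))
```

The point is the structure, not the numbers: liability attaches to declared, measurable inputs and pre-registered evaluation results before anything is trained or deployed, which is what allows strict liability to bite ex ante.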
Credible bans depend on bright lines
The political reality is that states are unlikely to agree to halt all frontier development. This is clear to anyone who has read the U.S. AI Action Plan. Even Europe, often seen as the strictest regulator, is taking an ambitious "pro-innovation" stance.
If a proposal for an AGI ban is to succeed, it has to be precise enough to block “AGI coup”-class systems while still permitting beneficial progress. Otherwise, it will be dismissed as too restrictive, or as too difficult to enforce without costly caps on innovation.
Learning from nuclear treaties
Nuclear treaties don’t ban “whatever weapons might end humanity.” They ban specific precursor states and activities with crisp, enforceable thresholds: zero-yield tests, “significant quantities” of fissile material (8 kg of plutonium, 25 kg of HEU), and delivery systems above 500 kg/300 km. These bright lines allow intrusive verification and enforcement, while still permitting conventional weapons to exist.[1] The principle is clear: regulate by measurable inputs and capabilities, not by catastrophic outcomes.
Until similar definitional work is done for AGI, talk of an “AGI ban” is rhetoric.
But useful rhetoric, yes: it shifts the Overton window, it mobilises advocacy groups, and it keeps extinction-level risk on the policy agenda. As law, though, it will fail unless we solve the definitional problem and bring legal experts into the room (not just policymakers[2]). Otherwise, in-house counsel at AI labs will simply map the gray areas and find compliant-looking ways to bypass the ban.

If we want a credible ban, we need to put the same intellectual effort into defining precursor thresholds for AI that we once put into fissile material, missile ranges, and test yields. Anything less will collapse into Goodharting and "product safety rules" with fines as the only compliance incentive.
[1] I do not endorse the manufacturing of any weapons; I am merely using this as an example for illustrative purposes.
[2] One blind spot I notice is how rarely tech lawyers are brought into AI safety strategy. Lawyers in large firms or in-house roles often sit at the chokepoints of real leverage: they can delay deals, rewrite contractual terms, and demand transparency in ways that policymakers often cannot. In the EU especially, an AGI ban (or any ambitious legislative action) would ultimately be implemented, interpreted, and either undermined or strengthened by these lawyers. If they are left out of the conversation, the path of least resistance will be to map the gray areas and advise clients on how to bypass the spirit of the law.
[Image credit: Lone Thomasky & Bits&Bäume / https://betterimagesofai.org / https://creativecommons.org/licenses/by/4.0/]



Very true, we cannot rely on vague definitions alone. That's why our proposal has clear red lines: it bans training of AI models that are either 1) larger than 10^12 parameters, 2) trained with more than 10^25 FLOPs, or 3) expected to exceed a score of 86% on the MMLU benchmark.
These lines might be overly strict (maybe ASI will require 100x the scale; we do not know yet), but they could also be too permissive (maybe algorithmic progress leads to ASI being trainable on a desktop computer in a couple of years).
Finding the perfect red lines is a valuable endeavor, but inherently difficult. Setting them too permissively could lead to human extinction. We should therefore err on the side of caution.
Great article, thank you for all your writing. Working in Tokyo, I have what might be a useful counterexample to this take: *"This is a painfully common dynamic in corporate law. For example, multinationals routinely practice tax avoidance right up to the legal line of tax evasion. They can prove with audited books that what they’re doing is legal, even if it clearly undermines the spirit of the law."*
Experience adjacent to the Japan National Tax Authority has taught me that they are far more likely than Western tax authorities to sanction companies walking up to the line with "tax mitigation". They do this with a combination of ambiguous lines (rather than bright lines), and unilateral decisions about when a violation of the spirit of the law is an actual violation of the law, regardless of the law's text.
The downside of this approach would be potential for corruption and lack of stability. But this is entirely mitigated by (1) the NTA's professionalism and (2) the NTA's general willingness to publish Q&A explainers (with Qs posed by corporate counsel) on the NTA's approaches to tax - none of which are binding on NTA decisions if they contain loopholes exploited in bad faith, but all of which are respected and taken seriously for the effort put into them.
A hypothetical AI Authority that took its idea of authority seriously could act in similar ways: compassionate to the concerns of AI companies in understanding the rules, but ruthless in upholding the spirit of an AGI prohibition.