
“On restraining AI development for the sake of safety” by Joe Carlsmith
LessWrong (30+ Karma)
Building the option to pause
Carlsmith argues for building brakes on AI development, and examines the roles of political will and compute supply-chain leverage.
(Podcast version, read by the author, here, or search for "Joe Carlsmith Audio" on your podcast app.
This is the tenth essay in a series I’m calling “How do we solve the alignment problem?”. I’m hoping that the individual essays can be read fairly well on their own, but see this introduction for a summary of the essays that have been released thus far, plus a bit more about the series as a whole.
I work at Anthropic, but I am here speaking only for myself and not for my employer.)
1. Introduction
In the third essay in the series, I distinguished between three key “security factors” for developing advanced AI safely, namely:
- Safety progress: our ability to develop new levels of AI capability safely.
- Risk evaluation: our ability to track and forecast the level of risk that a given sort of AI capability development involves.
- Capability restraint: our ability to steer and restrain AI capability development when doing so is necessary for maintaining safety.
A lot of my focus in the series has been on safety progress – and to a lesser extent, risk evaluation. In this essay, I want to look at capability restraint, in [...]
---
Outline:
(00:38) 1. Introduction
(08:18) 2. Preliminaries
(10:59) 3. AI development isn't necessarily a prisoner's dilemma
(18:03) 4. Forms of capability restraint
(19:26) 4.1. Individual capability restraint
(21:32) 4.2. Collective capability restraint
(26:25) 4.3. Treatment of ongoing AI development
(33:06) 5. Idealized capability restraint
(45:00) 6. Capability restraint in practice
(45:32) 6.1. The likelihood of serious effort
(53:57) 6.2. The efficacy of capability restraint
(55:40) 6.2.1. Compute governance
(58:05) 6.2.2. Algorithmic governance
(01:04:12) 6.2.3. Greenlighting and safety progress
(01:10:44) 6.3. Ways that capability restraint could end up net negative
(01:11:53) 6.3.1. Concentrations of power
(01:17:17) 6.3.2. Ceding competitive advantage to authoritarian countries
(01:19:29) 6.3.3. Other concerns
(01:27:13) 7. Prioritizing capability restraint relative to other security factors
(01:29:43) 8. Conclusion
(01:31:19) Appendix 1: What are we using the time for?
The original text contained 40 footnotes, which were omitted from this narration.
---
First published: March 19th, 2026
---
Narrated by TYPE III AUDIO.
