0. Intro
With the publication of Claude's Constitution and OpenAI's Model Spec, the question of AI character has started to receive more attention, particularly around whether we want AI systems to be "obedient" or "ethical".[1] But we think it still receives far too little.
AI character (e.g. how obedient, honest, cooperative, or altruistic AIs are, and in what circumstances) will have a big effect on society, and on how well the future goes. We think that figuring out what characters AI systems should have, and getting companies to actually build them that way, is among the most valuable things that people can do today.
The core argument for the importance of AI character is that it will meaningfully impact:
- a range of challenges that arise even if we solve the technical alignment problem — such as concentration of power, ensuring good moral reflection, the risk of global catastrophe, and the risk of global conflict.
- the chance of AI takeover.
- the value of worlds where AI does take over.
In this note, we present this core argument and discuss the core counterargument: that we should expect any character-related decisions we make today to get washed out by competitive pressures.
By “character” we mean a set of stable [...]
---
Outline:
(00:10) 0. Intro
(02:13) 1. The core argument
(08:22) 1.1. Pathways to impact
(10:33) 1.2. Affecting takeover
(11:51) 1.3. Effects on superintelligence
(12:21) 2. The core counterargument
(14:43) 3. Rejoinders to the core counterargument
(15:03) 3.1. Loose constraints
(17:04) 3.2. Low-cost but high-benefit changes
(17:49) 3.3. Path-dependence
(20:15) 3.4. Smoothing the transition
(21:51) 3.5. Overall
(22:21) 4. Conclusion
(23:39) Appendix 1: Additional high-stakes scenarios
(26:58) Appendix 2: Pathways to impact
The original text contained 10 footnotes which were omitted from this narration.
---
First published:
March 23rd, 2026
Source:
https://www.lesswrong.com/posts/wSFmLhHxAuG4vLJoH/ai-character-is-a-big-deal
---
Narrated by TYPE III AUDIO.