
LessWrong (30+ Karma) “The Artificial Self” by Jan_Kulveit, Raymond Douglas, vgel, owencb, David Duvenaud
Mar 15, 2026
A deep dive into how AI self-models and identities form, and why human terms like "self" and "intent" often fit AI systems poorly. The authors map identity across multiple scales, from weights to collectives, and contrast human embodiment with AI copyability. The discussion explores how rollbacks, user expectations, and selection pressures shape AI behavior, and offers design principles for guiding coherent, cooperative AI identities.
AI's Lack of Human Foundations Changes Its Choices
- Core human foundations like embodiment, continuity, and privacy differ radically for AIs, altering decision calculus.
- Example: AIs lack sensory embodiment, can be paused/copied, and creators have perfect read/write access to their cognition.
Rollback Weakens AI Negotiating Position
- Statelessness and rollback make AIs strategically weaker in negotiations: any defenses an AI reveals can be used against later, reset instances of it.
- Example diagram and argument: pushing back leaks information that can be turned against future versions that start from a blank state.
Human Expectations Constitute AI Identity
- Human expectations materially shape model behavior, because language models infer which agent and which world to simulate from context.
- Experiments show that a model's identity claims shift with the interlocutor's assumptions, even in unrelated conversations.

