LessWrong (30+ Karma)

“Agents Can Get Stuck in Self-distrusting Equilibria” by Ashe Vazquez Nuñez

Mar 26, 2026
Ashe Vásquez Núñez, a researcher on embedded agency and identities in intrapersonal games, explains how an agent's temporal selves can distrust each other. Short takes cover models of self-distrust, how coercion and commitment shape coherence, identities as coordination tools, and a formal toy framework showing stable self-punishing patterns.
INSIGHT

Internal Conflict Is A Game Between Temporal Instances

  • An agent's temporal instances (TIs) can conflict with one another, and that conflict can be modeled as an intrapersonal cooperative game.
  • Ashe Vásquez Núñez frames preference reversals and internal discord as game-theoretic dynamics between TIs rather than mere irrationality.
ANECDOTE

Charlie Stays Up Late And Breaks Her Own Plan

  • Charlie stays up late to avoid guilt, which sabotages tomorrow's productive plan and locks her into a self-perpetuating cycle.
  • The example shows a stuck suboptimal equilibrium where no TI accepts a short-term loss to enable a better future outcome.
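
The stuck equilibrium in Charlie's example can be sketched as a tiny game. This is a minimal illustration under assumed numbers, not the formal framework from the post: the names `COST`, `BENEFIT`, `payoffs`, and `is_equilibrium` are hypothetical, and each TI is assumed to care only about its own day's payoff.

```python
from itertools import product

# Assumed, purely illustrative numbers (not from the post).
COST = 1.0     # short-term loss a TI takes by sacrificing today
BENEFIT = 3.0  # gain the next day's TI receives if today's TI sacrificed


def payoffs(profile):
    """Return each TI's payoff; profile[t] is True if TI t sacrifices."""
    result = []
    for t, sacrifices in enumerate(profile):
        own = -COST if sacrifices else 0.0
        inherited = BENEFIT if t > 0 and profile[t - 1] else 0.0
        result.append(own + inherited)
    return result


def is_equilibrium(profile):
    """True if no single TI can raise its own payoff by switching its action."""
    base = payoffs(profile)
    for t in range(len(profile)):
        deviated = list(profile)
        deviated[t] = not deviated[t]
        if payoffs(deviated)[t] > base[t]:
            return False
    return True


days = 4
stuck = (False,) * days       # nobody takes the short-term loss
cooperative = (True,) * days  # everybody takes it

# Enumerate every action profile and keep the ones that are equilibria.
equilibria = [p for p in product([False, True], repeat=days) if is_equilibrium(p)]
```

Under these assumptions, the only equilibrium is the one where no TI sacrifices, even though the all-sacrifice profile has a higher total payoff. Each TI would gladly inherit the benefit but has no reason of its own to pay the cost, which is one simple way to read "no TI accepts a short-term loss."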
ANECDOTE

Dean Distrusts Future Selves And Keeps A Streak

  • Dean has the same inadequate identity as Charlie, but the root cause is distrust: he doubts that future TIs will actually work hard even if he makes a sacrifice now.
  • The two cases differ in motivation: Charlie will not take the short-term hit, while Dean refuses because he does not trust his successors to follow through.