LW - The Shutdown Problem: Incomplete Preferences as a Solution by EJT

The Nonlinear Library

Analyzing Shutdown Decisions in AI Agents

This chapter examines shutdown decisions in AI agents, focusing on the trade-off between immediate utility and expected total utility. It uses lotteries and probability mass diagrams to compare the consequences of resisting shutdown with those of not resisting. The discussion covers the concept of timestep dominance (TD), the behavior of TD agents, and the training methods needed to make advanced AI adhere to shutdown policies.
