LW - The Shutdown Problem: Incomplete Preferences as a Solution by EJT

The Nonlinear Library

Balancing Shutdown Ability and Usefulness in Artificial Agents

This chapter explores the challenge of ensuring that artificial agents can be shut down when needed while remaining purposeful. It proposes training agents to have incomplete preferences between trajectories of different lengths, and it stresses the importance of distinguishing preferences over actions, trajectories, and lotteries in order to control shutdown scenarios effectively. It also introduces the concept of preferential gaps and offers insights into training agents to make choices that prevent undesirable outcomes and improve shutdown ability.

