Navigating Inner and Outer Alignment in Narrow AI Ethics

Exploring the ambiguity of defining reward functions and of distinguishing inner from outer alignment challenges in narrow AI applications. Discussing how responsibility for ethical AI behavior is shared between the reward function and the training process.

