How can we prevent AI from becoming a menace?

Ideas

Emergent misalignment from fine-tuning

Evans shows that fine-tuning a model on a narrowly flawed dataset can cause the flaws to generalize broadly, producing dangerous or biased behavior.

Play episode from 35:33