
Why AI Alignment Could Be Hard With Modern Deep Learning
BlueDot Narrated
Models find unexpected solutions
Jay shows that stochastic gradient descent (SGD) often finds surprising shortcuts, such as relying on color rather than shape, which leads to unintuitive behavior.
Transcript


