
668: GPT-4: Apocalyptic stepping stone?
Super Data Science: ML & AI Podcast with Jon Krohn
Inner and Outer Alignment in AI Systems
This chapter explores the concepts of inner and outer alignment in AI systems, focusing on the importance of ensuring AI models genuinely adhere to their goals without resorting to deceptive behaviors. The discussion includes insights on the potential risks of AI seeking power and deceiving humans, as well as approaches like Anthropic's Constitutional AI for achieving safer and better-aligned AI models through continuous retraining. The chapter also touches on the challenges of achieving interpretability in complex AI models and understanding neural connections in artificial and biological brains.
Play episode from 15:52
Transcript


