LessWrong (Curated & Popular)

Aug 21, 2024 • 19min

“AGI Safety and Alignment at Google DeepMind: A Summary of Recent Work” by Rohin Shah, Seb Farquhar, Anca Dragan

Join Rohin Shah, a key member of Google DeepMind's AGI safety team, alongside Seb Farquhar, an existential risk expert, and Anca Dragan, a safety researcher. They dive into the evolving strategies for ensuring AI alignment and safety. Topics include techniques for interpreting neural models, the challenges of scalable oversight, and the ethical implications of AI development. The trio also discusses future plans to address alignment risks, emphasizing the importance of collaboration and the role of mentorship in advancing AGI safety.
Aug 15, 2024 • 20min

“Fields that I reference when thinking about AI takeover prevention” by Buck

Explore the parallels between AI takeover risks and other high-stakes scenarios like nuclear meltdowns. Discover how insights from computer security and physical safety engineering contribute to robust AI safety measures. Delve into the history of power structures to understand their relevance in current AI control discussions. Learn about the complexities of insider threats and the importance of regulatory frameworks in safeguarding sensitive technological environments.
Aug 13, 2024 • 38min

“WTH is Cerebrolysin, actually?” by gsfitzgerald, delton137

Dan Elton, a neuroscience blogger, dives into the controversial substance Cerebrolysin with co-author gsfitzgerald. They discuss its dubious origins from pig brain tissue and the hype surrounding its supposed cognitive benefits, including its promotion by health-focused entrepreneur Bryan Johnson. The pair scrutinize the questionable scientific backing behind its claims and the misleading marketing tactics used for promotion. They also highlight the lack of regulatory oversight and call for a more transparent evaluation of its effectiveness as a treatment for neural diseases.
Aug 10, 2024 • 23min

“You can remove GPT2’s LayerNorm by fine-tuning for an hour” by StefanHex

Dive into the fascinating world of fine-tuning GPT-2 as researchers tackle the removal of Layer Normalization. Discover the interpretability challenges posed by this modification and how it impacts model performance. Listen as they break down the methodologies used and compare results of the modified model against traditional setups. The conversation also covers theoretical insights regarding generalization and training stability, making for an engaging exploration of AI model optimization.
Aug 9, 2024 • 4min

“Leaving MIRI, Seeking Funding” by abramdemski

The author reflects on leaving a research position at MIRI and the shifting landscape of funding challenges. They discuss new directions in agent foundations and the crucial role of trust among intelligent systems. The discussion delves into the contrast between public and private research, highlighting the need for transparency while grappling with the complications of secrecy. Ultimately, the author shares their journey toward securing funding and a renewed focus on impactful research.
Aug 8, 2024 • 4min

“How I Learned To Stop Trusting Prediction Markets and Love the Arbitrage” by orthonormal

Discover the intriguing world of prediction markets and their pitfalls. The discussion dives into a flawed market that stirs up controversy around a political candidate’s VP pick. It reveals how easily these markets can be manipulated to promote specific political agendas. Tune in to hear about the speaker's journey from skepticism to appreciation for the entertaining chaos of prediction markets, all while keeping an eye on their real-world implications.
Aug 7, 2024 • 16min

“This is already your second chance” by Malmesbury

A colossal ivory cube descends, carrying instructions to save humanity from an AI apocalypse. In a humorous twist, Kublai Khan engages in witty banter with an AI while tackling ethical dilemmas surrounding super-intelligent technology. The tale involves absurd tasks to be completed in 2024, blending satire with philosophical musings. With imaginative storytelling, it highlights the challenges of navigating current technological threats and reflects on human behavior in the face of impending doom.
Aug 7, 2024 • 20min

“0. CAST: Corrigibility as Singular Target” by Max Harms

Dive into the intriguing concept of corrigibility in AI, where the discussion pivots from confusion to clarity. Discover how this single property can be crucial for creating agents that are both effective and safe. Learn about innovative strategies for measuring and enhancing this quality in AI development. The podcast critiques the usual mix of goals and proposes a streamlined focus to improve outcomes. Prepare for a journey through the nuances of AI behavior and safety that could redefine future advancements.
Aug 7, 2024 • 23min

“Self-Other Overlap: A Neglected Approach to AI Alignment” by Marc Carauleanu, Mike Vaiana, Judd Rosenblatt, Diogo de Lucena

With acknowledged contributions from Bogdan Ionut-Cirstea, Steve Byrnes, Gunnar Zarnacke, Jack Foxabbott, and Seong Hah Cho, the authors present an intriguing concept called self-other overlap, which aims to align AI models by optimizing the overlap between how they reason about themselves and about others. Early experiments suggest this technique can reduce deceptive behaviors in AI. With its scalable nature and minimal need for interpretability, self-other overlap could be a game-changer in creating pro-social AI.
Aug 7, 2024 • 9min

“You don’t know how bad most things are nor precisely how they’re bad.” by Solenoid_Entity

Dive into the intriguing world of discernment, where time and attention significantly enhance our understanding of quality. Explore the nuances of piano tuning, revealing how even experts struggle to detect subtle flaws. Discover the complexities of awareness, and how often we overlook our own blind spots. This discussion highlights the perils of relying on automation in tasks requiring skilled judgment, emphasizing the intricate details in reality that often go unnoticed.
