LessWrong (Curated & Popular)

Oct 23, 2023 • 26min

[HUMAN VOICE] "Alignment Implications of LLM Successes: a Debate in One Act" by Zack M. Davis

The podcast explores the challenges of aligning AI with human values and the concept of corrigible AI. It discusses the potential and limitations of large language models (LLMs) and the repetition trap phenomenon. A debate ensues about the implications of AI alignment challenges and the risks of misgeneralized obedience in AI. Overall, it delves into the complex and evolving field of AI alignment.
Oct 23, 2023 • 33min

"LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B" by Simon Lermen & Jeffrey Ladish

Simon Lermen and Jeffrey Ladish discuss LoRA fine-tuning and its impact on safety training. They explore the effectiveness of safety procedures, the QLoRA technique, dark topics such as slurs and brutal killings, the effect of model size on harmful task performance, a hypothetical plan for AI attack and control, and an analysis of refusals across comparison instruction sets.
Oct 23, 2023 • 50min

"Holly Elmore and Rob Miles dialogue on AI Safety Advocacy" by jacobjacob, Robert Miles & Holly_Elmore

Holly Elmore, organizer of AI Pause protests, and Rob Miles, AI Safety YouTuber, explore the effectiveness of protests, the role of activists in technological advancements, the misconception of technical work, the clash between advocacy and truth-seeking, the importance of rationality, and the significance of advocacy in AI safety.
Oct 19, 2023 • 3min

"Labs should be explicit about why they are building AGI" by Peter Barnett

The podcast discusses the importance of transparency from AI labs building AGI, stressing the need to communicate risks to the public and policymakers.
Oct 18, 2023 • 21min

[HUMAN VOICE] "Sum-threshold attacks" by TsviBT

The podcast discusses sum-threshold attacks and the importance of coordinated arguments. It explores adversarial image attacks and how small changes can deceive AI classifiers. The concept of optimization channels and the notion of a vector space representing noticeable features are also explored.
Oct 18, 2023 • 17min

"Will no one rid me of this turbulent pest?" by Metacelsus

This podcast discusses the potential of gene drives to end malaria and the need for deployment to save lives. It explores the technical details of gene drive construction, ordering DNA, setting up mosquito breeding facilities, implementing gene drives to combat malaria, and overcoming political barriers in Africa.
Oct 15, 2023 • 10min

[HUMAN VOICE] "Inside Views, Impostor Syndrome, and the Great LARP" by John Wentworth

Yoshua Bengio, a Turing Award winner for deep learning research, discusses the importance of deep models and understanding in ML. Topics include Unitary Evolution Recurrent Neural Networks and gradient explosion and vanishing in recurrent nets. The episode also explores impostor syndrome and how fields make progress through feedback loops and admitting lack of knowledge.
Oct 15, 2023 • 12min

"RSPs are pauses done right" by evhub

This podcast explores the importance of Responsible Scaling Policies (RSPs) in preventing AI existential risk and emphasizes the need for public support. It discusses capabilities evaluation, safety evaluation, and the role of RSP commitments in ensuring AI safety, along with the significance of mechanistic interpretability and of leveraging influence within AI labs. It weighs the effectiveness of a labs-first approach to AI progress, and argues that RSPs are concrete and actionable compared with advocating for a pause in development.
Oct 15, 2023 • 32min

"Cohabitive Games so Far" by mako yass

The podcast discusses cohabitive games, which combine cooperation and competition. It explores the importance of negotiation in board games, introduces character concepts, and considers how to enhance cooperative bargaining games, drawing a distinction between maximizing scores and engaging with real-world elements. It also discusses the factors contributing to group success in cohabitive games and the potential of virtual reality in gameplay.
Oct 15, 2023 • 7min

"Announcing MIRI’s new CEO and leadership team" by Gretta Duleba

Gretta Duleba announces MIRI's new CEO and leadership team, covering the leadership transition, the organization's shift toward broad public communication, and its strategic plans for addressing AI's existential risk.
