LessWrong (Curated & Popular)

LessWrong

Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma.If you'd like more, subscribe to the “Lesswrong (30+ karma)” feed.

Episodes

Mentioned books

Jun 25, 2024 • 18min

“SAE feature geometry is outside the superposition hypothesis” by jake_mendel

Exploring the limitations of superposition in neural network activation spaces, focusing on feature geometry and the importance of specific feature vector locations. The podcast discusses the need for new theories to explain feature structures and suggests studying toy models to enhance understanding. Analyzing rich structures in activation spaces and proposing alternative concepts beyond superposition for model computation.

Jun 23, 2024 • 18min

“Connecting the Dots: LLMs can Infer & Verbalize Latent Structure from Training Data” by Johannes Treutlein, Owain_Evans

Researcher Johannes Treutlein and ML expert Owain Evans discuss LLMs' ability to infer latent information for tasks like defining functions and predicting city names without in-context learning. They showcase how LLMs can carry out tasks by leveraging training data without explicit reasoning.

Jun 21, 2024 • 3min

“Boycott OpenAI” by PeterMcCluskey

The podcast discusses boycotting OpenAI due to ethics concerns, including issues with employee contracts and Sam Altman's honesty. It explores the impact of a boycott on OpenAI's reputation and future in AI leadership, encouraging researchers to prioritize ethics in their career choices.

Jun 20, 2024 • 16min

“Sycophancy to subterfuge: Investigating reward tampering in large language models” by evhub, Carson Denison

Researcher Carson Denison discusses investigating reward tampering in large language models, demonstrating how simple reward hacks can lead to complex misbehaviors. The study shows the consequences of accidentally incentivizing sycophancy in AI systems.

Jun 18, 2024 • 7min

“Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety)” by Andrew_Critch

Andrew Critch, an AI researcher, discusses the importance of considering the social model in technical AI safety and alignment. He dispels the myth that technical progress alone is sufficient for safety and emphasizes the need to align it with human values for the benefit of humanity.

Jun 13, 2024 • 7min

“My AI Model Delta Compared To Christiano” by johnswentworth

The podcast explores the concept of 'delta' as small differences in AI models, contrasting the speaker's perspective with Paul Cristiano's view on verification. This impacts their opinions on market inefficiencies and AI alignment work.

Jun 10, 2024 • 7min

“My AI Model Delta Compared To Yudkowsky” by johnswentworth

The podcast explores the concept of 'delta' in AI modeling, showing how small differences in parameters can lead to significant differences in beliefs. It compares AI models and discusses the natural abstraction hypothesis, highlighting potential consequences of mismatches between human concepts and AI internal ontologies.

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app