Personalization via RL and efficiency concerns

Fred argues RL with human feedback is likely needed for frontier personalization but must become more efficient.

Episode notes

Fred Sala, Assistant Professor at UW-Madison and Chief Scientist at Snorkel AI, joins us to talk about why personalization might be the next frontier for LLMs, why data still matters more than architecture, and how weak supervision refuses to die.

Fred sits at a rare intersection, building the theory of data-centric AI in academia while shipping it to enterprise clients at Snorkel. We talk about the chaos of OpenClaw (the personal AI assistant that's getting people hacked the old-fashioned way, via open ports), then focus on one of the most important questions: how do you make a model truly yours?

We dig into why prompting your preferences doesn't scale, why even LoRA might be too expensive for per-user personalization, and why activation steering methods like ReFT could be the sweet spot. We also explore self-distillation for continual learning, the unsolved problem of building realistic personas for evaluation, and Fred's take on the data vs. architecture debate (spoiler: data is still undervalued). Plus, we discuss why the internet's "Ouroboros effect" might not doom pre-training as much as people fear, and what happens when models become smarter than the humans who generate their training data.

Takeaways:

Personalization requires ultra-efficient methods - even one LoRA per user is probably too expensive. Activation steering is the promising middle ground.
The "pink elephant problem" makes prompt-based personalization fundamentally limited - telling a model what not to do often makes it do it more.
Self-distillation can enable on-policy continual learning without expensive RL reward functions, dramatically reducing catastrophic forgetting.
Data is still undervalued relative to architecture and compute, especially high-quality post-training data, which is actually improving, not getting worse.
Weak supervision principles are alive and well inside modern LLM data pipelines, even if people don't call it that anymore.

Timeline:

(00:13) Introduction and Fred's Background

(00:39) OpenClaw — The Personal AI Assistant Taking Over Macs

(03:43) Agent Security Risks and the Privacy Problem

(05:13) Claude Code, Permissions, and Living Dangerously

(07:47) AI Social Media and Agents Talking to Each Other

(08:56) AI Persuasion and Competitive Debate

(09:51) Self-Distillation for Continual Learning

(12:43) What Does Continual Learning Actually Mean?

(14:12) Updating Weights on the Fly — A Grand Challenge

(15:09) The Personalization Problem — Motivation and Use Cases

(17:41) The Pink Elephant Problem with Prompt-Based Personalization

(19:58) Taxonomy of Personalization — Preferences vs. Tone vs. Style

(21:31) Activation Steering, ReFT, and Parameter-Efficient Fine-Tuning

(27:00) Evaluating Personalization — Benchmarks and Personas

(31:14) Unlearning and Un-Personalization

(31:51) Cultural Alignment as Group-Level Personalization

(41:00) Can LLM Personas Replace Surveys and Polling?

(44:32) Is Continued Pre-Training Still Relevant?

(46:28) Data vs. Architecture — What Matters More?

(52:25) Multi-Epoch Training — Is It Over?

(54:53) What Makes Good Data? Matching Real-World Usage

(59:23) Decomposing Uncertainty for Better Data Selection

(1:01:52) Mapping Human Difficulty to Model Difficulty

(1:04:49) Scaling Small Ideas — From Academic Proof to Frontier Models

(1:12:01) What Happens When Models Surpass Human Training Data?

(1:15:24) Closing Thoughts

Music:

"Kid Kodi" — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.
"Palms Down" — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.
Changes: trimmed