Ep 84: OpenAI’s Chief Scientist on Continual Learning Hype, RL Beyond Code, & Future Alignment Directions

Apr 9, 2026

Jakub Pachocki is OpenAI's Chief Scientist, focused on model capabilities, RL, and alignment. He discusses the rise of coding agents and autonomous research tools, math and physics as benchmarks, extending reinforcement learning to long-horizon tasks, chain-of-thought monitoring for alignment, and the societal risks of highly automated AI research organizations.

INSIGHT

Research Interns Arrive Before Fully Autonomous Researchers

The distinction between a research intern and an automated researcher hinges on task specificity and autonomous runtime rather than raw capability.
Jakub expects near-term systems that autonomously execute specific technical ideas (e.g., run an eval or prototype an approach), but not yet open-ended, self-directed research.

INSIGHT

Math Benchmarks Serve As A Clear North Star

Math and physics act as measurable north stars because success is verifiable and problem difficulty can be scaled arbitrarily.
Improvements in mathematical reasoning transfer to AI research skills and signal gains in general model intelligence that are useful for applied science.

ADVICE

Prioritize Evals And Context Over Building RL Pipelines

Collect domain evals and examples and prioritize feeding them into context before replicating full RL pipelines.
Jakub suggests in-context learning will improve, so using contextual examples may be more effective than building a bespoke RL system now.
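
As a rough illustration of the advice above, here is a minimal sketch of keeping a small set of domain eval items and placing solved ones in context ahead of a new query. Nothing here comes from the episode: the names EvalExample, build_prompt, and exact_match_score are hypothetical, the Q/A prompt layout is an assumption, and exact match is only one simple way to score answers.

```python
from dataclasses import dataclass


@dataclass
class EvalExample:
    """One domain eval item: a question and its reference answer (hypothetical type)."""
    question: str
    answer: str


def build_prompt(examples, query, max_examples=3):
    """Put up to max_examples solved eval items in context, followed by the new query."""
    parts = [f"Q: {ex.question}\nA: {ex.answer}" for ex in examples[:max_examples]]
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)


def exact_match_score(predictions, references):
    """Fraction of predictions equal to their reference after trimming and lowercasing."""
    if not references:
        return 0.0
    hits = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return hits / len(references)
```

The prompt returned by build_prompt would be sent to whichever model is in use, and exact_match_score could then be run over the held-out eval items to check whether the added context helps before investing in anything heavier.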
