
FUTURATI PODCAST Ep. 153: AI, Alignment, and the Scaling Hypothesis | Dwarkesh Patel
Feb 27, 2024
Dwarkesh Patel explores obstacles to AI progress, AI safety, and the scaling hypothesis. The conversation covers the controversy over which AI architectures could reach AGI and how to align AI with human values, along with speculation about the podcast's future growth and how production is becoming easier. It closes with banter about beard grooming and a deep passion for learning.
Agent Defined As Goal Seeking Over Time
- Dwarkesh defines an agent as an entity that wants a goal and stays oriented toward it over time, a definition spanning simple task-oriented agents up to broad goal-seeking systems.
- Practical RL benchmarks require measurable, evaluable tasks like code cleanup or pull requests to train agentic behavior.
Create Long Horizon Benchmarks To Detect Agency
- Build better benchmarks focused on long-horizon, multi-step tasks, because current tests (MMLU, HumanEval) are saturating and won't reveal agentic or higher-order capabilities.
- Use tasks where models must iterate, debug, and persist (e.g., chains of requests or AutoGPT-style code fixes).
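The long-horizon benchmark idea above can be sketched in code. This is a hypothetical illustration, not anything described in the episode: a tiny harness that scores an agent on whether it can iterate toward a verifiable goal within a step budget, rather than on one-shot accuracy. The names `run_task`, `check`, and `propose_fix` are invented for this sketch; `propose_fix` stands in for a model call that observes the current state and proposes a fix.

```python
# Hypothetical sketch of a long-horizon benchmark harness (not from the
# episode). The point: reward iteration and persistence toward a
# verifiable goal, not single-turn answers.

def run_task(check, propose_fix, state, max_steps=10):
    """Let the agent iterate until `check` passes or the budget runs out.

    Returns (solved, steps_used) so the benchmark can measure persistence
    across multiple attempts rather than one-shot accuracy.
    """
    for step in range(1, max_steps + 1):
        if check(state):
            return True, step - 1
        state = propose_fix(state)  # agent observes state, proposes an edit
    return check(state), max_steps

# Toy task standing in for "debug until the tests pass": nudge a number
# until it is divisible by 7.
solved, steps = run_task(
    check=lambda n: n % 7 == 0,
    propose_fix=lambda n: n + 1,  # stand-in for a model's repair attempt
    state=3,
)
print(solved, steps)  # True 4
```

A real harness would replace the lambdas with a test suite as `check` and an LLM call as `propose_fix`; the key design choice is that the score depends on multi-step behavior, which saturated single-turn benchmarks never exercise.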
Deep Learning Skepticism Versus Empirical Breakthroughs
- Skeptics like Gary Marcus argued that deep learning lacks the necessary structure, but Dwarkesh counters that scaling has repeatedly broken through predicted barriers and that different architectures can still achieve similar functions.
- He compares the mismatch to airplanes versus birds: form can differ while function (flight/intelligence) is achievable.