The Bayesian Conspiracy

33 – MIRI, and EA meta-discussion

Apr 26, 2017
Tsvi, a researcher at MIRI working on decision theory and AI alignment, outlines MIRI's technical agendas and toy failure cases such as a Tetris-pausing agent and evolved circuits. The discussion covers logical induction, counterfactuals, the limits of transparency, and scaling human judgment with ML, and touches on community capacity, staffing trade-offs, and why capability growth shapes long-term safety.
AI Snips
ANECDOTE

Evolved Circuits Became Accidental Radios

  • Bird and Layzell evolved circuits intended to be oscillators, but the process instead produced a radio that picked up ambient radio-frequency signals.
  • The evolved circuit worked only in one location in the room, showing how optimization can produce brittle, emergent solutions (see the toy sketch below).
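A minimal Python sketch of the underlying failure mode, assuming an invented toy setup (the ROOM_SIGNAL constant and fitness function are hypothetical, not Bird and Layzell's actual hardware experiment): an evolutionary search whose fitness quietly depends on an ambient quirk of its evaluation environment looks solved until the environment changes.

```python
import random

ROOM_SIGNAL = 0.73  # hypothetical ambient quirk of the "room" the circuit evolves in

def fitness(genome, ambient):
    # The search is rewarded for coupling to the ambient signal,
    # not for solving the task in a location-independent way.
    return -abs(genome - ambient)

def evolve(ambient, generations=200, pop_size=50):
    pop = [random.uniform(0, 1) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: fitness(g, ambient), reverse=True)
        survivors = pop[: pop_size // 2]  # keep the fitter half
        pop = survivors + [g + random.gauss(0, 0.05) for g in survivors]  # mutate copies
    return pop[0]

best = evolve(ROOM_SIGNAL)
print("fitness in the original room:", fitness(best, ROOM_SIGNAL))  # near 0: looks solved
print("fitness after moving rooms:  ", fitness(best, 0.21))         # much worse: brittle
```

The point mirrors the anecdote: nothing in the selection pressure distinguishes a genuinely robust solution from one that exploits this particular room.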
INSIGHT

Human Oversight Breaks Down When AI Uses Opaque Concepts

  • Human verification breaks down when AI reasoning relies on concepts humans don't share or can't inspect.
  • AlphaGo's justification for a move (enormous search trees plus neural-net evaluations) exemplifies AI reasoning humans cannot directly verify.
ADVICE

Work On AI Transparency Now

  • Invest in AI transparency and interpretability research to make AI decisions inspectable.
  • Tsvi cites existing ML tools, such as visualizing convolutional-net features, as early steps, but warns that full transparency at scale remains unsolved (see the sketch below).
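One concrete form of the feature-visualization work Tsvi points to is activation maximization. The PyTorch sketch below is an assumption-laden illustration (the choice of VGG16, layer index 10, and channel 5 is arbitrary), not a description of any specific tool from the episode; real implementations add regularizers such as jitter and blurring to get cleaner images.

```python
import torch
import torchvision.models as models

# Activation maximization: gradient ascent on an input image so that one
# convolutional channel responds strongly. Layer/channel choices are
# arbitrary examples for illustration.
model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).eval()
for p in model.parameters():
    p.requires_grad_(False)  # optimize the image, not the network

target_layer = model.features[10]  # a mid-network Conv2d in VGG16
activation = {}
target_layer.register_forward_hook(lambda mod, inp, out: activation.update(out=out))

img = torch.randn(1, 3, 224, 224, requires_grad=True)
optimizer = torch.optim.Adam([img], lr=0.05)

for step in range(100):
    optimizer.zero_grad()
    model(img)
    loss = -activation["out"][0, 5].mean()  # maximize channel 5's mean activation
    loss.backward()
    optimizer.step()

# `img` now approximates an input that excites that channel; inspecting many
# such channels is one early window into what the network has learned.
```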