
The Bayesian Conspiracy 33 – MIRI, and EA meta-discussion
Apr 26, 2017. The guest is Tsvi from MIRI, a researcher working on decision theory and AI alignment. He outlines MIRI's technical agendas and toy failure cases such as a Tetris-pausing agent and evolved circuits. The discussion covers logical induction, counterfactuals, the limits of transparency, and scaling human judgment with ML, and also touches on community capacity, staffing trade-offs, and why capability growth shapes long-term safety.
AI Snips
Evolved Circuits Became Accidental Radios
- Bird and Layzell evolved circuits intended to oscillate, but the search instead produced a radio that picked up ambient radio signals.
- The evolved circuit worked only in one spot in the room, showing how optimization finds brittle, environment-dependent solutions (a toy sketch follows below).
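A minimal sketch of the failure mode, not a reconstruction of Bird and Layzell's actual experiment: all names (ROOM, evolve, the fitness function) are illustrative. The genome can either generate the target oscillation itself or couple to whatever signal the "room" carries, and fitness-driven search doesn't care which.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 256)

# The behaviour we *want* to evolve: a compound oscillation.
TARGET = np.sin(t) + 0.5 * np.sin(3.7 * t + 1.0)

# The lab "room": ambient interference that happens to carry an
# oscillating signal (like the nearby equipment in the original story).
ROOM = TARGET + 0.05 * rng.normal(size=t.size)

def output(genome, ambient):
    # genome = [amplitude, frequency, phase, coupling-to-environment]
    a, w, p, c = genome
    return a * np.sin(w * t + p) + c * ambient

def fitness(genome, ambient):
    return -np.mean((output(genome, ambient) - TARGET) ** 2)

def evolve(ambient, pop=200, gens=400, sigma=0.05):
    genomes = rng.normal(size=(pop, 4))
    for _ in range(gens):
        scores = np.array([fitness(g, ambient) for g in genomes])
        elite = genomes[np.argsort(scores)[-20:]]           # keep top 10%
        parents = elite[rng.integers(0, len(elite), pop)]   # resample
        genomes = parents + sigma * rng.normal(size=(pop, 4))  # mutate
    return genomes[np.argmax([fitness(g, ambient) for g in genomes])]

best = evolve(ROOM)
print("coupling to the environment:", best[3])     # typically near 1
print("fitness in the original room:", fitness(best, ROOM))

# Move the "circuit" to another room: the ambient signal is gone and the
# evolved oscillator degrades, because evolution used the room, not the genome.
OTHER_ROOM = 0.05 * rng.normal(size=t.size)
print("fitness in a different room: ", fitness(best, OTHER_ROOM))
```

The design choice mirrors the anecdote: the low-capacity direct channel cannot fit the compound waveform, so selection pushes the coupling term toward the ambient signal, and the solution stops transferring the moment the environment changes.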
Human Oversight Breaks Down When AI Uses Opaque Concepts
- Human verification breaks down when an AI's reasoning relies on concepts humans don't share and can't inspect.
- AlphaGo's "justification" for a move, a huge search tree plus neural-net position evaluations, is an example of reasoning humans cannot verify (sketched below).
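To make the snip concrete, here is a toy sketch of search-plus-evaluation reasoning. Everything is hypothetical: the "value network" is an opaque stand-in, not AlphaGo's net, and the game is abstract. The point is only that the move's entire justification is thousands of uninterpretable numeric scores.

```python
# A toy "value network": a black box mapping positions to a score for the
# player to move. In AlphaGo this is a trained neural net; here it is a
# deterministic but opaque stand-in.
def value(position):
    return (hash(position) % 1000) / 1000.0

def negamax(position, depth, stats):
    """Depth-limited negamax over an abstract game with 5 moves per node."""
    stats["positions"] += 1
    if depth == 0:
        return value(position)
    children = [position + (m,) for m in range(5)]
    return max(-negamax(child, depth - 1, stats) for child in children)

stats = {"positions": 0}
root = ()  # positions are just tuples of the moves played so far
best_move = max(range(5), key=lambda m: -negamax(root + (m,), 4, stats))

print("chosen move:", best_move)
print("justification: the maximum over", stats["positions"],
      "opaque position evaluations -- nothing a human can audit")
```

Asked "why that move?", the only honest answer this system can give is the tree of scores itself, which is exactly the verification problem the snip describes.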
Work On AI Transparency Now
- Invest in AI transparency and interpretability research so that AI decisions become inspectable.
- Tsvi cites existing ML tools, such as visualizing convolutional-net features, as early steps, but warns that full transparency at scale remains unsolved (see the sketch below).
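A minimal sketch of the kind of tool the snip mentions, assuming PyTorch and torchvision rather than whatever specific tool was discussed on the episode: plotting a pretrained network's first-layer convolutional filters, which typically resolve into recognizable edge and color detectors. This is the easy case; deeper layers are where transparency gets hard.

```python
import torch
import torchvision
import matplotlib.pyplot as plt

# Fetch a pretrained ImageNet model (downloads weights on first run).
model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
model.eval()

# First-layer convolutional filters: shape (64, 3, 7, 7).
filters = model.conv1.weight.detach().clone()

# Rescale each filter to [0, 1] so it can be shown as an RGB patch.
f_min = filters.amin(dim=(1, 2, 3), keepdim=True)
f_max = filters.amax(dim=(1, 2, 3), keepdim=True)
filters = (filters - f_min) / (f_max - f_min)

fig, axes = plt.subplots(8, 8, figsize=(6, 6))
for ax, f in zip(axes.flat, filters):
    ax.imshow(f.permute(1, 2, 0))  # (C, H, W) -> (H, W, C) for plotting
    ax.axis("off")
fig.suptitle("First-layer filters: mostly edge and color detectors")
plt.show()
```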
