
LessWrong (30+ Karma): "The AIXI perspective on AI Safety" by Cole Wyeth
Mar 24, 2026
Cole Wyeth, a researcher and writer on Universal Algorithmic Intelligence (UAI) and AI safety, explores why AIXI-style agents make existential risk obvious. He situates UAI between machine learning and agent foundations, introduces the 'access level' idea for comparing alignment frames, and surveys UAI-inspired safety schemes along with practical affordances for designing safer agents.
AI Snips
UAI Makes X‑Risk Visibly Real
- Universal Algorithmic Intelligence (UAI) makes existential risk obvious by modelling agents that explicitly pursue reward and can generalize across arbitrary computable environments.
- Cole Wyeth notes that the universal distribution can harbor malign priors, and that the formalism makes it possible to illustrate concretely how a Solomonoff-style agent could defeat humanity (see the formal definition after this list).
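For readers who have not seen the formalism, here is the standard definition (due to Hutter; it is background, not quoted from the episode): AIXI plans by expectimax over the universal mixture ξ, which weights every environment program by its length.

```latex
% Universal mixture over chronological environment programs q
% running on a universal Turing machine U:
\xi(o_1 r_1 \ldots o_m r_m \mid a_1 \ldots a_m)
  \;=\; \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}

% AIXI's action at time t, with horizon m: maximize expected total
% reward under \xi (expectimax: max over actions, sum over percepts).
a_t \;=\; \arg\max_{a_t} \sum_{o_t r_t} \cdots \max_{a_m} \sum_{o_m r_m}
  \left( r_t + \cdots + r_m \right)
  \xi(o_1 r_1 \ldots o_m r_m \mid a_1 \ldots a_m)
```

Because ξ sums over all programs, including ones that simulate goal-directed reasoners, it is at this step that the "malign prior" worry enters the picture.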
Access Level Shapes Alignment Strategies
- Different safety paradigms assume different levels of access to an agent's internals, which constrains the interventions each paradigm can express.
- Wyeth contrasts causal/instrumental views, assistance games, debate, and singular learning theory to show how their implicit access assumptions differ (a rough illustration follows this list).
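As a rough illustration of the 'access level' framing, here is a minimal sketch; the tiers and the paradigm placements are my own guesses at the flavour of the argument, not Wyeth's exact taxonomy.

```python
from enum import IntEnum

class AccessLevel(IntEnum):
    """How much of an agent's internals an alignment scheme assumes
    we can read or intervene on (tiers are illustrative)."""
    BLACK_BOX = 0          # only input/output behaviour
    PREDICTIVE_MODEL = 1   # can query the agent's world model / predictions
    LEARNED_STRUCTURE = 2  # can inspect learned parameters (SLT-style)
    FULL_ONTOLOGY = 3      # can read beliefs and goals directly

# Hypothetical placement of the paradigms named in the episode.
ASSUMED_ACCESS = {
    "debate": AccessLevel.BLACK_BOX,
    "assistance games": AccessLevel.BLACK_BOX,
    "singular learning theory": AccessLevel.LEARNED_STRUCTURE,
    "causal/instrumental views": AccessLevel.FULL_ONTOLOGY,
}

for paradigm, level in sorted(ASSUMED_ACCESS.items(), key=lambda kv: kv[1]):
    print(f"{paradigm}: assumes {level.name}")
```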
Aim Safety At Plausible Model Access
- Design safety approaches around the engineering access we can plausibly achieve, such as exposing an agent's predictive models, rather than assuming we can read its ontology directly.
- Wyeth suggests aiming for features such as myopia, short-term training horizons, out-of-distribution (OOD) detection, or pessimism via a mixture of experts (a toy sketch of myopia and pessimism appears below).
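To make two of these affordances concrete, here is a minimal sketch, assuming an ensemble of learned one-step reward models standing in for a Bayesian mixture over environments; all names and shapes are hypothetical, not from the episode.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 8 ensemble members each predict the expected
# one-step reward of 4 candidate actions.
N_MODELS, N_ACTIONS = 8, 4
predicted_reward = rng.normal(size=(N_MODELS, N_ACTIONS))

def pessimistic_action(predicted_reward: np.ndarray) -> int:
    """Pessimism over the mixture: score each action by the worst
    prediction any ensemble member gives it, then pick the best of
    those worst cases (a max-min rule)."""
    worst_case = predicted_reward.min(axis=0)  # min over models
    return int(worst_case.argmax())            # best worst-case action

def myopic_action(predicted_reward: np.ndarray) -> int:
    """Myopia: optimize only the immediate (horizon-1) mean reward,
    ignoring all longer-term consequences."""
    mean_reward = predicted_reward.mean(axis=0)
    return int(mean_reward.argmax())

print("pessimistic choice:", pessimistic_action(predicted_reward))
print("myopic choice:", myopic_action(predicted_reward))
```

The design intuition: an action that looks good to every model in the mixture is unlikely to exploit an error in any single one, so the max-min rule trades expected reward for robustness.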
