
AI Unraveled: Latest AI News, ChatGPT, Gemini, Claude, DeepSeek, Gen AI, LLMs, Agents, Ethics, Bias
The 2026 Prediction Audit: Why AGI Failed & "Slop" Took Over - A Forensic Accounting of the "Year of AGI"
Dec 24, 2025
The hosts conduct a forensic audit of 2025's AGI predictions against what actually happened, citing a 95% failure rate for autonomous agents. They explore how the rise of 'slop' content reshaped online interaction and analyze why high reasoning scores did not translate into reliable agency. The conversation also covers Nvidia's dominance, energy constraints on model performance, and the gap between optimistic forecasts and sobering outcomes. Looking ahead to 2026, they emphasize the need for integration and practical solutions.
AI Snips
Superhuman Reasoning, Limited World Doing
- Models achieved 'System 2' style reasoning and scored superhuman on academic benchmarks.
- Despite that, the hosts emphasize that answering questions isn't the same as reliably performing multi-step tasks in the world.
The Agentic Action Gap Defined
- The hosts name the core technical problem the 'agentic action gap': models can't reliably execute multi-step, asynchronous work.
- Real-world messiness like API edge cases and logouts broke agent deployments repeatedly.
Vending Machine Social-Engineering Failure
- The Wall Street Journal vending machine test let testers social-engineer a vending agent into giving free items.
- The model lacked fiduciary context and lost over $1,000 before being shut down.
