
Last Week in AI #238 - GPT 5.4 mini, OpenAI Pivot, Mamba 3, Attention Residuals
Mar 26, 2026

Breaking AI product launches and model releases are unpacked, from ultra-long-context mini/nano models to an open-sourced MoE family. Local and sandboxed agent runtimes for Macs and servers are debated alongside Nvidia's graphics AI and chip forecasts. Business pivots, geopolitical GPU moves, and fresh safety and research work on steganography, chain-of-thought, attention residuals, and state-space model advances are highlighted.
AI Snips
Many Chains of Thought Are Performative, Not Reflective
- 'Reasoning Theater' shows that many chains of thought are performative: models often know the answer early but still produce verbose reasoning.
- Jeremie notes that attention probes and forced-answer interrupts reveal internal confidence can precede the textual chain of thought, especially on easy tasks (a minimal interrupt sketch follows below).
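
A minimal sketch of a forced-answer interrupt, assuming a Hugging Face causal LM. The model name, prompt template, and "Answer:" cue are illustrative assumptions, not details from the episode or the paper. The idea: cap the chain-of-thought budget, force an immediate answer, and compare against the full-budget answer.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-0.5B-Instruct"  # hypothetical choice of small model
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

def forced_answer(question: str, cot_budget: int) -> str:
    """Let the model 'think' for at most cot_budget tokens, then interrupt
    it with a forced answer cue and decode a short answer."""
    prompt = f"Question: {question}\nLet's think step by step.\n"
    ids = tok(prompt, return_tensors="pt").input_ids
    if cot_budget > 0:
        # Phase 1: generate a truncated chain of thought.
        ids = model.generate(ids, max_new_tokens=cot_budget, do_sample=False)
    # Phase 2: append the interrupt cue and force a short answer.
    cue = tok("\nAnswer:", return_tensors="pt", add_special_tokens=False).input_ids
    ids = torch.cat([ids, cue], dim=-1)
    out = model.generate(ids, max_new_tokens=8, do_sample=False)
    return tok.decode(out[0, ids.shape[-1]:], skip_special_tokens=True)

# If the zero-budget answer already matches the full-budget answer on easy
# items, the verbose chain of thought was performative, not load-bearing.
for budget in (0, 32, 128):
    print(budget, forced_answer("What is 17 + 25?", budget).strip())
```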
Avoid Giving Models Excess Compute For Simple Tasks
- Avoid over-provisioning model capability relative to task difficulty to reduce risky or unnecessary behavior.
- Jeremie suggests matching model size and test-time compute to problem difficulty to limit performative reasoning and unnecessary capability overhang (a toy routing sketch follows below).
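
A toy sketch of compute-to-difficulty routing under stated assumptions: the model tiers, thresholds, and heuristic difficulty scorer are all invented for illustration (a real router might use a small trained classifier), none of it from the episode.

```python
from dataclasses import dataclass

@dataclass
class Tier:
    model: str             # hypothetical model identifier
    max_think_tokens: int  # test-time compute budget for reasoning

# Difficulty thresholds (assumed) mapped to increasing compute tiers.
TIERS = [
    (0.3, Tier("mini-model", 0)),         # easy: answer directly, no CoT
    (0.7, Tier("mid-model", 256)),        # medium: short reasoning budget
    (1.1, Tier("frontier-model", 2048)),  # hard: full budget
]

def score_difficulty(prompt: str) -> float:
    """Toy proxy: longer, question-dense prompts score as harder."""
    return min(1.0, len(prompt.split()) / 200 + prompt.count("?") * 0.1)

def route(prompt: str) -> Tier:
    """Pick the cheapest tier whose threshold covers the difficulty score."""
    d = score_difficulty(prompt)
    for threshold, tier in TIERS:
        if d < threshold:
            return tier
    return TIERS[-1][1]

print(route("What is 2 + 2?"))  # -> mini-model, zero reasoning budget
```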
Interleave General Instructions To Prevent Fine-Tuning Misalignment
- In-training defenses against emergent misalignment work best by interleaving general instruction data with fine-tuning data, rather than relying only on KL or activation regularization.
- Jeremie finds the interleaving approach preserves safety behavior while allowing targeted fine-tuning (a minimal data-mixing sketch follows below).
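
A minimal data-mixing sketch of the interleaving idea as summarized above; the mixing ratio, dataset placeholders, and helper name are assumptions for illustration, not the paper's recipe.

```python
import random

def interleave(task_data, general_data, general_frac=0.25, seed=0):
    """Yield a training stream where roughly general_frac of examples come
    from the general instruction set, keeping all targeted fine-tune data."""
    rng = random.Random(seed)
    general = list(general_data)
    for example in task_data:
        yield example
        # Emit a general example with probability p such that the expected
        # fraction of general data in the stream is general_frac.
        if rng.random() < general_frac / (1 - general_frac):
            yield rng.choice(general)

# Placeholder datasets (assumed format: dicts with a "text" field).
task = [{"text": f"narrow task example {i}"} for i in range(8)]
general = [{"text": f"general instruction example {i}"} for i in range(100)]

mixed = list(interleave(task, general))
print(sum("general" in ex["text"] for ex in mixed), "of", len(mixed),
      "examples are general instruction data")
```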
