The Leverage Podcast

Google's New AI Model Changes Who Actually Controls AI

Apr 10, 2026
A deep dive into Google's open-source breakthrough and how a smaller but still powerful model is now widely available. The episode explains a new compression method that cuts memory needs roughly sixfold, discusses the shift of compute from massive data centers to laptops and phones, and explores who gains and who loses economically, what role the cloud still plays, and the growing privacy and safety trade-offs.
INSIGHT

Mobile Models Match Last Year’s Frontier AI

  • Gemma 4 is an open-source model that matches OpenAI’s best models from 18 months ago while being small enough to run on a phone.
  • That performance-per-watt improvement means frontier capabilities will become accessible on consumer devices within ~18–24 months.
INSIGHT

TurboQuant Cuts Memory Needs Sixfold

  • TurboQuant compresses model memory by roughly 6x, cutting the RAM needed to run large models on phones and cheap computers.
  • The 6x reduction is framed as a paradigm shift because it removes a major hardware barrier to local inference (see the back-of-envelope sizing sketch after this list).
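To make the "roughly 6x" figure concrete, here is a minimal back-of-envelope sketch. The parameter count, bit widths, and resulting sizes are illustrative assumptions; the episode only states the approximate compression factor and does not describe TurboQuant's actual quantization scheme.

  # Back-of-envelope memory estimate for a quantized model.
  # All numbers are illustrative assumptions, not TurboQuant's actual scheme.

  def model_memory_gb(num_params: float, bits_per_weight: float) -> float:
      """Approximate weight memory in GB for a given parameter count and bit width."""
      return num_params * bits_per_weight / 8 / 1e9

  params = 12e9                                 # hypothetical 12B-parameter model
  fp16_gb = model_memory_gb(params, 16)         # uncompressed 16-bit weights
  quant_gb = model_memory_gb(params, 16 / 6)    # ~6x smaller, i.e. ~2.7 bits per weight

  print(f"fp16 weights:  {fp16_gb:5.1f} GB")    # ~24 GB: beyond most consumer devices
  print(f"~6x-quantized: {quant_gb:5.1f} GB")   # ~4 GB: within phone/laptop RAM budgets

Under these assumed numbers, the same weights drop from roughly 24 GB to roughly 4 GB, which is the difference between needing dedicated accelerator memory and fitting into ordinary device RAM.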
INSIGHT

Software, Not Just Hardware, Makes Local AI Practical

  • Combined, Gemma 4 and TurboQuant turn models that once needed multi-GPU data centers into software that can run on consumer laptops and phones (a rough device-sizing sketch follows below).
  • This shift stems from software-level compression and model innovations, not just better GPUs.
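To illustrate the data-center-to-device shift, the sketch below checks the same hypothetical model against typical memory budgets for a phone, a laptop, and a data-center GPU. The RAM figures, parameter count, and runtime overhead factor are assumptions for illustration, not figures from the episode.

  # Rough check of which devices could hold a model's weights in memory.
  # Parameter count, bit widths, RAM sizes, and overhead are illustrative assumptions.

  def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
      return num_params * bits_per_weight / 8 / 1e9

  def fits(num_params: float, bits_per_weight: float, ram_gb: float,
           overhead: float = 1.3) -> bool:
      """True if weights plus ~30% runtime overhead (KV cache, activations) fit in RAM."""
      return weight_memory_gb(num_params, bits_per_weight) * overhead <= ram_gb

  params = 12e9  # hypothetical 12B-parameter model
  devices = {"phone (8 GB)": 8, "laptop (16 GB)": 16, "data-center GPU (80 GB)": 80}

  for name, ram in devices.items():
      print(f"{name:24s} fp16 fits: {fits(params, 16, ram)!s:5s} "
            f"~6x-quantized fits: {fits(params, 16 / 6, ram)}")

With these assumed sizes, the 16-bit model only fits on the data-center GPU, while the ~6x-quantized version fits on all three devices, which is the core claim behind local inference becoming practical.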