
Inception Labs says its diffusion LLM is 10x faster than Claude, ChatGPT, Gemini

The New Stack Podcast


Why denoising yields faster inference

Stefano explains how diffusion models refine many tokens in parallel on GPUs, and how needing far fewer denoising steps than sequential token-by-token generation enables large inference speedups.
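The intuition can be sketched with a toy masked-diffusion decoder (a simplified illustration, not Inception Labs' actual model): start from a fully masked sequence and commit several positions per denoising pass, so the number of passes shrinks well below the sequence length that an autoregressive decoder would need.

```python
import random

MASK = "▁"

def toy_denoise(target, tokens_per_step=4, seed=0):
    """Toy masked-diffusion decoder: begin fully masked, then on each
    denoising step commit several positions in parallel. Random choice
    stands in for picking the model's most-confident predictions."""
    rng = random.Random(seed)
    seq = [MASK] * len(target)
    steps = 0
    while MASK in seq:
        masked = [i for i, t in enumerate(seq) if t == MASK]
        # Refine up to `tokens_per_step` positions in one parallel pass.
        for i in rng.sample(masked, min(tokens_per_step, len(masked))):
            seq[i] = target[i]
        steps += 1
    return "".join(seq), steps

text, steps = toy_denoise(list("diffusion decoding"), tokens_per_step=6)
print(text, steps)  # 18 tokens recovered in 3 passes, not 18
```

An autoregressive decoder would take one forward pass per token (18 here); committing 6 tokens per denoising pass cuts that to 3, which is the source of the claimed speedup.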

