Inception Labs says its diffusion LLM is 10x faster than Claude, ChatGPT, Gemini

The New Stack Podcast

Mercury 2 design and latency choices

Stefano discusses Mercury 2's focus on latency, the inference trade-offs involved, and the target use cases for speed-optimized models.
