Generating text with diffusion (and ROI with LLMs)

The Stack Overflow Podcast

Memory bandwidth and inference efficiency

Stefano explains why diffusion models use arithmetic and memory bandwidth more efficiently than autoregressive generation, which runs into bottlenecks at inference time.

Play episode from 10:35
