There Is No AI. There's a Stateless Function on 10,000 GPUs Pretending to Know You (Ep. 299)

Data Science at Home

Continuous batching to reduce latency

Francesco contrasts naive batch-waiting with continuous batching that dynamically fills in-flight slots to keep GPUs busy.
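
The contrast can be illustrated with a toy scheduler. This is a minimal sketch and not taken from the episode: it assumes each request needs a known, positive number of decode steps, that every in-flight request advances by one token per step, and that the GPU has a fixed number of slots. The function names and the sample workload are hypothetical.

```python
from collections import deque


def static_batching_steps(lengths, slots):
    """Decode steps when each batch must finish before the next one starts.

    Every batch runs for as long as its longest request, so slots that
    finish early sit idle until the whole batch is done.
    """
    steps = 0
    for i in range(0, len(lengths), slots):
        steps += max(lengths[i:i + slots])
    return steps


def continuous_batching_steps(lengths, slots):
    """Decode steps when a freed slot is refilled from the queue immediately."""
    queue = deque(lengths)
    active = []  # remaining decode steps for each in-flight request
    steps = 0
    while queue or active:
        # Admit waiting requests into any free slots before the next step.
        while queue and len(active) < slots:
            active.append(queue.popleft())
        steps += 1
        # Advance every in-flight request by one token; drop finished ones.
        active = [remaining - 1 for remaining in active if remaining > 1]
    return steps


# Hypothetical workload: two long requests mixed with short ones, two slots.
lengths = [8, 2, 2, 2, 8, 2, 2, 2]
print(static_batching_steps(lengths, slots=2))      # 20
print(continuous_batching_steps(lengths, slots=2))  # 14
```

In this workload the static scheduler spends 20 steps because short requests wait on the long one sharing their batch, while the continuous scheduler needs 14, which is the minimum for 28 tokens on two slots. A real serving stack also has to handle prefill, memory limits, and unknown output lengths, none of which this sketch models.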

Play episode from 10:01