NVIDIA's AI Engineers: Agent Inference at Planetary Scale and "Speed of Light" — Nader Khalil (Brev), Kyle Kranen (Dynamo)

Latent Space: The AI Engineer Podcast

Cost, quality and latency tradeoffs

Kyle frames hosting decisions around model choice, call frequency, input length, SLA, and cost-quality-latency tradeoffs.

Play episode from 33:00
