#29309
Mentioned in 2 episodes

AI Systems Performance Engineering

Optimizing Model Training and Inference Workloads with GPUs, CUDA, and PyTorch
Book • 2025
AI Systems Performance Engineering equips professionals with actionable strategies to maximize efficiency across every layer of AI infrastructure.

The book provides step-by-step methodologies for tuning CUDA kernels, PyTorch-based algorithms, and multinode training and inference systems, along with techniques for scaling GPU clusters and implementing cutting-edge inference strategies.

It includes a 175+ item performance checklist covering the entire AI system lifecycle, from hardware planning and GPU programming to distributed training and efficient inference serving.

Mentioned by Jon Krohn and described by the author as his comprehensive O'Reilly book distilling GPU/CUDA/PyTorch performance engineering knowledge.
59 snips
973: AI Systems Performance Engineering, with Chris Fregly
Mentioned by Chris Fregly as his new O'Reilly book covering co-design across hardware, CUDA, PyTorch, and algorithms for AI performance.
58 snips
Performance Optimization and Software/Hardware Co-design across PyTorch, CUDA, and NVIDIA GPUs
