Mentioned in 2 episodes

Inference Engineering

AI Model Serving Optimization
Book
Inference Engineering is a 300-page technical guide that maps the technologies and techniques powering inference across runtime, infrastructure, and tooling layers.

The book covers model architecture and optimization, GPU hardware specifications, software frameworks and inference engines, production optimization techniques including quantization and speculative decoding, and operational considerations for running AI models at scale.

It serves as a practical resource for engineers, executives, and technical leaders seeking to understand how to deploy and manage generative AI models efficiently.

Mentioned by


Mentioned by Alex and Philip when introducing Philip's new book about running and engineering inference systems end-to-end.
94 snips
📅 ThursdAI - Feb 26 - The Pentagon wants War Claude, every benchmark collapsed, and a solo founder hit $700K ARR with AI agents
Mentioned by Philip Kiely as his recently published technical book about the practice of inference engineering and the inference stack.
87 snips
Inference engineering and the real-world deployment of LLMs, with Philip Kiely
