
Perplexity AI: Microsoft Reveals Maya 200 AI Inference Chip
Jan 26, 2026
The episode digs into Microsoft's new Maya 200 inference chip and what makes it a purpose-built accelerator. The discussion covers its performance specs at different precisions and why custom silicon matters for cost and control, and compares the cloud providers racing to build their own chips and how that race shifts data center efficiency and long-term strategy.
Maya 200 Targets Inference At Scale
- Microsoft introduced the Maya 200 as a custom AI inference accelerator built for production scale.
- It delivers up to 10 PFLOPS at 4-bit precision and roughly 5 PFLOPS at 8-bit, enough to run large language models efficiently; a rough throughput sketch follows below.
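The precision figures are easier to interpret as serving throughput. Here is a minimal back-of-envelope sketch using the common "roughly 2 × parameters FLOPs per generated token" rule; the utilization and model size are assumptions for illustration, not figures from the episode.

```python
# Back-of-envelope: what "10 PFLOPS at 4-bit" could mean for LLM serving throughput.
# Every number below is an illustrative assumption, not a figure from the episode.

PEAK_FLOPS_4BIT = 10e15   # claimed peak throughput at 4-bit precision
PEAK_FLOPS_8BIT = 5e15    # claimed peak throughput at 8-bit precision
UTILIZATION = 0.4         # assumed fraction of peak reached in real serving
MODEL_PARAMS = 70e9       # hypothetical 70B-parameter dense model

# A dense transformer needs roughly 2 * params FLOPs to generate one token.
flops_per_token = 2 * MODEL_PARAMS

for label, peak in [("4-bit", PEAK_FLOPS_4BIT), ("8-bit", PEAK_FLOPS_8BIT)]:
    tokens_per_sec = peak * UTILIZATION / flops_per_token
    print(f"{label}: ~{tokens_per_sec:,.0f} tokens/s per accelerator")
```

Under these assumed numbers, halving the precision roughly doubles the tokens served per accelerator, which is the core efficiency argument for low-precision inference silicon.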
Maya 200 Builds On Maya 100
- Jaeden notes that the Maya 200 succeeds the Maya 100, which launched in 2023 as Microsoft's first serious in-house AI chip.
- He frames the 200 as a major step forward in both performance and integration with Microsoft's cloud.
Inference Is The Hidden Cost Driver
- Jaeden explains that inference is becoming a dominant cost as millions of users constantly generate outputs.
- Small efficiency gains at the chip level translate into major cost savings at cloud scale, as the rough sketch below illustrates.
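To make the cost argument concrete, here is a minimal sketch of how a modest per-token efficiency gain compounds across a fleet. The volume, price, and gain figures are entirely assumed for illustration and do not come from the episode.

```python
# Back-of-envelope: how a small per-token efficiency gain compounds at cloud scale.
# Every number below is an illustrative assumption, not a figure from the episode.

TOKENS_PER_DAY = 1e12           # hypothetical daily output tokens across a fleet
COST_PER_MILLION_TOKENS = 0.50  # assumed serving cost in dollars
EFFICIENCY_GAIN = 0.10          # assumed 10% cut in cost per token from better silicon

daily_cost = TOKENS_PER_DAY / 1e6 * COST_PER_MILLION_TOKENS
daily_savings = daily_cost * EFFICIENCY_GAIN
yearly_savings = daily_savings * 365

print(f"Baseline: ${daily_cost:,.0f}/day; savings: ${daily_savings:,.0f}/day "
      f"(~${yearly_savings / 1e6:,.1f}M/year)")
```

Even with these modest assumed numbers, a 10% per-token improvement works out to tens of millions of dollars per year, which is why hyperscalers treat custom inference silicon as a strategic investment.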
