The Circuit

EP 162: TPUs Via Cloud Next, Intel Earnings, Foundry Scarcity

Apr 27, 2026
A deep dive into Google’s new TPU v5p/v5i launches and the shift toward disaggregated training and inference silicon. A look at memory choices, HBM scarcity, SRAM inference ideas, and board/network moves to cut latency at scale. A surprising take on Intel’s strong earnings, CPU resurgence, foundry capacity and packaging backlogs. Debate over High‑NA EUV economics and whether the current semiconductor rally can last.
INSIGHT

Memory And Latency Are The Inference Bottlenecks

  • Google emphasized memory and latency innovations at the board and rack level to serve inference at scale.
  • They increased per-chip memory (north of 200 GB, combining HBM3 with on-chip SRAM) and made board and networking changes to reduce cross-chip latency; a rough sketch of why memory dominates follows below.
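Why memory, rather than compute, caps inference can be seen with a roofline-style estimate: during autoregressive decode, each generated token streams roughly the full weight set from HBM, so single-stream token rate is bounded by bandwidth divided by model size. A minimal sketch with purely illustrative numbers (the 1,600 GB/s bandwidth and 70 GB weight figures are assumptions, not from the episode):

```python
# Back-of-the-envelope estimate of memory-bandwidth-bound decode throughput.
# Hypothetical helper; numbers are illustrative, not from the episode.

def decode_tokens_per_sec(hbm_bandwidth_gb_s: float, model_bytes_gb: float) -> float:
    """Upper bound on single-stream decode rate when each generated token
    must stream all model weights from HBM (memory-bound regime)."""
    return hbm_bandwidth_gb_s / model_bytes_gb

# Assumed accelerator: ~1,600 GB/s HBM bandwidth serving a 70B-parameter
# model in 8-bit weights (~70 GB). Compute sits mostly idle at this rate.
print(round(decode_tokens_per_sec(1600.0, 70.0), 1))  # ~22.9 tokens/s
```

Under these assumptions, adding FLOPs does nothing for single-stream latency; only more bandwidth, more on-package memory, or lower cross-chip latency moves the number, which is the logic behind the board-level changes described above.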
ANECDOTE

Google Compares Training To Web Indexing

  • Ben compared training and then serving models to Google's early web-indexing story to justify heavy training CapEx now.
  • The analogy: indexing (training) was costly up front, but monetization came from serving the index (inference) at scale, a payoff Google expects to repeat with Gemini.
INSIGHT

Intel's Quarter Moved It Past Existential Risk

  • Intel's latest quarter beat expectations and shifted the narrative from existential risk to ordinary operational questions like margins and capacity.
  • Strong CPU demand driven by AI (CPUs are needed to 'feed' GPUs), plus improving yields, drove the revenue and guidance beats.