
Breaking the Memory Wall in the Age of Inference
The Data Exchange with Ben Lorica
SRAM vs HBM: Trade-offs for Inference
Sid contrasts SRAM and HBM, arguing HBM's cost, energy, and bandwidth limits make SRAM better for low-latency inference.
Transcript


