Oxide and Friends

When Nine Nines Isn't Enough

Mar 18, 2026
Guests: Robert "RFK" Keith, the Oxide hardware engineer who performed lab surgery and physical debugging, and Nathanael Huffman, the power and quality engineer who analyzed 12V/IBC behavior. They recount chasing elusive hardware resets, capturing 12V droops with scopes, instrumenting hot‑plug controllers, and working with the IBC maker to reprogram its undervoltage logic. The episode covers fast-paced troubleshooting, manufacturing tests, and fleet mitigations.
ADVICE

Instrument To Collect Disconfirming Evidence

  • When a bug is rare and non-reproducible, ask what additional data you wish you had and instrument to collect it.
  • The team added PCIe hot-plug counters and new reports to disconfirm hypotheses rather than rely on speculation.
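As a rough illustration of the idea in this snip, the sketch below counts hot-plug events per drive slot and pulls out the events that would contradict a hypothesis. It is not Oxide's instrumentation: the `HotPlugEvent` record, the sled and slot names, and both helper functions are hypothetical, invented only to show how a counter can be used to look for disconfirming evidence.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class HotPlugEvent:
    """One observed hot-plug event (hypothetical record layout)."""
    sled: str           # hypothetical sled identifier
    slot: int           # hypothetical drive slot index
    timestamp_s: float  # seconds since an arbitrary start


def events_per_slot(events):
    """Count how many hot-plug events each (sled, slot) pair has seen."""
    return Counter((e.sled, e.slot) for e in events)


def disconfirming_events(events, suspect_sled):
    """Return the events that contradict 'only suspect_sled is affected'."""
    return [e for e in events if e.sled != suspect_sled]


# Illustrative data only, not measurements from the episode.
events = [
    HotPlugEvent("sled-a", 0, 10.0),
    HotPlugEvent("sled-a", 0, 42.5),
    HotPlugEvent("sled-b", 3, 42.6),
]
counts = events_per_slot(events)
contradictions = disconfirming_events(events, "sled-a")
```

Here a single event on `sled-b` is enough to rule out the hypothesis that only `sled-a` is affected, which is the point of collecting counters rather than speculating.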
ANECDOTE

MAX5970 Logged Shocking 12V Droops To 8V

  • MAX5970 hot-plug controllers reported 12V minima around 8V during the droop events, which shocked the team.
  • Bryan and Nathanael realized that the drives tolerate input down to roughly 8V, which explains why the CPU stayed up while the drives reset.
INSIGHT

IBC Is Single Point And Drives Are Most Sensitive

  • The 54V-to-12V intermediate bus converter (IBC) is the single point feeding nearly all sled power, so a short 12V droop reaches every downstream device, and each device reacts to it differently.
  • The CPU stayed up because its DC-DC converters tolerate the lower input briefly, while the drives (and MAX5970) were most sensitive.
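The reasoning in this snip can be sketched as a comparison between the lowest voltage seen during a droop and each device's undervoltage tolerance. Everything in the sketch is an assumption made for illustration: the function names are hypothetical, the drive threshold is taken loosely from the "~8V" figure mentioned above, the CPU DC-DC threshold is a made-up placeholder, and the samples are not real scope data. It also ignores droop duration, which matters in practice because the CPU's converters only tolerate the lower input briefly.

```python
def droop_minimum(samples_v):
    """Return the lowest 12V-rail reading in a list of voltage samples."""
    return min(samples_v)


def devices_below_threshold(min_v, thresholds_v):
    """Return the names of devices whose minimum input voltage was violated."""
    return sorted(name for name, limit in thresholds_v.items() if min_v < limit)


# Assumed thresholds in volts (illustrative, not datasheet values).
ASSUMED_THRESHOLDS_V = {
    "drive": 8.0,     # loosely based on the ~8V figure in the snip above
    "cpu_dcdc": 7.0,  # hypothetical placeholder
}

# Illustrative samples of the 12V rail during a short droop.
samples_v = [12.0, 11.9, 7.9, 12.0]

min_v = droop_minimum(samples_v)
affected = devices_below_threshold(min_v, ASSUMED_THRESHOLDS_V)
```

With these assumed numbers the same droop crosses the drive threshold but not the CPU converter threshold, which matches the pattern described in the snip: one shared rail, different outcomes per device.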