Episode 18 - Why AI needs a new kind of supercomputer network

OpenAI Podcast

Failures at scale and their impact on training

The speakers discuss how often links and switches fail in very large networks, and how even a transient glitch can halt a synchronous training job.

Segment starts at 11:00