China’s AI Breakthrough, Time Crystals, Hidden Viruses, & Brightest Cosmic Signal (EP. 9)

From First Principles

DeepSeek's Training Tricks: Reinforcement Learning

Krishna explains DeepSeek's use of pure reinforcement learning to let the model autonomously develop reasoning strategies.

Play episode from 18:36