
China’s AI Breakthrough, Time Crystals, Hidden Viruses, & Brightest Cosmic Signal (EP. 9)
From First Principles
DeepSeek's Training Tricks: Reinforcement Learning
Krishna explains DeepSeek's use of pure reinforcement learning to let the model autonomously develop reasoning strategies.
Starts at 18:36
Transcript


