ThursdAI - The top AI news from the past week

📆 Open source just pulled up to Opus 4.6 — at 1/20th the price

Feb 13, 2026
Olive Song, a senior reinforcement learning researcher at MiniMax who led the M2.5 release and the Forge RL framework, joins to explain MiniMax M2.5's 80.2% SWE-bench Verified result and its compact 10B active-parameter design. The conversation covers RL training strategies, speed and efficiency optimizations, open-source tradeoffs, and how agentic workflows speed up iteration. Quick, technical, and forward-looking.
ADVICE

Build SOPs For Agentic Workflows

  • Design models and prompts for agentic workflows by defining processes once as repeatable SOPs.
  • Use memory and tool calls so agents can perform multi-step tasks and learn user habits over time.
INSIGHT

Small Model, Big Results With RL

  • MiniMax M2.5 achieves 80.2% on SWE-bench Verified with a 10B active-parameter model trained with heavy RL.
  • Olive attributes the leap to Forge, their decoupled RL framework, which focuses on end-to-end task efficiency.
INSIGHT

Optimize For End-To-End Task Cost

  • MiniMax optimizes not just for correctness but also for the end-to-end time and tool usage of each task, to reduce cost.
  • Olive emphasized training on diverse agent environments so that samples don't interfere with one another and training scales smoothly.
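
To make the idea of rewarding end-to-end efficiency concrete, here is a minimal sketch of a reward that pays for a solved task and then subtracts penalties for wall-clock time and tool calls. This is not MiniMax's or Forge's actual reward: the `Episode` fields, the `shaped_reward` function, and every weight and budget are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """One agent rollout; all fields are hypothetical for illustration."""
    solved: bool          # did the task pass its checks?
    wall_seconds: float   # end-to-end time for the whole task
    tool_calls: int       # number of tool invocations used


def shaped_reward(ep: Episode,
                  time_budget: float = 600.0,
                  call_budget: int = 50,
                  time_weight: float = 0.2,
                  call_weight: float = 0.1) -> float:
    """Correctness reward minus penalties for time and tool usage.

    Budgets and weights are made-up numbers, not values from the episode.
    """
    if not ep.solved:
        # An unsolved task earns nothing, however fast or cheap it was.
        return 0.0
    # Each penalty is capped so a solved task always beats an unsolved one.
    time_penalty = time_weight * min(ep.wall_seconds / time_budget, 1.0)
    call_penalty = call_weight * min(ep.tool_calls / call_budget, 1.0)
    return 1.0 - time_penalty - call_penalty


fast = Episode(solved=True, wall_seconds=120.0, tool_calls=10)
slow = Episode(solved=True, wall_seconds=540.0, tool_calls=45)
failed = Episode(solved=False, wall_seconds=60.0, tool_calls=5)

for name, ep in [("fast", fast), ("slow", slow), ("failed", failed)]:
    print(name, round(shaped_reward(ep), 3))
```

Under these assumed weights, two rollouts that both solve the task are ranked by how much time and how many tool calls they used, which is one simple way correctness and cost could be combined into a single training signal.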