Reinforcement Learning and Synthetic Data

Tudor explains using RL and verifier feedback to generate synthetic proofs and improve pattern recognition.

Episode notes

Tudor Achim is the co-founder and CEO of Harmonic, a startup working to solve one of AI’s hardest problems: mathematical reasoning. In July 2025, Harmonic achieved gold-medal-level performance on International Math Olympiad problems alongside systems from OpenAI and Google DeepMind, but with a key difference: every proof Harmonic submitted was formally verified. Tudor's path to Harmonic wound through competitive piano, computational biology, and autonomous driving. He studied at Carnegie Mellon's music preparatory school, worked on machine learning at Quora, briefly pursued a PhD before dropping out, and then co-founded an autonomous driving company, Helm.ai. Harmonic's core product, Aristotle, uses reinforcement learning and the programming language Lean 4 to solve problems and verify solutions.

In our conversation, we explore:

Why Tudor believes math is the fundamental toolkit to understand the world
How Harmonic uses hallucinations as a feature, not a bug
How Aristotle works and the applications beyond pure mathematics
The reinforcement learning process that lets Harmonic generate synthetic training data and solve problems humans have never attempted
Why Tudor believes AI could surpass human mathematicians on specific tasks within 2–3 years
Why the future of mathematics looks more like GitHub than academic journals
The alternating pattern between intellect leaps and data leaps throughout scientific history
How studying piano under an extraordinary teacher taught Tudor discipline and the value of sticking with hard problems

—

Thank you to the partners who make this possible

Brex: The intelligent finance platform.

Guru: The AI source of truth for work.

Rippling: Stop wasting time on admin tasks, build your startup faster.

—

Transcript: https://www.generalist.com/p/how-a-20-person-startup-won-gold

—

Timestamps

(00:00) Intro

(03:34) From competitive piano to computer science

(06:28) The mathematical foundations of music (and why Tudor keeps them separate)