Science Quickly

Can AI do math, or does it just act like a calculator?

Mar 25, 2026
Joe Howlett, a Science and Technology reporter covering mathematics, discusses whether AI can handle real research-level proofs. He outlines a community challenge that tests models on unpublished lemmas. The conversation compares early results, examines how AI proofs differ from human proofs, and considers what future rounds might reveal about AI’s role in advancing math.
INSIGHT

Research Math Is About Proofs Not Answers

  • Research math focuses on proving statements about abstract objects, not numeric homework-style answers.
  • Joe Howlett contrasts homework problems with research math, where the work is proving theorems about high-dimensional shapes and their properties.
INSIGHT

Contest Wins Don’t Equal Research Ability

  • Benchmarks such as an IMO gold medal show a model's skill on contest-style problems but do not demonstrate that it can do original research.
  • Howlett notes contest wins resemble homework tasks, not the open-ended creation and proof of new mathematical concepts.
ANECDOTE

Researchers Used Unpublished Lemmas As Tests

  • To avoid training-data leakage, eleven mathematicians extracted unpublished lemmas from their own upcoming papers and posed them to LLMs.
  • Each lemma came from a real research proof and was withheld from online posting prior to the challenge.