
Data Renegades Ep. #2, Data Journalism Unleashed with Simon Willison
Nov 25, 2025
Simon Willison, a prominent open-source software developer and data journalism advocate, shares his journey from co-creating Django to building Datasette. He discusses the evolution of data journalism and highlights impactful projects like the Washington Post's opioid investigation. The conversation explores how open-source tools can empower newsrooms, the potential of AI in automating data cleaning, and innovative uses of Datasette for diverse purposes. Simon also predicts a future where AI reshapes data workflows and enriches team capabilities.
AI Snips
Make Data Cleaning Fast And Auditable
- Focus on making data cleaning faster and verifiable instead of accepting long manual cleanups.
- Build tools and workflows that let reporters iterate quickly under tight deadlines.
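One way to make a cleanup auditable is to record what each step changed. The following is a minimal sketch, assuming rows are plain dictionaries; `run_cleaning_steps`, the two step functions, and the sample rows are hypothetical names and data chosen for illustration, not something described in the episode.

```python
def strip_whitespace(row):
    """Trim leading and trailing whitespace from every string value."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


def normalize_state(row):
    """Upper-case the (hypothetical) 'state' field when it is a string."""
    state = row.get("state")
    if isinstance(state, str):
        row = {**row, "state": state.upper()}
    return row


def run_cleaning_steps(rows, steps):
    """Apply each step in order and log how many rows it changed."""
    audit_log = []
    for step in steps:
        cleaned = [step(row) for row in rows]
        changed = sum(1 for before, after in zip(rows, cleaned) if before != after)
        audit_log.append({"step": step.__name__, "rows_changed": changed})
        rows = cleaned
    return rows, audit_log


# Illustrative data only.
rows = [{"name": " Ada ", "state": "ca"}, {"name": "Bob", "state": "NY"}]
cleaned, audit_log = run_cleaning_steps(rows, [strip_whitespace, normalize_state])
```

Because each step is a small named function, the audit log doubles as a record a reporter or editor could review before publication.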
Treat Data Models Like APIs
- Data documentation breakdowns cause costly mistakes because downstream users treat data like an undocumented API.
- Treat data models as APIs and document schema and changes for all downstream consumers.
Keep Docs In Repo And Enforce Updates
- Keep documentation close to the code and, during code review, block changes that leave the docs out of date.
- This makes documentation trustworthy because contributors must update docs when they change behavior.
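A check like this can be automated so that an undocumented schema change fails before review. The sketch below assumes the docs live as text in the same repository and mention each column name in backticks; the `shipments` table, its columns, and the `undocumented_columns` helper are hypothetical, and this is one possible approach rather than the workflow described in the episode.

```python
import sqlite3


def undocumented_columns(conn, table, docs_text):
    """Return columns of `table` that docs_text never mentions in backticks."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    return [col for col in columns if f"`{col}`" not in docs_text]


# Illustrative schema and docs only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shipments (id INTEGER, county TEXT, pill_count INTEGER)")
docs = "Columns: `id` is the row key, `county` is the county name."

missing = undocumented_columns(conn, "shipments", docs)
```

Run inside a test suite, an assertion that `missing` is empty would fail for the example above until the docs describe `pill_count`.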






