
Software Engineering Radio - the podcast for professional software developers
SE Radio 695: Dave Thomas on Building eBooks Infrastructure
Nov 19, 2025
Dave Thomas, co-founder of the Pragmatic Bookshelf and author of The Pragmatic Programmer, talks about building eBook infrastructure and tooling. He covers what eBooks and EPUBs are, the shift from MOBI and PDFs to reflowable formats, why plain-text and Markdown won out, pipeline architecture and preprocessing, conversion to EPUB/PDF, CI builds, handling code and cross-references, and the role of humans plus AI in publishing.
AI Snips
Pragmatic Programmer Started From Plain Text HTML
- Dave and Andy built The Pragmatic Programmer from plain-text HTML notes and convinced Addison-Wesley to accept camera-ready PostScript/PDF instead of Word files.
- They avoided Word/FrameMaker workflows and produced master PDFs from LaTeX so authors could update and resend camera-ready output.
Single Semantic Source Powers Multiple Outputs
- Using a plain-text semantic markup pipeline gives authors control and preserves a single source that can generate PDF, EPUB, and other outputs.
- Dave describes their move from Troff to TeX/LaTeX and later XML/Markdown to keep semantic structure and reuse tooling for multiple formats.
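As a rough illustration of the single-source idea, the sketch below parses a toy plain-text markup into semantic blocks and renders the same blocks to two targets. Everything here is an assumption made for illustration: the toy markup, the `Block` type, and the `parse`, `to_xhtml`, and `to_latex` functions are hypothetical and are not the Pragmatic Bookshelf's actual PML tooling.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Block:
    kind: str  # semantic role: "title", "para", or "code"
    text: str


def parse(source: str) -> list[Block]:
    """Parse a toy markup (hypothetical, not PML): blank lines separate blocks,
    '# ' marks a title, a four-space indent marks code, anything else is a paragraph."""
    blocks = []
    for chunk in source.strip("\n").split("\n\n"):
        if chunk.startswith("# "):
            blocks.append(Block("title", chunk[2:].strip()))
        elif chunk.startswith("    "):
            code = "\n".join(line[4:] for line in chunk.split("\n"))
            blocks.append(Block("code", code))
        else:
            blocks.append(Block("para", " ".join(chunk.split())))
    return blocks


def to_xhtml(blocks: list[Block]) -> str:
    """Render blocks as XHTML fragments, the kind of content an EPUB chapter holds."""
    tags = {"title": "h1", "para": "p", "code": "pre"}
    return "\n".join(
        f"<{tags[b.kind]}>{escape(b.text)}</{tags[b.kind]}>" for b in blocks
    )


def to_latex(blocks: list[Block]) -> str:
    """Render the same blocks as LaTeX for a PDF build.
    Escaping of LaTeX special characters is omitted in this sketch."""
    out = []
    for b in blocks:
        if b.kind == "title":
            out.append(f"\\chapter{{{b.text}}}")
        elif b.kind == "code":
            out.append("\\begin{verbatim}\n" + b.text + "\n\\end{verbatim}")
        else:
            out.append(b.text)
    return "\n\n".join(out)


SOURCE = "# Hello\n\nSome <text> here.\n\n    print('hi')\n"
blocks = parse(SOURCE)
print(to_xhtml(blocks))
print(to_latex(blocks))
```

The source text never mentions fonts, margins, or tags. Each renderer decides how a "title" or "code" block looks in its own format, so adding another output means adding another renderer rather than touching the manuscript.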
Write In Semantic Markup Not WYSIWYG
- Avoid WYSIWYG for book authoring; use semantic plain-text markup so authors focus on content, not layout.
- Dave says authors write in PML/Markdown while the publisher handles final layout, preventing distractions over margins and fonts.