
Razib Khan's Unsupervised Learning
Daniel Tabin: ancient DNA, the good, bad and ugly
Feb 28, 2026
Daniel Tabin is a Ph.D. student in David Reich's lab who studies the ancient DNA and population history of Central and East Asia. He examines contamination and damage in paleogenomic data. He and Razib Khan discuss validation steps for ancient DNA, sequencing and variant-calling pitfalls, puzzling signals like Population Y, and surprising deep lineages across Eurasia.
AI Snips
How The Red Deer Cave Investigation Began
- Daniel Tabin recounts his long investigation into the Red Deer Cave sample, which began after he read the original paper in 2020.
- He praises the original authors for sharing their FASTQ files and being collaborative, even though he disagrees with their conclusions.
Multiple Layers Needed To Verify Ancient DNA
- Ancient DNA authenticity relies on multiple independent checks: direct dating, damage patterns (C->T substitutions at read ends), and mitochondrial and sex-chromosome assessments.
- Tabin emphasizes that damage is concentrated at read ends and that strand-specific patterns differ by library prep (single-stranded vs. double-stranded).
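The damage check described above can be illustrated with a toy calculation. The sketch below is an assumption-laden illustration, not the tooling discussed in the episode: it takes hypothetical, already-aligned (reference, read) string pairs and reports, for each position counted from the 5' end of the read, the fraction of reference C bases that appear as T in the read. The function name `ct_rate_by_position` and the example sequences are invented for illustration. In authentic ancient DNA one would expect the rate to be highest at the first positions and to fall off toward the interior of the read.

```python
def ct_rate_by_position(pairs, max_pos):
    """Return the C->T mismatch rate at each of the first max_pos read positions.

    pairs: list of (reference, read) strings of equal length, gap-free,
           both oriented so index 0 is the 5' end of the read.
    A position with no reference C in any pair yields None.
    """
    c_total = [0] * max_pos      # reference C bases seen at each position
    c_to_t = [0] * max_pos       # of those, how many the read shows as T
    for ref, read in pairs:
        for i in range(min(len(ref), len(read), max_pos)):
            if ref[i] == "C":
                c_total[i] += 1
                if read[i] == "T":
                    c_to_t[i] += 1
    return [c_to_t[i] / c_total[i] if c_total[i] else None
            for i in range(max_pos)]


# Hypothetical toy alignments: two of three reads show C->T at position 0.
toy_pairs = [
    ("CACGT", "TACGT"),
    ("CCAGT", "TCAGT"),
    ("CACCT", "CACCT"),
]
rates = ct_rate_by_position(toy_pairs, max_pos=5)
for pos, rate in enumerate(rates):
    print(pos, rate)
```

A real analysis would read alignments from a BAM file and would also track the complementary G->A signal at the 3' end, which is where double-stranded libraries show the damage on that side.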
Avoid Majority Calling For Ancient Low Coverage Genomes
- When calling variants from low-coverage or heterogeneous ancient data, avoid naive majority voting and account for biases introduced by differences in sequencing method.
- The Reich lab often samples a single random read per site to keep bias from accumulating across heterozygous sites, despite the tradeoffs.
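The contrast between the two calling strategies above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Reich lab's actual pipeline: the `pileup` dictionary, the site names, and both helper functions are hypothetical, and the fixed seed is only there to make the example repeatable.

```python
import random
from collections import Counter


def majority_call(bases):
    """Naive majority vote: return the most frequent base at a site."""
    return Counter(bases).most_common(1)[0][0]


def random_read_call(bases, rng):
    """Pseudo-haploid call: return the base from one randomly chosen read."""
    return rng.choice(bases)


# Hypothetical pileup: site name -> bases observed in the reads covering it.
pileup = {
    "site1": ["A", "A", "G"],
    "site2": ["C"],
    "site3": ["T", "G", "G", "T", "T"],
}

rng = random.Random(42)  # assumed seed, for repeatability only
pseudo_haploid = {site: random_read_call(bases, rng)
                  for site, bases in pileup.items()}
majority = {site: majority_call(bases) for site, bases in pileup.items()}

for site in pileup:
    print(site, "majority:", majority[site], "random read:", pseudo_haploid[site])
```

At a heterozygous site covered by a handful of reads, the majority vote always reports whichever allele happens to dominate the reads, while the random draw reports each allele in proportion to how often it was sequenced. The cost is that a single read carries no information about heterozygosity, which is one of the tradeoffs mentioned above.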
