
Data Skeptic: Fairness in PCA-Based Recommenders
Jan 26, 2026
David Liu, an assistant research professor at Cornell who studies fairness in ML and recommender systems, discusses PCA-based recommenders and why they can ignore niche and minority users. He introduces power niche users, explains how PCA over-specializes, and proposes item-weighted PCA and upweighting strategies. The conversation covers tradeoffs, evaluation, scalability, and the need for better datasets.
AI Snips
Meta Internship Revealed Scale Challenges
- David Liu described working on friendship recommendations during his Meta internship, where he learned about the challenges of scale.
- He noted that even simple models become computationally heavy at billion-user scale.
PCA Focuses On Dense Data Regions
- PCA optimizes the overall approximation of the interaction matrix and therefore favors the regions with the most data.
- This causes underrepresentation of niche groups because the global best fit ignores sparse regions.
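
The effect described above can be illustrated with a small synthetic experiment. This is a minimal sketch, not the setup discussed in the episode: the group sizes, the two non-overlapping "taste" patterns, the noise level, and the function names are all assumptions made for illustration. It fits a rank-1 approximation (uncentered PCA via truncated SVD) to a matrix containing a large group and a small group, then compares the relative reconstruction error per group.

```python
import numpy as np

def truncated_svd_reconstruction(X, rank):
    """Best rank-`rank` approximation of X in Frobenius norm (uncentered PCA)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def group_errors(n_major=900, n_minor=100, n_items=40, rank=1, seed=0):
    """Mean relative reconstruction error for a majority and a minority group.

    All sizes and the data-generating process are hypothetical.
    """
    rng = np.random.default_rng(seed)
    half = n_items // 2

    # Assumed taste patterns: the majority likes the first half of the items,
    # the minority likes the second half.
    major_taste = np.zeros(n_items)
    major_taste[:half] = 1.0
    minor_taste = np.zeros(n_items)
    minor_taste[half:] = 1.0

    def sample(n_users, taste):
        intensity = rng.uniform(0.5, 1.5, size=(n_users, 1))
        noise = 0.1 * rng.standard_normal((n_users, n_items))
        return intensity * taste + noise

    X = np.vstack([sample(n_major, major_taste), sample(n_minor, minor_taste)])
    X_hat = truncated_svd_reconstruction(X, rank)

    # Per-user squared error, normalized by the energy of that user's row.
    rel_err = ((X - X_hat) ** 2).sum(axis=1) / (X ** 2).sum(axis=1)
    return rel_err[:n_major].mean(), rel_err[n_major:].mean()

major_err, minor_err = group_errors()
print(f"majority relative error: {major_err:.3f}")
print(f"minority relative error: {minor_err:.3f}")
```

With only one component available, the global best fit spends it on the larger group, so the minority rows are reconstructed far worse. Raising `rank` to 2 in this toy setup lets the second component pick up the minority pattern, which is a reminder that the effect depends on how many components are kept relative to how many distinct groups exist.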
Popular Items Can Be Locked In
- PCA can over-specialize on popular items and essentially memorize existing listeners.
- That specialization prevents discovering new potential fans who haven't yet listened.
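
One way to see this kind of memorization is a toy binary listening matrix with a single very popular item. The sketch below is an assumption-laden illustration rather than the analysis from the episode: the listening probabilities, matrix size, and the helper name `popular_item_demo` are hypothetical, and listens are sampled independently, so it only shows the leading component collapsing onto the popular item, not the full dynamics of a real catalog.

```python
import numpy as np

def popular_item_demo(n_users=2000, n_items=31, p_popular=0.5,
                      p_other=0.05, seed=0):
    """Rank-1 PCA scores for one popular item in a synthetic binary matrix.

    Item 0 is the popular item; every other item is rarely played.
    All probabilities are hypothetical.
    """
    rng = np.random.default_rng(seed)
    probs = np.full(n_items, p_other)
    probs[0] = p_popular

    # Binary user-by-item matrix: 1.0 means the user listened to the item.
    X = (rng.random((n_users, n_items)) < probs).astype(float)

    # Leading right singular vector = first (uncentered) principal direction.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    v = Vt[0]

    # Rank-1 predicted scores for item 0: project each user onto v,
    # then read off the item-0 coordinate of the reconstruction.
    scores = (X @ v) * v[0]

    listened = X[:, 0] == 1
    return abs(v[0]), scores[listened].mean(), scores[~listened].mean()

loading, score_listeners, score_others = popular_item_demo()
print(f"loading of top component on the popular item: {loading:.2f}")
print(f"mean predicted score, existing listeners:     {score_listeners:.2f}")
print(f"mean predicted score, non-listeners:          {score_others:.2f}")
```

In this setup the top component is close to an indicator for the popular item, so the predicted score for that item mostly reproduces who has already listened, and users who have not listened receive scores near zero.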
