The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Learning Active Learning with Ksenia Konyushkova - TWiML Talk #116

Mar 5, 2018
Ksenia Konyushkova, a PhD student at EPFL, focuses on active learning and annotation efficiency. She discusses training models to choose which unlabeled examples most improve learning: simulations on synthetic and real data, strategies for hard checkerboard-like problems, smarter bounding-box and segmentation workflows, and ways to speed up human annotation with adaptive dialogs and batching.
INSIGHT

Learned Active Learning Outperforms Fixed Heuristics

  • Active learning can be learned from data by training a regressor to predict which unlabeled point will most reduce future error.
  • Ksenia simulates many synthetic 2D datasets, records classifier state and point features, then trains a model to pick points that maximize error reduction.
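A minimal sketch of that idea, assuming scikit-learn; the task, the two state features (classifier confidence and a mean-distance density proxy), and all names here are illustrative, not the episode's exact setup. On simulated 2D problems it records, for each candidate point, the classifier-state features and the test-error reduction actually obtained by labeling that point, then fits a regressor on those records:

```python
# Hypothetical sketch of learning an active-learning selector:
# collect (state features -> observed error reduction) pairs from
# simulated episodes, then train a regressor to predict the gain.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)

def make_task():
    # simple synthetic 2D binary problem (illustrative stand-in)
    X = rng.normal(size=(300, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    return X, y

records, targets = [], []
for _ in range(5):                      # a few simulated episodes
    X, y = make_task()
    pool, test = np.arange(0, 200), np.arange(200, 300)
    labeled = list(rng.choice(pool, size=10, replace=False))
    clf = RandomForestClassifier(n_estimators=20, random_state=0)
    clf.fit(X[labeled], y[labeled])
    base_err = 1 - clf.score(X[test], y[test])
    candidates = rng.choice([p for p in pool if p not in labeled],
                            size=20, replace=False)
    for i in candidates:
        conf = clf.predict_proba(X[i:i + 1]).max()          # classifier confidence
        dens = np.linalg.norm(X[pool] - X[i], axis=1).mean()  # inverse-density proxy
        # retrain with the candidate added and measure the error reduction
        clf2 = RandomForestClassifier(n_estimators=20, random_state=0)
        clf2.fit(X[labeled + [i]], y[labeled + [i]])
        gain = base_err - (1 - clf2.score(X[test], y[test]))
        records.append([conf, dens])
        targets.append(gain)

# the learned selector: predicts error reduction from state features,
# so at query time you pick the unlabeled point with the highest prediction
lal = RandomForestRegressor(n_estimators=50, random_state=0).fit(records, targets)
```

At query time the regressor scores every unlabeled point and the highest-predicted-gain point is sent for labeling.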
INSIGHT

Distance To Boundary Is Not Always The Best Criterion

  • The learned selector uses many features beyond distance-to-boundary; classifier certainty and point density matter more.
  • Ksenia found distance-to-boundary ranked fourth while classifier confidence was the top feature in her regressor analysis.
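One way to read off such a ranking, sketched on synthetic data (the feature names and the planted effect sizes are hypothetical, chosen only to mirror the finding that confidence ranks first and distance-to-boundary near last):

```python
# Hypothetical sketch: rank candidate-selection features by importance
# in a fitted regressor via feature_importances_.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
names = ["confidence", "density", "label_balance", "dist_to_boundary"]
F = rng.normal(size=(500, 4))
# synthetic target: confidence dominates, distance-to-boundary contributes least
gain = (2.0 * F[:, 0] + 0.8 * F[:, 1] + 0.5 * F[:, 2] + 0.1 * F[:, 3]
        + 0.05 * rng.normal(size=500))

reg = RandomForestRegressor(n_estimators=100, random_state=0).fit(F, gain)
ranking = sorted(zip(reg.feature_importances_, names), reverse=True)
for imp, name in ranking:
    print(f"{name}: {imp:.2f}")
```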
INSIGHT

Adaptive Exploration Then Exploitation Beats Uncertainty Sampling

  • A learned strategy adapts: when the classifier is very uncertain it explores randomly; later it refines the boundary.
  • On XOR/checkerboard problems uncertainty sampling can perform worse than random sampling, but the learned policy discovers exploration-first behavior.
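An illustrative comparison of the two fixed baselines (not the episode's exact experiment; the data, model, and budget are assumptions): uncertainty sampling queries points near the current decision boundary, which on an XOR-style problem can neglect quadrants the classifier has never seen, while random sampling keeps exploring:

```python
# Illustrative sketch: uncertainty sampling vs. random sampling on an
# XOR-labeled 2D problem.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(400, 2))
y = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(int)   # XOR / checkerboard labels
train, test = np.arange(300), np.arange(300, 400)

def run(strategy, budget=40):
    labeled = list(rng.choice(train, size=5, replace=False))
    pool = [i for i in train if i not in labeled]
    for _ in range(budget):
        clf = RandomForestClassifier(n_estimators=30, random_state=0)
        clf.fit(X[labeled], y[labeled])
        if strategy == "uncertainty":
            # most uncertain = lowest top-class probability
            unc = 1 - clf.predict_proba(X[pool]).max(axis=1)
            pick = pool[int(np.argmax(unc))]
        else:
            pick = pool[int(rng.integers(len(pool)))]
        labeled.append(pick)
        pool.remove(pick)
    final = RandomForestClassifier(n_estimators=30, random_state=0)
    final.fit(X[labeled], y[labeled])
    return final.score(X[test], y[test])

acc_unc = run("uncertainty")
acc_rand = run("random")
print("uncertainty:", acc_unc)
print("random:     ", acc_rand)
```

A learned policy can interpolate between the two: explore while the classifier is globally uncertain, then switch to boundary refinement.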