
Two's Complement: Measure Twice, Optimize Once
Mar 9, 2026

They dig into CPU-bound performance, contrasting throughput and latency with real-world examples like games and NLP. Benchmarking pitfalls and noisy results get practical fixes. Low-overhead instrumentation, sampling vs. tracing profilers, and cache-aware analysis take center stage. Data-structure tradeoffs include a surprising linked-list use case and an order-book illustration.
Benchmarks Are Noisy, So Reduce External Variability
- Benchmarking is noisy because of cache state, CPU frequency scaling, OS scheduling, and other external factors.
- Use dedicated machines, instruction counters (e.g., Valgrind), or CI performance graphs to reduce noise and catch regressions.
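A CI performance check built on these ideas typically compares the current measurement against a stored baseline with a tolerance. A minimal Python sketch (the function name, metric, and 5% tolerance are illustrative, not from the episode):

```python
def is_regression(current, baseline, tolerance=0.05):
    """Return True when the current metric (e.g., an instruction count)
    exceeds the stored baseline by more than `tolerance`.

    With noisy wall-clock numbers the tolerance must be generous; with
    stable metrics like instruction counts it can be tightened.
    """
    return current > baseline * (1 + tolerance)


# A 3% increase stays under the 5% threshold; a 10% increase trips it.
print(is_regression(103, 100))  # False
print(is_regression(110, 100))  # True
```

Tracking the metric over time in a graph, as the episode suggests, catches slow drifts that a single-threshold check can miss.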
Count Instructions To Avoid Timing Noise
- When time measurements vary, consider counting CPU instructions instead of measuring wall-clock time.
- Tools like Valgrind (instruction counts) and cache simulators help create stable CI checks that catch large regressions.
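The stability argument can be demonstrated in pure Python: counting executed bytecode instructions (via `sys.settrace` opcode events) is a deterministic stand-in for Valgrind-style instruction counting; the same input yields the same count on every run, unlike wall-clock time. A sketch, with illustrative helper names:

```python
import sys


def count_opcodes(fn, *args):
    """Count bytecode instructions executed by fn(*args) -- a
    deterministic proxy for hardware instruction counting."""
    counts = {"n": 0}

    def tracer(frame, event, arg):
        frame.f_trace_opcodes = True  # request per-opcode trace events
        if event == "opcode":
            counts["n"] += 1
        return tracer

    sys.settrace(tracer)
    try:
        fn(*args)
    finally:
        sys.settrace(None)
    return counts["n"]


def work(n):
    total = 0
    for i in range(n):
        total += i
    return total


# Repeated runs on the same input give identical counts,
# and the count scales with the amount of work done.
print(count_opcodes(work, 1000) == count_opcodes(work, 1000))  # True
```

Real instruction counters (Valgrind's Callgrind, `perf stat`) measure at the hardware level with far lower overhead than this interpreter-level trace, but the property exploited is the same: the count depends on the code path, not on cache or scheduler luck.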
Make Microbenchmarks Representative, Not Idealized
- Ensure microbenchmarks use representative data; caches and branch predictors can hide real-world worst cases.
- Inspect distributions (min, percentiles), not just the mean, to understand realistic performance.
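To see why the mean can mislead, here is a small Python sketch of a latency summary using nearest-rank percentiles. The sample values are synthetic, chosen so that a handful of slow outliers distorts the mean:

```python
import math
import statistics


def summarize(samples):
    """Summarize a latency distribution: min, median, p99, and mean."""
    s = sorted(samples)

    def pct(p):
        # Nearest-rank percentile: smallest value covering p% of samples.
        return s[max(0, math.ceil(p / 100 * len(s)) - 1)]

    return {"min": s[0], "p50": pct(50), "p99": pct(99),
            "mean": statistics.fmean(s)}


# 95 fast runs at 1 ms plus 5 slow runs at 50 ms (synthetic numbers).
latencies = [1.0] * 95 + [50.0] * 5
stats = summarize(latencies)
# The mean (3.45 ms) sits nowhere near typical behavior (p50 = 1 ms)
# and badly understates the tail (p99 = 50 ms).
print(stats)
```

Reporting min alongside percentiles also helps separate the code's best-case cost from the noise and tail effects layered on top of it.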
