Two's Complement

Measure Twice, Optimize Once

Mar 9, 2026
They dig into CPU-bound performance, contrasting throughput and latency with real-world examples like games and NLP. Benchmarking pitfalls and noisy results get practical fixes. Low-overhead instrumentation, sampling vs tracing profilers, and cache-aware analysis take center stage. Data-structure tradeoffs include a surprising linked-list use case and an order-book illustration.
INSIGHT

Benchmarks Are Noisy So Reduce External Variability

  • Benchmarking is noisy because of cache state, CPU frequency scaling, OS interference, and other external factors.
  • Use dedicated machines, instruction counters (e.g. Valgrind), or CI performance graphs to reduce noise and catch regressions.
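A common way to tame this noise, as a minimal sketch: run the benchmark many times and report the minimum and median rather than a single measurement. The function and workload below are illustrative, not from the episode.

```python
import statistics
import time

def bench(fn, repeats=20):
    """Time fn repeatedly; the min and median are far more stable
    under OS scheduling and frequency-scaling noise than any single
    wall-clock measurement."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return min(samples), statistics.median(samples)

# Illustrative workload; substitute the code under test.
best, typical = bench(lambda: sum(range(100_000)))
```

The minimum approximates the noise-free cost; the gap between min and median is a rough gauge of how noisy the environment is.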
ADVICE

Count Instructions To Avoid Timing Noise

  • When time measurements vary, consider counting CPU instructions instead of wall-clock time.
  • Tools like Valgrind (instruction counts) and cache simulators help create stable CI checks that catch large regressions.
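Because instruction counts from a tool like Valgrind's Callgrind are nearly deterministic, a CI check can compare them against a stored baseline with a tight threshold where wall-clock timings would flap. A hedged sketch (the counts and the 5% tolerance are hypothetical):

```python
def check_regression(baseline, current, tolerance=0.05):
    """Flag a regression when the current instruction count exceeds
    the baseline by more than the tolerance (5% here). Deterministic
    counts make a tight threshold practical in CI."""
    return current > baseline * (1 + tolerance)

# Hypothetical instruction counts for a benchmark under test.
assert not check_regression(1_000_000, 1_030_000)  # 3% over: within budget
assert check_regression(1_000_000, 1_100_000)      # 10% over: flagged
```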
ADVICE

Make Microbenchmarks Representative Not Idealized

  • Ensure microbenchmarks use representative data; warmed caches and trained branch predictors can hide real-world worst cases.
  • Inspect the full distribution (min, percentiles), not just the mean, to understand realistic performance.