
Two's Complement: Measure Twice, Optimize Once
Mar 9, 2026

They dig into CPU-bound performance, contrasting throughput and latency with real-world examples like games and NLP. Benchmarking pitfalls and noisy results get practical fixes. Low-overhead instrumentation, sampling vs. tracing profilers, and cache-aware analysis take center stage. Data-structure tradeoffs include a surprising linked-list use case and an order-book illustration.
Benchmarks Are Noisy, So Reduce External Variability
- Benchmarking is noisy because of cache state, CPU frequency scaling, OS scheduling, and other external factors.
- Use dedicated machines, instruction counters (e.g., Valgrind), or CI performance graphs to reduce noise and catch regressions.
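A CI performance check built on these ideas typically compares the current measurement against a stored baseline with a tolerance. A minimal Python sketch (the function name, metric, and 5% tolerance are illustrative, not from the episode):

```python
def is_regression(current, baseline, tolerance=0.05):
    """Return True when the current metric (e.g., an instruction count)
    exceeds the stored baseline by more than `tolerance`.

    With noisy wall-clock numbers the tolerance must be generous; with
    stable metrics like instruction counts it can be tightened.
    """
    return current > baseline * (1 + tolerance)


# A 3% increase stays under the 5% threshold; a 10% increase trips it.
print(is_regression(103, 100))  # False
print(is_regression(110, 100))  # True
```

Tracking the metric over time in a graph, as the episode suggests, catches slow drifts that a single-threshold check can miss.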
Count Instructions To Avoid Timing Noise
- When time measurements vary, consider counting CPU instructions instead of measuring wall-clock time.
- Tools like Valgrind (instruction counts) and cache simulators help create stable CI checks that catch large regressions.
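The stability argument can be demonstrated in pure Python: counting executed bytecode instructions (via `sys.settrace` opcode events) is a deterministic stand-in for Valgrind-style instruction counting; the same input yields the same count on every run, unlike wall-clock time. A sketch, with illustrative helper names:

```python
import sys


def count_opcodes(fn, *args):
    """Count bytecode instructions executed by fn(*args) -- a
    deterministic proxy for hardware instruction counting."""
    counts = {"n": 0}

    def tracer(frame, event, arg):
        frame.f_trace_opcodes = True  # request per-opcode trace events
        if event == "opcode":
            counts["n"] += 1
        return tracer

    sys.settrace(tracer)
    try:
        fn(*args)
    finally:
        sys.settrace(None)
    return counts["n"]


def work(n):
    total = 0
    for i in range(n):
        total += i
    return total


# Repeated runs on the same input give identical counts,
# and the count scales with the amount of work done.
print(count_opcodes(work, 1000) == count_opcodes(work, 1000))  # True
```

Real instruction counters (Valgrind's Callgrind, `perf stat`) measure at the hardware level with far lower overhead than this interpreter-level trace, but the property exploited is the same: the count depends on the code path, not on cache or scheduler luck.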
Make Microbenchmarks Representative, Not Idealized
- Ensure microbenchmarks use representative data; caches and branch predictors can hide real-world worst cases.
- Inspect distributions (min, percentiles), not just the mean, to understand realistic performance.
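To see why the mean can mislead, here is a small Python sketch of a latency summary using nearest-rank percentiles. The sample values are synthetic, chosen so that a handful of slow outliers distorts the mean:

```python
import math
import statistics


def summarize(samples):
    """Summarize a latency distribution: min, median, p99, and mean."""
    s = sorted(samples)

    def pct(p):
        # Nearest-rank percentile: smallest value covering p% of samples.
        return s[max(0, math.ceil(p / 100 * len(s)) - 1)]

    return {"min": s[0], "p50": pct(50), "p99": pct(99),
            "mean": statistics.fmean(s)}


# 95 fast runs at 1 ms plus 5 slow runs at 50 ms (synthetic numbers).
latencies = [1.0] * 95 + [50.0] * 5
stats = summarize(latencies)
# The mean (3.45 ms) sits nowhere near typical behavior (p50 = 1 ms)
# and badly understates the tail (p99 = 50 ms).
print(stats)
```

Reporting min alongside percentiles also helps separate the code's best-case cost from the noise and tail effects layered on top of it.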
