
Elixir Wizards: Telemetry & Observability for Elixir Apps at Cars.com with Zack Kayser & Ethan Gunderson
Dec 12, 2024
Ethan Gunderson, Principal Software Engineer focused on performance and observability, and Zack Kayser, Senior Software Engineer experienced in running large-scale Elixir systems, discuss telemetry and observability at Cars.com. They cover scaling Elixir for high traffic, tracing and spans, LiveView WebSocket challenges, sampling and storage trade-offs, OpenTelemetry adoption, and practical steps for getting telemetry working in apps.
Real Traffic Numbers At Cars.com
- Cars.com routinely handles hundreds of millions of requests per day across HTTP and WebSocket traffic, and has hit billions.
- Zack reports 100,000 to 200,000 open WebSocket connections on quiet mornings, illustrating the challenges of running LiveView at scale.
Better To Have Too Much Telemetry Than Too Little
- When incidents occur, favor too much telemetry over too little because missing data blocks root-cause analysis.
- Zack explains they'd rather have massive traces to dig through than no traces at all during an outage.
Sample Traces But Preserve Rare Feature Signals
- Sample high-volume traces, but be aware that naive sampling can drop visibility into rarely used features.
- Ethan says Cars.com samples about 1% and warns that random sampling can filter out low-usage features you still need to inspect.



