Optimizing Latency in Microservices
Peter Lawrey’s talk breaks down how to engineer microservices that consistently respond in microseconds rather than milliseconds.
Key Takeaways
- Prefer mechanical sympathy: align data structures with CPU cache lines and avoid false sharing.
- Use binary protocols and preallocated buffers to eliminate GC pressure during hot paths.
- Treat the network like a critical dependency: pin threads to cores, batch syscalls, and monitor queue depths.
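As an illustration of the binary-protocol and preallocated-buffer takeaway, here is a minimal sketch in Java. It is not taken from the talk: the `QuoteCodec` class, its field layout (id, price, quantity), and the little-endian byte order are all assumptions chosen for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical fixed-layout message: 8-byte id, 8-byte price, 4-byte quantity. */
final class QuoteCodec {
    static final int MESSAGE_SIZE = Long.BYTES + Double.BYTES + Integer.BYTES; // 20 bytes

    // Allocated once, off-heap, and reused for every message on the hot path.
    private final ByteBuffer buffer =
            ByteBuffer.allocateDirect(MESSAGE_SIZE).order(ByteOrder.LITTLE_ENDIAN);

    /** Writes the fields into the reused buffer; no objects are allocated per call. */
    ByteBuffer encode(long id, double price, int quantity) {
        buffer.clear();
        buffer.putLong(id).putDouble(price).putInt(quantity);
        buffer.flip(); // make the written bytes readable
        return buffer;
    }

    // Absolute reads at fixed offsets: decoding needs no intermediate objects either.
    static long readId(ByteBuffer in) {
        return in.getLong(in.position());
    }

    static double readPrice(ByteBuffer in) {
        return in.getDouble(in.position() + Long.BYTES);
    }

    static int readQuantity(ByteBuffer in) {
        return in.getInt(in.position() + Long.BYTES + Double.BYTES);
    }
}
```

Because `encode` returns the same buffer every time, the caller has to finish sending or copying the bytes before encoding the next message. That constraint is the price of keeping the hot path allocation-free.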
Recommended Actions
- Benchmark your service with payload sizes and concurrency levels that match production; synthetic tests often mislead.
- Instrument the 99.9th percentile latency and trace hop-by-hop to locate jitter.
- Document predictable fallback behaviour (timeouts, retry budgets) so partial failures do not cascade.
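To make the percentile recommendation concrete, here is a minimal sketch of recording latencies and reading back a high percentile. The `LatencyRecorder` class and its nearest-rank calculation are assumptions made for illustration, not something the talk prescribes; production services often use a dedicated histogram library instead.

```java
import java.util.Arrays;

/** Hypothetical recorder: fixed-capacity sample store with nearest-rank percentiles. */
final class LatencyRecorder {
    private final long[] samplesNanos; // preallocated so recording does not allocate
    private int count;

    LatencyRecorder(int capacity) {
        samplesNanos = new long[capacity];
    }

    /** Records one sample; samples beyond the capacity are dropped. */
    void record(long latencyNanos) {
        if (count < samplesNanos.length) {
            samplesNanos[count++] = latencyNanos;
        }
    }

    /** Nearest-rank percentile. Call this off the hot path: it copies and sorts. */
    long percentile(double pct) {
        if (count == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        long[] sorted = Arrays.copyOf(samplesNanos, count);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct * count / 100.0); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

A typical use would wrap each request with `System.nanoTime()` calls, pass the difference to `record`, and report `percentile(99.9)` periodically. Keeping one recorder per hop lets you compare the tails hop by hop and see where the jitter enters.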