Definition
Latency measures the time delay in a system, typically the round-trip time from when a request is sent to when the first byte of the response is received. Network latency is affected by physical distance, network congestion, processing time, and the number of hops between source and destination. Low latency is critical for real-time applications, APIs, and user experience. Latency is usually measured in milliseconds (ms).
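The round-trip measurement described above can be sketched in JavaScript with `performance.now()` (available in browsers and modern Node.js). The simulated operation below stands in for a real request and is purely illustrative:

```javascript
// Sketch: measure the round-trip latency of an async operation in ms.
async function measureLatency(operation) {
  const start = performance.now();
  await operation(); // e.g. a fetch() to your API
  return performance.now() - start; // elapsed milliseconds
}

// Illustrative stand-in for a real request: resolves after ~50ms.
const simulated = () => new Promise((resolve) => setTimeout(resolve, 50));

measureLatency(simulated).then((ms) => console.log(`${ms.toFixed(1)}ms`));
```

In practice you would pass a function that performs a real request; the elapsed time then includes DNS, connection setup, and server processing.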
Examples
Latency Expectations by Use Case
Acceptable latency varies by application type.
| Application Type | Good Latency | Acceptable | Poor |
|--------------------|--------------|------------|---------|
| Web API | < 100ms | < 500ms | > 1s |
| Real-time gaming | < 50ms | < 100ms | > 150ms |
| Video call | < 150ms | < 300ms | > 500ms |
| File download | < 500ms | < 2s | > 5s |
| Database query | < 50ms | < 200ms | > 500ms |

Latency Components
Breaking down total latency into components.
```javascript
// Total latency breakdown (values in ms)
const latency = {
  dnsLookup: 20,        // DNS resolution
  tcpConnect: 30,       // TCP handshake
  tlsHandshake: 50,     // TLS negotiation
  timeToFirstByte: 100, // Server processing
  contentTransfer: 50,  // Data transfer
  // Total: 250ms
};
```

Use Cases
- API performance monitoring
- User experience optimization
- CDN effectiveness measurement
- Database query optimization
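A minimal periodic latency check, of the kind these use cases rely on, might look like the sketch below. The endpoint URL and threshold are hypothetical placeholders, not a WizStatus API:

```javascript
// Sketch: one latency check against an alert threshold.
// TARGET and THRESHOLD_MS are illustrative values.
const TARGET = 'https://example.com/health'; // hypothetical endpoint
const THRESHOLD_MS = 500;

async function checkOnce(url) {
  const start = performance.now();
  await fetch(url, { method: 'HEAD' }); // lightweight probe request
  const ms = performance.now() - start;
  if (ms > THRESHOLD_MS) {
    console.warn(`Latency degraded: ${ms.toFixed(0)}ms > ${THRESHOLD_MS}ms`);
  }
  return ms;
}

// Run every 60 seconds:
// setInterval(() => checkOnce(TARGET).catch(console.error), 60_000);
```

A hosted monitoring service does the same thing from multiple locations, which matters because latency depends heavily on where the probe runs.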
Best Practices
- Monitor from locations near your users
- Track percentiles (p50, p95, p99), not just averages
- Set up alerts for latency degradation
- Use CDNs to reduce geographic latency
- Profile and optimize slow endpoints
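Tracking percentiles rather than averages, as recommended above, can be sketched with a simple nearest-rank calculation (a monitoring library may use a slightly different interpolation method):

```javascript
// Sketch: compute a latency percentile from raw samples (nearest-rank method).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b); // ascending order
  const rank = Math.ceil((p / 100) * sorted.length); // nearest-rank position
  return sorted[Math.max(rank - 1, 0)];
}

// Illustrative latency samples in ms:
const samples = [80, 95, 100, 110, 120, 150, 200, 450, 900, 1200];
console.log(percentile(samples, 50)); // p50 → 120
console.log(percentile(samples, 95)); // p95 → 1200
console.log(percentile(samples, 99)); // p99 → 1200
```

Note how a few slow outliers dominate p95/p99 while barely moving the average; that is exactly why percentiles expose degradation that averages hide.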
Put Latency Knowledge Into Practice
Start monitoring your infrastructure with WizStatus.
No credit card required • 20 free monitors forever