
Latency

Also known as: Response Time, Delay, Lag

The time delay between a request and the beginning of a response.

Definition

Latency measures the time delay in a system, typically the round-trip time from when a request is sent to when the first byte of the response is received. Network latency is affected by physical distance, network congestion, processing time, and the number of hops between source and destination. Low latency is critical for real-time applications, APIs, and user experience. Latency is usually measured in milliseconds (ms).
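As a minimal sketch, request latency can be measured by timestamping just before an operation starts and just after its response arrives. The function name here is illustrative, not from any particular library:

```javascript
// Measure the latency of any async operation, in milliseconds.
// `operation` is assumed to be a function returning a Promise,
// e.g. a fetch() call or a database query.
async function measureLatencyMs(operation) {
  const start = performance.now(); // high-resolution timer
  await operation();               // wait until the response completes
  return performance.now() - start;
}

// Example: time a simulated 100 ms request.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
measureLatencyMs(() => delay(100)).then((ms) =>
  console.log(`latency: ${ms.toFixed(0)}ms`)
);
```

In practice you would pass a real network call; for full request timing in browsers, the Resource Timing API exposes the same breakdown without manual timestamps.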

Examples

Latency Expectations by Use Case

Acceptable latency varies by application type.

| Application Type    | Good Latency | Acceptable | Poor    |
|--------------------|--------------|------------|---------|
| Web API            | < 100ms      | < 500ms    | > 1s    |
| Real-time gaming   | < 50ms       | < 100ms    | > 150ms |
| Video call         | < 150ms      | < 300ms    | > 500ms |
| File download      | < 500ms      | < 2s       | > 5s    |
| Database query     | < 50ms       | < 200ms    | > 500ms |
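The thresholds above can be encoded as a small lookup for alerting. The threshold values come from the table; the function name and key names are illustrative, and the gap between "acceptable" and "poor" in the table is treated as poor here:

```javascript
// Latency thresholds (ms) per application type, from the table above.
const thresholds = {
  "web-api":  { good: 100, acceptable: 500 },
  "gaming":   { good: 50,  acceptable: 100 },
  "video":    { good: 150, acceptable: 300 },
  "download": { good: 500, acceptable: 2000 },
  "database": { good: 50,  acceptable: 200 },
};

// Classify a measured latency for a given application type.
function rateLatency(type, ms) {
  const t = thresholds[type];
  if (!t) throw new Error(`unknown application type: ${type}`);
  if (ms < t.good) return "good";
  if (ms < t.acceptable) return "acceptable";
  return "poor";
}

console.log(rateLatency("web-api", 80));  // → "good"
console.log(rateLatency("video", 200));   // → "acceptable"
console.log(rateLatency("gaming", 200));  // → "poor"
```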

Latency Components

Breaking down total latency into components.

```javascript
// Total request latency broken down into components (ms).
const latency = {
  dnsLookup: 20,        // DNS resolution
  tcpConnect: 30,       // TCP handshake
  tlsHandshake: 50,     // TLS negotiation
  timeToFirstByte: 100, // server processing
  contentTransfer: 50,  // data transfer
};

const total = Object.values(latency).reduce((sum, ms) => sum + ms, 0);
console.log(`Total: ${total}ms`); // Total: 250ms
```

Use Cases

API performance monitoring
User experience optimization
CDN effectiveness measurement
Database query optimization

Best Practices

  • Monitor from locations near your users
  • Track percentiles (p50, p95, p99), not just averages
  • Set up alerts for latency degradation
  • Use CDNs to reduce geographic latency
  • Profile and optimize slow endpoints
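For the percentile recommendation above, a minimal nearest-rank sketch (not tied to any particular monitoring library):

```javascript
// Nearest-rank percentile of latency samples (ms):
// the p-th percentile is the value at 1-based rank ceil(p/100 * n).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b); // numeric sort
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

// 100 samples: 1ms .. 100ms
const samples = Array.from({ length: 100 }, (_, i) => i + 1);
console.log(percentile(samples, 50)); // → 50
console.log(percentile(samples, 95)); // → 95
console.log(percentile(samples, 99)); // → 99
```

A single slow outlier barely moves the average but shows up immediately in p99, which is why percentiles are the better alerting signal.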
