MTTD

Mean Time To Detect

Also known as: Mean Time To Discovery, Detection Time

Mean Time To Detect - the average time to identify that an issue has occurred.

Definition

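Mean Time To Detect (MTTD) measures the average time between when a failure occurs and when it is detected by monitoring systems or users. MTTD matters because you can't fix what you don't know is broken: repair work cannot begin until the failure is detected, so every minute of detection delay adds a minute of downtime. Reducing MTTD through better monitoring therefore shortens total incident duration and improves overall system availability, and it also improves MTTR when MTTR is measured from the moment of failure rather than from detection. Effective synthetic monitoring, alerting, and observability practices are key to minimizing MTTD.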
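As a rough illustration of the definition above, MTTD can be computed by averaging the gap between failure and detection across past incidents. The incident records and the field names `failedAt` and `detectedAt` are hypothetical, chosen only for this sketch.

```javascript
// Hypothetical incident records: field names and timestamps are assumptions for this sketch.
const incidents = [
  { failedAt: '2024-01-10T10:00:00Z', detectedAt: '2024-01-10T10:15:00Z' },
  { failedAt: '2024-01-18T02:30:00Z', detectedAt: '2024-01-18T02:33:00Z' },
  { failedAt: '2024-02-02T16:45:00Z', detectedAt: '2024-02-02T16:51:00Z' },
];

// MTTD = sum of (detection time - failure time) / number of incidents.
// Returns minutes, or null when there are no incidents to average.
function meanTimeToDetectMinutes(records) {
  if (records.length === 0) return null;
  const totalMs = records.reduce(
    (sum, r) => sum + (Date.parse(r.detectedAt) - Date.parse(r.failedAt)),
    0
  );
  return totalMs / records.length / 60000;
}

console.log(meanTimeToDetectMinutes(incidents)); // 8  (15 + 3 + 6 minutes, over 3 incidents)
```

In practice the failure time is often only known after the fact, for example from logs reviewed during a post-mortem, so the inputs to a calculation like this are estimates.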

Examples

MTTD Impact on Total Downtime

How detection time affects total incident duration.

// Total incident time breakdown
const incident = {
  failureTime: '10:00:00',      // When failure occurred
  detectionTime: '10:15:00',    // When detected (MTTD: 15 min)
  diagnosisTime: '10:25:00',    // When root cause found
  recoveryTime: '10:45:00',     // When service restored

  mttd: 15,     // minutes
  mttr: 30,     // minutes (detection to recovery)
  totalDowntime: 45  // minutes
};

// 33% of downtime was detection time!
// Better monitoring could reduce MTTD to < 1 minute
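
The figures in the example can also be derived from the timestamps instead of being entered by hand. The sketch below assumes all times fall on the same day; `toMinutes` and `breakdown` are hypothetical helper names, and the "improved" scenario simply assumes detection within one minute with the same 30-minute repair time.

```javascript
// Convert an 'HH:MM:SS' string to minutes since midnight (same-day times assumed).
function toMinutes(hms) {
  const [h, m, s] = hms.split(':').map(Number);
  return h * 60 + m + s / 60;
}

// Hypothetical helper: derive the durations from the three key timestamps.
// MTTR here follows the example's convention: detection to recovery.
function breakdown({ failureTime, detectionTime, recoveryTime }) {
  const mttd = toMinutes(detectionTime) - toMinutes(failureTime);
  const mttr = toMinutes(recoveryTime) - toMinutes(detectionTime);
  const totalDowntime = mttd + mttr;
  return { mttd, mttr, totalDowntime, detectionShare: mttd / totalDowntime };
}

const observed = breakdown({
  failureTime: '10:00:00',
  detectionTime: '10:15:00',
  recoveryTime: '10:45:00',
});
console.log(observed.mttd, observed.mttr, observed.totalDowntime); // 15 30 45
console.log(Math.round(observed.detectionShare * 100)); // 33 (percent)

// What-if: detection within 1 minute, with the same 30-minute repair time.
const improved = breakdown({
  failureTime: '10:00:00',
  detectionTime: '10:01:00',
  recoveryTime: '10:31:00',
});
console.log(improved.totalDowntime); // 31
```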

Use Cases

  • Evaluating monitoring system effectiveness
  • Optimizing alert thresholds and conditions
  • Justifying monitoring infrastructure investment
  • Incident post-mortem analysis

Best Practices

  • Use synthetic monitoring with frequent checks (every 30-60 seconds)
  • Monitor from multiple locations so that region-specific failures are not missed (false negatives)
  • Set up multi-channel alerting for redundancy
  • Implement anomaly detection for proactive alerts
  • Regularly test monitoring systems
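
To see why check frequency matters for MTTD, the following sketch estimates how long a failure can go unnoticed under a simplified model. The model is an assumption, not a description of any particular monitoring product: checks run at a fixed interval, an alert fires after a configurable number of consecutive failed checks, and alert delivery time is ignored. The function and parameter names are hypothetical.

```javascript
// Simplified model (an assumption of this sketch): checks run at a fixed interval,
// and an alert fires after `confirmations` consecutive failed checks.
function detectionDelaySeconds({ intervalSec, confirmations = 1, timeoutSec = 0 }) {
  return {
    // Failure starts just before a check, and every failing check fails immediately.
    best: (confirmations - 1) * intervalSec,
    // Failure starts just after a check, and the last failing check runs until its timeout.
    worst: confirmations * intervalSec + timeoutSec,
  };
}

console.log(detectionDelaySeconds({ intervalSec: 30, confirmations: 2, timeoutSec: 10 }));
// { best: 30, worst: 70 }
console.log(detectionDelaySeconds({ intervalSec: 300, confirmations: 2, timeoutSec: 10 }));
// { best: 300, worst: 610 }
```

Under these assumptions, moving from 5-minute checks to 30-second checks cuts the worst-case detection delay from about 10 minutes to just over 1 minute, while requiring two consecutive failures still guards against one-off false alarms.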
