How to Monitor Docker Container Health

Docker containers can fail silently. A container might be running but not actually serving requests. Here's how to set up proper health monitoring for your Docker infrastructure.

Docker Native Health Checks

Adding HEALTHCHECK to Dockerfile

dockerfile

FROM node:18-alpine

WORKDIR /app
COPY . .
RUN npm install

# Health check configuration
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1

CMD ["npm", "start"]

Health Check Parameters

Parameter	Description	Default
`--interval`	Time between checks	30s
`--timeout`	Max time for check	30s
`--start-period`	Grace period at startup	0s
`--retries`	Failures before unhealthy	3

Checking Container Health Status

bash

# View health status
docker inspect --format='{{.State.Health.Status}}' container_name

# Watch health checks in real-time
docker events --filter event=health_status

# List all containers with health status
docker ps --format "table {{.Names}}\t{{.Status}}"

Health Check Endpoints

Simple HTTP Health Check

javascript

// Express.js
app.get('/health', (req, res) => {
  res.status(200).json({ status: 'healthy' });
});

Comprehensive Health Check

javascript

app.get('/health', async (req, res) => {
  const health = {
    status: 'healthy',
    timestamp: new Date().toISOString(),
    checks: {}
  };

  // Check database
  try {
    await db.query('SELECT 1');
    health.checks.database = 'ok';
  } catch (err) {
    health.checks.database = 'fail';
    health.status = 'unhealthy';
  }

  // Check Redis
  try {
    await redis.ping();
    health.checks.redis = 'ok';
  } catch (err) {
    health.checks.redis = 'fail';
    health.status = 'unhealthy';
  }

  const statusCode = health.status === 'healthy' ? 200 : 503;
  res.status(statusCode).json(health);
});

Docker Compose Health Checks

yaml

version: '3.8'

services:
  web:
    build: .
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    depends_on:
      db:
        condition: service_healthy

  db:
    image: postgres:15
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

Resource Monitoring

Monitor Container Resources

bash

# Real-time stats
docker stats

# Stats for specific containers
docker stats web db redis

# One-time snapshot
docker stats --no-stream

Key Metrics to Track

Metric	Warning Threshold	Critical Threshold
CPU Usage	>70%	>90%
Memory Usage	>80%	>95%
Network I/O	Varies	Sudden spikes
Block I/O	Varies	Sustained high

External Monitoring Setup

Monitor Container Endpoints

Set up HTTP monitoring for your container endpoints:

Health endpoint - https://app.example.com/health
API endpoints - Critical business endpoints
Database proxies - If exposed

Container Restart Alerts

Create a script to detect container restarts:

bash

#!/bin/bash
# container-watch.sh

CONTAINER="my-app"
LAST_STARTED=""

while true; do
  STARTED=$(docker inspect --format='{{.State.StartedAt}}' $CONTAINER)

  if [ "$LAST_STARTED" != "" ] && [ "$STARTED" != "$LAST_STARTED" ]; then
    # Container restarted - send alert
    curl -X POST "https://your-webhook-url" \
      -d "{\"text\": \"Container $CONTAINER restarted\"}"
  fi

  LAST_STARTED=$STARTED
  sleep 60
done

Logging and Alerting

Centralize Container Logs

yaml

# docker-compose.yml
services:
  web:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

Log-Based Alerts

Monitor logs for error patterns:

bash

# Watch for errors in real-time
docker logs -f container_name 2>&1 | grep -i error

Container Monitoring Checklist

HEALTHCHECK added to Dockerfiles
Health endpoints implemented
Docker Compose healthchecks configured
External monitoring set up
Resource alerts configured
Log aggregation enabled
Restart alerts in place
Dependencies health-checked
Alerting channels configured
Dashboard created

Best Practices

Health Check Design

Keep health checks fast (<1s)
Check all critical dependencies
Return appropriate HTTP status codes
Include diagnostic info in response
Don't health check external services

Monitoring Strategy

Monitor at multiple levels (container, service, infrastructure)
Set up appropriate thresholds
Configure escalation policies
Document your monitoring setup

WizStatus can monitor your Docker container health endpoints with 1-minute intervals. Get instant alerts when containers become unhealthy, with detailed response diagnostics.

How to Monitor Docker Container Health

Docker Native Health Checks

Adding HEALTHCHECK to Dockerfile

Health Check Parameters

Checking Container Health Status

Health Check Endpoints

Simple HTTP Health Check

Comprehensive Health Check

Docker Compose Health Checks

Resource Monitoring

Monitor Container Resources

Key Metrics to Track

External Monitoring Setup

Monitor Container Endpoints

Container Restart Alerts

Logging and Alerting

Centralize Container Logs

Log-Based Alerts

Container Monitoring Checklist

Best Practices

Health Check Design

Monitoring Strategy

Related Articles

Alert Fatigue Prevention: Strategies for Effective Monitoring

Chaos Engineering Monitoring: Measure Resilience in Action

CI/CD Pipeline Monitoring: Ensure Fast, Reliable Deployments

Start monitoring your infrastructure today