Docker containers can fail silently. A container might be running but not actually serving requests. Here's how to set up proper health monitoring for your Docker infrastructure.
Docker Native Health Checks
Adding HEALTHCHECK to Dockerfile
FROM node:18-alpine
WORKDIR /app
COPY . .
RUN npm install
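# Health check configuration
# node:18-alpine does not ship curl, so probe with BusyBox wget
# (or install curl first: RUN apk add --no-cache curl)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD wget -q --spider http://localhost:3000/health || exit 1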
CMD ["npm", "start"]
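A probe does not have to depend on a command-line HTTP client being present in the image (`node:18-alpine` does not include `curl`, for instance). One alternative is to let Node make the request. The following is a minimal sketch, not a definitive implementation: the file name `healthcheck.js`, the helper names, and the 5-second default timeout are assumptions made for illustration.

```javascript
// healthcheck.js (hypothetical file name)
const http = require('http');

// Map an HTTP status code to a HEALTHCHECK exit code: 0 = healthy, 1 = unhealthy.
function exitCodeForStatus(statusCode) {
  return statusCode >= 200 && statusCode < 300 ? 0 : 1;
}

// Request the URL and resolve with an exit code; this promise never rejects.
function probe(url, timeoutMs = 5000) {
  return new Promise((resolve) => {
    const req = http.get(url, { timeout: timeoutMs }, (res) => {
      res.resume(); // discard the body so the socket is released
      resolve(exitCodeForStatus(res.statusCode));
    });
    // A socket timeout only emits an event; destroy the request to surface it as an error.
    req.on('timeout', () => req.destroy(new Error('timeout')));
    req.on('error', () => resolve(1));
  });
}

// Only probe when a URL is passed as the first argument.
const url = process.argv[2];
if (url) {
  probe(url).then((code) => process.exit(code));
}
```

With this file copied into the image, the Dockerfile instruction could become `HEALTHCHECK CMD ["node", "healthcheck.js", "http://localhost:3000/health"]`, keeping the same `--interval`, `--timeout`, `--start-period`, and `--retries` flags as before.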
Health Check Parameters
| Parameter | Description | Default |
|---|---|---|
| --interval | Time between checks | 30s |
| --timeout | Max time for check | 30s |
| --start-period | Grace period at startup | 0s |
| --retries | Failures before unhealthy | 3 |
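These parameters together determine how long a failure can go unnoticed. As a rough worked example, the helper below estimates an upper bound on the delay between a service breaking and Docker marking the container `unhealthy`. It assumes every failing check runs for the full timeout and ignores the start period; the function name is hypothetical.

```javascript
// Estimate the worst-case delay (in seconds) before a container is flagged unhealthy.
// Arguments mirror the HEALTHCHECK flags; defaults are Docker's documented defaults.
function worstCaseDetectionSeconds({ interval = 30, timeout = 30, retries = 3 } = {}) {
  // Each failing attempt can take up to `timeout`, and the next attempt
  // starts `interval` after the previous one finishes.
  return retries * (interval + timeout);
}

// Values from the Dockerfile example: 3 * (30 + 10)
console.log(worstCaseDetectionSeconds({ interval: 30, timeout: 10, retries: 3 })); // 120

// Docker defaults: 3 * (30 + 30)
console.log(worstCaseDetectionSeconds()); // 180
```

If two minutes is too slow for your service, shortening the interval usually costs less than lowering the retry count, which makes the check more sensitive to one-off blips.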
Checking Container Health Status
# View health status
docker inspect --format='{{.State.Health.Status}}' container_name
# Watch health checks in real-time
docker events --filter event=health_status
# List all containers with health status
docker ps --format "table {{.Names}}\t{{.Status}}"
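Beyond the one-word status, `docker inspect` also records the failing streak and the output of recent probes under `.State.Health`. The sketch below shows one way to condense that JSON for a script or dashboard. The function name is an assumption, and the sample is a hand-written, heavily trimmed stand-in for real `docker inspect` output.

```javascript
// Summarize the health section of `docker inspect <container>` output.
function summarizeHealth(inspectJson) {
  const [container] = JSON.parse(inspectJson); // docker inspect prints a JSON array
  const health = container.State && container.State.Health;
  if (!health) {
    // No HEALTHCHECK is configured for this container.
    return { status: 'none', failingStreak: 0, lastOutput: '' };
  }
  const log = health.Log || [];
  const last = log[log.length - 1];
  return {
    status: health.Status,
    failingStreak: health.FailingStreak,
    lastOutput: last ? last.Output.trim() : '',
  };
}

// Hypothetical sample, trimmed to the fields used above.
const sample = JSON.stringify([
  {
    State: {
      Health: {
        Status: 'unhealthy',
        FailingStreak: 3,
        Log: [{ ExitCode: 1, Output: 'connection refused\n' }],
      },
    },
  },
]);

console.log(summarizeHealth(sample));
// { status: 'unhealthy', failingStreak: 3, lastOutput: 'connection refused' }
```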
Health Check Endpoints
Simple HTTP Health Check
// Express.js
app.get('/health', (req, res) => {
  res.status(200).json({ status: 'healthy' });
});
Comprehensive Health Check
app.get('/health', async (req, res) => {
  const health = {
    status: 'healthy',
    timestamp: new Date().toISOString(),
    checks: {}
  };

  // Check database
  try {
    await db.query('SELECT 1');
    health.checks.database = 'ok';
  } catch (err) {
    health.checks.database = 'fail';
    health.status = 'unhealthy';
  }

  // Check Redis
  try {
    await redis.ping();
    health.checks.redis = 'ok';
  } catch (err) {
    health.checks.redis = 'fail';
    health.status = 'unhealthy';
  }

  const statusCode = health.status === 'healthy' ? 200 : 503;
  res.status(statusCode).json(health);
});
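A dependency that hangs rather than failing outright can stall the handler above until Docker's own timeout fires. One possible refinement is to run the checks in parallel and give each a time budget. The helpers below are a sketch under that assumption; `withTimeout`, `runChecks`, and the 500 ms default are illustrative names and values, not part of Express or Docker.

```javascript
// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Run every check in parallel; `checks` maps a name to an async function.
async function runChecks(checks, timeoutMs = 500) {
  const names = Object.keys(checks);
  const results = await Promise.allSettled(
    // Promise.resolve().then(...) turns a synchronous throw into a rejection.
    names.map((name) => withTimeout(Promise.resolve().then(checks[name]), timeoutMs))
  );
  const report = { status: 'healthy', timestamp: new Date().toISOString(), checks: {} };
  results.forEach((result, i) => {
    const ok = result.status === 'fulfilled';
    report.checks[names[i]] = ok ? 'ok' : 'fail';
    if (!ok) report.status = 'unhealthy';
  });
  return report;
}

// Example with stand-in checks: one fast, one slower than the 50 ms budget.
runChecks(
  {
    database: async () => 'ok',
    slow: () => new Promise((resolve) => setTimeout(resolve, 200)),
  },
  50
).then((report) => console.log(report.status, report.checks));
// unhealthy { database: 'ok', slow: 'fail' }
```

Inside the route, the checks would be the real calls, for example `runChecks({ database: () => db.query('SELECT 1'), redis: () => redis.ping() })`, followed by the same 200/503 response logic.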
Docker Compose Health Checks
version: '3.8'
services:
  web:
    build: .
    healthcheck:
      # BusyBox wget is available in node:18-alpine; curl is not installed by default
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    depends_on:
      db:
        condition: service_healthy
  db:
    image: postgres:15
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
Resource Monitoring
Monitor Container Resources
# Real-time stats
docker stats
# Stats for specific containers
docker stats web db redis
# One-time snapshot
docker stats --no-stream
Key Metrics to Track
| Metric | Warning Threshold | Critical Threshold |
|---|---|---|
| CPU Usage | >70% | >90% |
| Memory Usage | >80% | >95% |
| Network I/O | Varies | Sudden spikes |
| Block I/O | Varies | Sustained high |
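The CPU and memory thresholds in the table can be applied mechanically to the output of `docker stats --no-stream --format '{{json .}}'`, which prints one JSON object per container. The sketch below assumes that format and reads only the `Name`, `CPUPerc`, and `MemPerc` fields; the function names and the sample line are made up for illustration. Note that `CPUPerc` can exceed 100% on multi-core hosts, so the CPU thresholds may need scaling.

```javascript
// Thresholds taken from the table above (percent).
const THRESHOLDS = {
  cpu: { warning: 70, critical: 90 },
  memory: { warning: 80, critical: 95 },
};

// Map a percentage to a severity level.
function classify(value, { warning, critical }) {
  if (value > critical) return 'critical';
  if (value > warning) return 'warning';
  return 'ok';
}

// Evaluate one JSON line from `docker stats --no-stream --format '{{json .}}'`.
function evaluateStatsLine(line) {
  const stats = JSON.parse(line);
  const cpu = parseFloat(stats.CPUPerc); // "72.50%" -> 72.5
  const memory = parseFloat(stats.MemPerc); // "96.10%" -> 96.1
  return {
    name: stats.Name,
    cpu: classify(cpu, THRESHOLDS.cpu),
    memory: classify(memory, THRESHOLDS.memory),
  };
}

// Hypothetical sample line, trimmed to the fields used above.
const sample = '{"Name":"web","CPUPerc":"72.50%","MemPerc":"96.10%"}';
console.log(evaluateStatsLine(sample));
// { name: 'web', cpu: 'warning', memory: 'critical' }
```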
External Monitoring Setup
Monitor Container Endpoints
Set up HTTP monitoring for your container endpoints:
- Health endpoint - https://app.example.com/health
- API endpoints - Critical business endpoints
- Database proxies - If exposed
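To make concrete what an external monitor typically evaluates, here is a small sketch of a probe that checks the status code, the response time, and the JSON body returned by the health endpoint. It assumes Node 18+ (global `fetch` and `AbortSignal.timeout`) and a body shaped like `{ "status": "healthy" }`; the function names and the latency limit are illustrative.

```javascript
// Decide whether a probe result counts as "up".
function evaluateProbe({ status, latencyMs, body }, maxLatencyMs = 2000) {
  if (status !== 200) return { up: false, reason: `HTTP ${status}` };
  if (latencyMs > maxLatencyMs) {
    return { up: false, reason: `slow response (${latencyMs}ms)` };
  }
  let parsed;
  try {
    parsed = JSON.parse(body);
  } catch (err) {
    return { up: false, reason: 'body is not JSON' };
  }
  if (!parsed || parsed.status !== 'healthy') {
    return { up: false, reason: 'status field is not "healthy"' };
  }
  return { up: true, reason: 'ok' };
}

// Probe a URL from outside the container and evaluate the response.
async function checkEndpoint(url, timeoutMs = 5000) {
  const started = Date.now();
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    const body = await res.text();
    return evaluateProbe({ status: res.status, latencyMs: Date.now() - started, body });
  } catch (err) {
    // Network errors and timeouts both count as "down".
    return { up: false, reason: err.message };
  }
}

console.log(evaluateProbe({ status: 200, latencyMs: 50, body: '{"status":"healthy"}' }));
// { up: true, reason: 'ok' }
console.log(evaluateProbe({ status: 503, latencyMs: 50, body: '{"status":"unhealthy"}' }));
// { up: false, reason: 'HTTP 503' }
```

A scheduler could then call `checkEndpoint('https://app.example.com/health')` at a fixed interval and alert after a few consecutive `up: false` results.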
Container Restart Alerts
Create a script to detect container restarts:
#!/bin/bash
# container-watch.sh
CONTAINER="my-app"
LAST_STARTED=""

while true; do
  # StartedAt changes every time the container is (re)started
  STARTED=$(docker inspect --format='{{.State.StartedAt}}' "$CONTAINER")
  if [ -n "$LAST_STARTED" ] && [ "$STARTED" != "$LAST_STARTED" ]; then
    # Container restarted - send alert
    curl -X POST "https://your-webhook-url" \
      -H "Content-Type: application/json" \
      -d "{\"text\": \"Container $CONTAINER restarted\"}"
  fi
  LAST_STARTED=$STARTED
  sleep 60
done
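The same comparison extends to several containers at once. The sketch below keeps the `StartedAt` approach from the script and moves the diffing into a pure function so it can be tested without Docker. `snapshot` and `findRestarted` are hypothetical names, and `snapshot` assumes the `docker` CLI is on the PATH.

```javascript
const { execFileSync } = require('child_process');

// Read StartedAt for each named container via the Docker CLI.
function snapshot(names) {
  const result = {};
  for (const name of names) {
    result[name] = execFileSync(
      'docker',
      ['inspect', '--format', '{{.State.StartedAt}}', name],
      { encoding: 'utf8' }
    ).trim();
  }
  return result;
}

// Compare two { name: startedAt } snapshots and list containers that restarted.
// Containers that appear only in `current` are new, not restarted.
function findRestarted(previous, current) {
  return Object.keys(current).filter(
    (name) => name in previous && previous[name] !== current[name]
  );
}

// Hand-written snapshots for illustration.
const before = { web: '2024-01-01T10:00:00Z', db: '2024-01-01T10:00:00Z' };
const after = {
  web: '2024-01-01T10:05:12Z',
  db: '2024-01-01T10:00:00Z',
  redis: '2024-01-01T10:04:00Z',
};
console.log(findRestarted(before, after)); // [ 'web' ]
```

A polling loop would call `snapshot(['web', 'db', 'redis'])` every minute, pass the previous and current results to `findRestarted`, and post one alert per returned name.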
Logging and Alerting
Centralize Container Logs
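# docker-compose.yml
# The json-file driver keeps logs on the Docker host; these options rotate
# them to cap disk usage. Shipping logs to a central store requires a
# different logging driver or a separate log shipper.
services:
  web:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"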
Log-Based Alerts
Monitor logs for error patterns:
# Watch for errors in real-time
docker logs -f container_name 2>&1 | grep -i error
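A plain `grep` surfaces every matching line, which gets noisy fast. One common refinement is to alert only when matches exceed a rate. The class below is a minimal sketch of that idea using a sliding time window; the pattern, the window size, the threshold, and the class name are all assumptions to adapt.

```javascript
// Lines matching this pattern count as errors (illustrative pattern).
const ERROR_PATTERN = /\b(error|fatal|exception)\b/i;

class ErrorRateWindow {
  constructor(windowMs = 60000, threshold = 5) {
    this.windowMs = windowMs;
    this.threshold = threshold;
    this.timestamps = [];
  }

  // Record a log line; returns true when the number of matching lines
  // inside the window has reached the threshold.
  push(line, now = Date.now()) {
    if (ERROR_PATTERN.test(line)) this.timestamps.push(now);
    // Drop matches that have aged out of the window.
    while (this.timestamps.length && now - this.timestamps[0] > this.windowMs) {
      this.timestamps.shift();
    }
    return this.timestamps.length >= this.threshold;
  }
}

// Timestamps are passed explicitly here to keep the example deterministic.
const win = new ErrorRateWindow(60000, 3);
console.log(win.push('GET /health 200', 0)); // false
console.log(win.push('ERROR db timeout', 1000)); // false
console.log(win.push('Error: redis down', 2000)); // false
console.log(win.push('fatal: out of memory', 3000)); // true
console.log(win.push('GET / 200', 70000)); // false
```

To wire this up, the output of `docker logs -f container_name 2>&1` could be piped into a Node script that reads `process.stdin` line by line with the built-in `readline` module and calls `push` for each line.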
Container Monitoring Checklist
- HEALTHCHECK added to Dockerfiles
- Health endpoints implemented
- Docker Compose healthchecks configured
- External monitoring set up
- Resource alerts configured
- Log aggregation enabled
- Restart alerts in place
- Dependencies health-checked
- Alerting channels configured
- Dashboard created
Best Practices
Health Check Design
- Keep health checks fast (<1s)
- Check all critical dependencies
- Return appropriate HTTP status codes
- Include diagnostic info in response
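- Don't let third-party services outside your control fail the check; an outage on their side would otherwise mark every container unhealthy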
Monitoring Strategy
- Monitor at multiple levels (container, service, infrastructure)
- Set up appropriate thresholds
- Configure escalation policies
- Document your monitoring setup
WizStatus can monitor your Docker container health endpoints with 1-minute intervals. Get instant alerts when containers become unhealthy, with detailed response diagnostics.