Tutorials | January 31, 2026 | 10 min read

How to Monitor Cron Jobs: Step-by-Step Guide

Learn how to set up monitoring for your cron jobs. Get alerts when scheduled tasks fail, run too long, or don't run at all.

WizStatus Team
Author

Cron jobs are the backbone of automated tasks on Unix systems. But when they fail silently, you might not know until critical processes break. Here's how to set up reliable monitoring for all your cron jobs.

Why Monitor Cron Jobs?

Cron jobs fail silently. Common issues include:

  • Job never starts - Typo in crontab, wrong path
  • Job crashes - Runtime errors, missing dependencies
  • Job hangs - Infinite loops, deadlocks
  • Job runs but fails - Database errors, network issues
  • Server reboots - Runs scheduled during downtime are skipped, or the cron service fails to start again

Without monitoring, you discover these problems when users complain or data goes stale.

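Method 1: Heartbeat Monitoring

The most reliable approach: your cron job pings a monitoring service each time it completes successfully. If a ping doesn't arrive on schedule, the service alerts you.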

Step 1: Get Your Ping URL

Create a heartbeat monitor in your monitoring service. You'll receive a unique URL like:

https://wizstatus.com/ping/abc123

Step 2: Modify Your Cron Job

Add a ping after successful execution:

Before:

0 2 * * * /home/user/backup.sh

After:

0 2 * * * /home/user/backup.sh && curl -fsS --retry 3 https://wizstatus.com/ping/abc123

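The && operator runs the ping only if the script exits with status 0. A failed run sends nothing, so the monitor raises an alert once the grace period passes.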
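To see the short-circuit behavior without touching a real monitor, here is a minimal sketch. The `ping_monitor` and `run_then_ping` functions are hypothetical stand-ins for this demo only; `ping_monitor` takes the place of the curl call.

```shell
#!/bin/bash
# ping_monitor is a hypothetical stand-in for the real curl call.
ping_monitor() { echo "ping sent"; }

# Run any command, then ping only if it exited with status 0.
run_then_ping() {
  "$@" && ping_monitor
}

run_then_ping true  || echo "job failed, no ping"   # prints "ping sent"
run_then_ping false || echo "job failed, no ping"   # prints "job failed, no ping"
```

The second call never reaches `ping_monitor`, which is exactly what the monitor relies on to detect a failed run.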

Step 3: Configure Expected Schedule

In your monitoring dashboard:

  • Set schedule: "Daily at 2:00 AM"
  • Set grace period: 30-60 minutes (depending on job duration)
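How a schedule and a grace period combine into an alert can be sketched roughly as follows. This illustrates the general idea only and is not WizStatus's actual logic; the function name `is_overdue` and its arguments are assumptions, and it presumes no ping has arrived since the expected run time.

```shell
#!/bin/bash
# is_overdue EXPECTED_EPOCH GRACE_SECONDS NOW_EPOCH
# Succeeds (exit 0) once the current time is past the expected
# run time plus the grace period. All times are Unix timestamps.
is_overdue() {
  local expected="$1" grace="$2" now="$3"
  [ "$now" -gt $(( expected + grace )) ]
}

# Example: job expected at t=1000 with a 30-minute (1800 s) grace period.
if is_overdue 1000 1800 2801; then
  echo "alert: no ping received in time"
fi
```

A longer grace period delays the alert but avoids false alarms for jobs whose duration varies.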

Step 4: Test the Setup

Run the job manually, then verify both outcomes:

  1. The ping appears in your monitoring dashboard after a successful run
  2. An alert arrives after you make the job fail on purpose and the grace period passes

Method 2: Wrapper Script

For complex jobs, create a wrapper:

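#!/bin/bash
# cron-wrapper.sh
# Usage: cron-wrapper.sh PING_URL COMMAND [ARGS...]

PING_URL="$1"
shift

# Run the command; "$@" keeps arguments that contain spaces intact
"$@"
EXIT_CODE=$?

# Ping only on success
if [ "$EXIT_CODE" -eq 0 ]; then
  curl -fsS --retry 3 "$PING_URL"
else
  echo "Job failed with exit code $EXIT_CODE" >&2
fi

# Pass the job's exit code back to cron
exit "$EXIT_CODE"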

Usage:

0 2 * * * /home/user/cron-wrapper.sh https://wizstatus.com/ping/abc123 /home/user/backup.sh
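A wrapper is also a natural place to deal with jobs that hang. The sketch below assumes GNU coreutils `timeout` is installed; `run_with_limit` is a hypothetical helper name, and the limits shown are arbitrary.

```shell
#!/bin/bash
# run_with_limit LIMIT_SECONDS COMMAND [ARGS...]
# Runs the command under `timeout`, which exits with status 124
# when it had to terminate the command for exceeding the limit.
run_with_limit() {
  local limit="$1"
  shift
  local status=0
  timeout "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "Job exceeded ${limit}s and was terminated" >&2
  fi
  return "$status"
}

run_with_limit 5 true && echo "finished within the limit"
```

Because a timed-out job returns a non-zero status, a wrapper that pings only on success will skip the ping and the monitor will alert as usual.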

Method 3: Email + Log Monitoring

Traditional but less reliable approach:

MAILTO=alerts@yourcompany.com
0 2 * * * /home/user/backup.sh 2>&1 | tee -a /var/log/backup.log

Drawbacks:

  • Email can be delayed or filtered
  • Doesn't catch jobs that don't run at all
  • Requires parsing logs
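If you do rely on logs, checking that the log file was updated recently catches some of the "never ran" cases. This is a rough sketch, not a replacement for heartbeat monitoring; `log_is_fresh` is a hypothetical helper, and it relies on the `-mmin` test supported by GNU and BSD find.

```shell
#!/bin/bash
# log_is_fresh FILE MAX_AGE_MINUTES
# Succeeds if FILE exists and was modified within the last MAX_AGE_MINUTES.
log_is_fresh() {
  local file="$1" max_age_min="$2"
  [ -n "$(find "$file" -mmin "-${max_age_min}" 2>/dev/null)" ]
}

# Demo with a temporary file standing in for the real log.
demo_log=$(mktemp)
if log_is_fresh "$demo_log" 60; then
  echo "log updated within the last hour"
fi
rm -f "$demo_log"
```

Something still has to run this check on a schedule, which is the core weakness of the log-based approach.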

Cron Syntax Refresher

┌───────────── minute (0-59)
│ ┌───────────── hour (0-23)
│ │ ┌───────────── day of month (1-31)
│ │ │ ┌───────────── month (1-12)
│ │ │ │ ┌───────────── day of week (0-6, Sunday=0)
│ │ │ │ │
* * * * * command

Common schedules:

* * * * *     # Every minute
0 * * * *     # Every hour
0 0 * * *     # Daily at midnight
0 2 * * *     # Daily at 2 AM
0 0 * * 0     # Weekly on Sunday
0 0 1 * *     # Monthly on the 1st
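Since a typo in the schedule is one of the ways a job never starts, a quick field-count check can help before installing an entry. This is a minimal sketch: `has_five_fields` is a hypothetical helper, it checks only the number of fields (not their ranges), and it does not cover shortcuts such as @daily or the extra user field in /etc/crontab.

```shell
#!/bin/bash
# has_five_fields "SCHEDULE"
# Succeeds if the schedule string has exactly five whitespace-separated fields.
has_five_fields() {
  local count
  count=$(printf '%s\n' "$1" | awk '{ print NF }')
  [ "$count" -eq 5 ]
}

for schedule in "0 2 * * *" "0 2 * *"; do
  if has_five_fields "$schedule"; then
    echo "ok:      $schedule"
  else
    echo "invalid: $schedule"
  fi
done
```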

Monitoring Different Job Types

Backup Jobs

#!/bin/bash
# backup.sh

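# Compute the filename once so every step uses the same date,
# even if the job runs across midnight
BACKUP_FILE="/backup/db-$(date +%Y%m%d).sql"

# Ping only if pg_dump succeeds and the dump file is non-empty
if pg_dump production > "$BACKUP_FILE" && [ -s "$BACKUP_FILE" ]; then
  curl -fsS https://wizstatus.com/ping/backup-token
fi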
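A non-empty file can still be a truncated backup, so a minimum-size check is a slightly stronger test. The sketch below is one possible approach; `backup_looks_valid` is a hypothetical helper and the threshold is something you would choose from the size of your own past backups.

```shell
#!/bin/bash
# backup_looks_valid FILE MIN_BYTES
# Succeeds if FILE is a regular file of at least MIN_BYTES bytes.
backup_looks_valid() {
  local file="$1" min_bytes="$2" size
  [ -f "$file" ] || return 1
  # Arithmetic expansion strips the padding some wc implementations add.
  size=$(( $(wc -c < "$file") ))
  [ "$size" -ge "$min_bytes" ]
}

# Demo with a 100-byte temporary file standing in for a real dump.
demo_file=$(mktemp)
printf '%100s' '' > "$demo_file"
backup_looks_valid "$demo_file" 50 && echo "backup passes the size check"
rm -f "$demo_file"
```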

Report Generation

#!/bin/bash
# daily-report.sh

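REPORT_FILE="/reports/daily-$(date +%Y%m%d).pdf"

# Cron starts jobs in the user's home directory, so use an absolute
# path to generate_report.py if it lives elsewhere.
# Ping only if the generator succeeds and the report file exists.
if python generate_report.py && [ -f "$REPORT_FILE" ]; then
  curl -fsS https://wizstatus.com/ping/report-token
fi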

Data Sync Jobs

#!/bin/bash
# sync.sh

rsync -avz /source/ /destination/
RSYNC_EXIT=$?

if [ "$RSYNC_EXIT" -eq 0 ]; then
  curl -fsS https://wizstatus.com/ping/sync-token
else
  echo "Rsync failed with code $RSYNC_EXIT" >&2
  exit "$RSYNC_EXIT"
fi
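Not every non-zero rsync exit code means the sync is unusable. rsync returns 24 when source files vanished during the transfer, which can be harmless on a busy source directory. Whether to accept it is a judgment call for your data; the sketch below assumes you do, and `sync_succeeded` is a hypothetical helper name.

```shell
#!/bin/bash
# sync_succeeded RSYNC_EXIT_CODE
# Succeeds for exit codes this sketch treats as an acceptable sync.
sync_succeeded() {
  case "$1" in
    0)  return 0 ;;   # clean run
    24) return 0 ;;   # source files vanished mid-transfer (assumed acceptable here)
    *)  return 1 ;;   # anything else counts as a failure
  esac
}

for code in 0 23 24; do
  if sync_succeeded "$code"; then
    echo "exit $code: send ping"
  else
    echo "exit $code: skip ping"
  fi
done
```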

Queue Processors

For continuous processors, ping periodically:

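import time
import requests

PING_URL = "https://wizstatus.com/ping/queue-token"
PING_INTERVAL = 300  # 5 minutes

last_ping = 0.0

while True:
    process_next_job()  # your application's job handler, defined elsewhere

    if time.time() - last_ping > PING_INTERVAL:
        try:
            # A timeout keeps a slow monitor from stalling the queue
            requests.get(PING_URL, timeout=10)
        except requests.RequestException as exc:
            # A failed ping must not crash the processor
            print(f"Heartbeat ping failed: {exc}")
        last_ping = time.time()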

Handling Job Duration

For jobs that might exceed the grace period:

Start/End Pings

#!/bin/bash

# Ping start
curl -fsS https://wizstatus.com/ping/job-token/start

# Long running job; ping complete only if it succeeds
./long-backup-process.sh && curl -fsS https://wizstatus.com/ping/job-token

Dynamic Grace Periods

Estimate job duration and set grace accordingly:

  • Short jobs (< 5 min): 10 minute grace
  • Medium jobs (5-30 min): 45 minute grace
  • Long jobs (over 30 min): Job duration + 30 minutes
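The tiers above can be written as a small helper, which is handy if you create monitors from a script. This is a sketch only: `grace_minutes` is a hypothetical function, it takes the typical job duration in whole minutes, and a job of exactly 30 minutes is treated as medium here.

```shell
#!/bin/bash
# grace_minutes DURATION_MINUTES
# Prints the suggested grace period in minutes for a job of that duration.
grace_minutes() {
  local duration="$1"
  if [ "$duration" -lt 5 ]; then
    echo 10
  elif [ "$duration" -le 30 ]; then
    echo 45
  else
    echo $(( duration + 30 ))
  fi
}

grace_minutes 2    # prints 10
grace_minutes 20   # prints 45
grace_minutes 90   # prints 120
```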

Common Issues and Solutions

Issue: Ping fails due to network

# Add retries and timeout
curl -fsS --retry 3 --retry-delay 10 --max-time 30 "$PING_URL"
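Inside a script, it can help to wrap the ping so that a monitoring outage never changes the job's own exit status. A minimal sketch, assuming that behavior is what you want; `send_ping` is a hypothetical helper name.

```shell
#!/bin/bash
# send_ping URL
# Tries to reach the monitor with retries and a time limit.
# Always returns 0 so a failed ping cannot fail the job itself.
send_ping() {
  local url="$1"
  if ! curl -fsS --retry 3 --retry-delay 10 --max-time 30 -o /dev/null "$url"; then
    echo "warning: could not reach monitor at $url" >&2
  fi
  return 0
}
```

Call it as `send_ping "$PING_URL"` after the job's success check. If the ping is lost, the monitor still alerts after the grace period, so the problem is not hidden.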

Issue: Job runs as root but curl isn't found

Use full path:

0 2 * * * /path/to/backup.sh && /usr/bin/curl -fsS "$PING_URL"

Issue: Environment variables not available

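Cron runs jobs with a minimal environment, so define the variables the job needs at the top of the crontab or inside the script:

PATH=/usr/local/bin:/usr/bin:/bin
PING_URL=https://wizstatus.com/ping/abc123
0 2 * * * /path/to/backup.sh && curl -fsS "$PING_URL"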
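To reproduce this class of problem before a job goes live, you can run a command in a stripped-down environment. The sketch below only approximates what cron provides; the exact variables and default PATH vary by cron implementation, and `run_like_cron` is a hypothetical helper name.

```shell
#!/bin/bash
# run_like_cron 'COMMAND STRING'
# Runs the command with an almost empty environment, similar to cron:
# only HOME, SHELL, and a short PATH are set (an approximation).
run_like_cron() {
  env -i HOME="${HOME:-/}" SHELL=/bin/sh PATH=/usr/bin:/bin /bin/sh -c "$1"
}

# Check whether curl is reachable with the reduced PATH.
run_like_cron 'command -v curl || echo "curl not on PATH"'
```

If a script works in your login shell but fails under `run_like_cron`, it probably depends on a variable or PATH entry that cron will not supply.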

Issue: Job output clutters logs

Redirect stdout and stderr to a dedicated log file:

0 2 * * * /path/to/job.sh > /var/log/job.log 2>&1 && curl -fsS "$PING_URL"

Testing Your Setup

Verify cron is running

# The service is named cron on Debian/Ubuntu and crond on RHEL, CentOS, and Fedora
systemctl status cron
# or
service cron status

Test job execution

# Run manually
/path/to/your/script.sh

# Check if ping was received
# (verify in monitoring dashboard)

Simulate failure

# Temporarily add this as the first command in the script
exit 1

# Run the job, then verify no ping is sent
# Verify alert is triggered after grace period
# Remove the line once the test is done

Checklist

  • Identified all cron jobs to monitor
  • Created heartbeat monitors for each job
  • Matched monitor schedule to cron schedule
  • Set appropriate grace periods
  • Modified cron entries to ping on success
  • Tested successful execution pings
  • Tested failure scenarios
  • Set up notification channels

Never let a cron job fail silently again. Set up heartbeat monitoring and get alerts within minutes when jobs don't complete.
