
Monitoring and Observability

This guide covers how to use Recuro’s built-in monitoring features to track the health of your crons and queues.

Dashboard overview

The Recuro dashboard (Dashboard in the sidebar) provides an at-a-glance view of your scheduling infrastructure:

  • Cron stats — Total crons, active vs. paused, recent execution counts
  • Queue stats — Total queues, active vs. inactive, job counts by status
  • Alert summary — Unread alerts count and recent alert activity
  • Activity chart — Executions and job runs over the last 7, 14, or 30 days
  • Response time stats — Average and p95 response times across crons and queues
  • Usage stats — Current plan, requests used vs. limit, projected monthly usage
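To make the p95 figure concrete, here is a minimal sketch of a percentile calculation over a list of response times. It uses the nearest-rank method, which is an assumption: the source does not state how Recuro computes its p95, and the function name and sample values are hypothetical.

```python
import math
from typing import Sequence


def percentile_nearest_rank(values: Sequence[float], pct: int) -> float:
    """Return the pct-th percentile of values using the nearest-rank method.

    Nearest-rank is an assumption chosen for clarity; the dashboard may use
    a different (for example, interpolated) definition.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range 1..100")
    ordered = sorted(values)
    # Rank is 1-based: the smallest rank whose share of samples is >= pct.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical response times in milliseconds; one slow outlier.
samples_ms = [120, 80, 95, 400, 110, 105, 90, 100, 130, 2000]
p95 = percentile_nearest_rank(samples_ms, 95)
average = sum(samples_ms) / len(samples_ms)
```

With these samples the average is pulled up by the single slow request, while the p95 reports the outlier itself, which is why the two numbers are worth reading together.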

Tracking success rates

Cron success rates

On the cron detail page, review the execution history:

  • Look for patterns in failures (time of day, day of week)
  • Check the last_status field — completed or failed
  • Monitor consecutive_failures — if it keeps climbing, something is persistently wrong
  • Use the alert_threshold to get notified before the failure count gets out of hand
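The checks above can be sketched as a small helper, for example when mirroring cron state into your own tooling. This is a minimal sketch, not Recuro's implementation. It assumes the cron record is available as a dict carrying the last_status, consecutive_failures, and alert_threshold fields named above, and that an alert is due once consecutive failures reach the threshold. The function name and the returned labels are hypothetical.

```python
def cron_health(cron: dict) -> str:
    """Classify a cron record as 'ok', 'degraded', or 'alert'.

    Assumption: the record is a dict with last_status,
    consecutive_failures, and alert_threshold keys.
    """
    failures = int(cron.get("consecutive_failures", 0))
    threshold = cron.get("alert_threshold")  # may be unset

    # Assumed semantics: alert once failures reach the threshold.
    if threshold is not None and failures >= int(threshold):
        return "alert"
    # Any recent failure below the threshold is worth a look.
    if cron.get("last_status") == "failed" or failures > 0:
        return "degraded"
    return "ok"
```

For instance, a record with consecutive_failures of 3 and alert_threshold of 2 is classified as "alert", while a single failure under a threshold of 2 is "degraded".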

Queue success rates

On the queue detail page, review job statistics:

  • Total jobs, completed, failed, pending, dead-lettered
  • Response time trends (sparklines on the queue list page)
  • DLQ depth — a growing DLQ indicates unresolved failures
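One way to turn "a growing DLQ" into a concrete check is to compare a series of depth readings over time. The sketch below assumes you record the DLQ depth yourself at regular intervals (for example, once a day). The function name, the sampling approach, and the minimum sample count are all hypothetical.

```python
from typing import Sequence


def dlq_is_growing(depths: Sequence[int], min_samples: int = 3) -> bool:
    """Return True if the DLQ depth never decreased and ended higher than it began.

    depths is an oldest-to-newest series of readings collected by you;
    this is an illustrative heuristic, not a Recuro feature.
    """
    if len(depths) < min_samples:
        return False  # not enough data to call it a trend
    never_shrinks = all(later >= earlier for earlier, later in zip(depths, depths[1:]))
    return never_shrinks and depths[-1] > depths[0]
```

A series such as 2, 2, 5, 9 counts as growing. A series such as 9, 4, 6 does not, because the drop suggests jobs are being replayed or purged.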

Using completion callbacks for external monitoring

If you use an external monitoring system such as Datadog, Grafana, or PagerDuty, you can forward execution results to it via completion callbacks. Set a callback_url when you create the cron:

curl -X POST https://app.recurohq.com/api/crons \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Payment Sync",
    "url": "https://api.yourapp.com/sync",
    "cron_expression": "0 * * * *",
    "callback_url": "https://monitoring.yourapp.com/recuro-webhook"
  }'

Your callback endpoint receives a POST with the execution status, duration, and failure reason. Parse this to populate your monitoring dashboards.
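As a rough illustration of the parsing step, the sketch below normalizes a callback body into a flat record you could hand to a metrics client. The JSON key names ("status", "duration_ms", "failure_reason") are assumptions, as is the idea that the status uses the same completed/failed values as last_status. Inspect a real callback payload before relying on them. The function name is hypothetical.

```python
import json


def parse_callback(body: bytes) -> dict:
    """Normalize a callback POST body into a flat metric record.

    The keys read here are assumed, not taken from Recuro's documentation.
    Missing keys become None rather than raising.
    """
    payload = json.loads(body)
    status = payload.get("status", "unknown")
    return {
        "ok": status == "completed",  # assumed success value
        "status": status,
        "duration_ms": payload.get("duration_ms"),
        "failure_reason": payload.get("failure_reason"),
    }
```

Keeping the parser separate from the HTTP handler makes it easy to adjust the key names once you have seen an actual payload, without touching the code that forwards metrics.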

Key metrics to watch

| Metric | Where to find it | What it tells you |
|---|---|---|
| Consecutive failures | Cron detail page | Persistent endpoint issues |
| DLQ depth | Dead Letter Queue page | Unresolved job failures |
| Response time (p95) | Dashboard, cron/queue detail | Endpoint performance degradation |
| Usage percentage | Usage page | How close you are to your plan limit |
| Unread alerts | Alerts page | Unaddressed issues |
| Projected usage | Usage page | Whether you will hit your limit this month |
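The usage percentage and projected usage figures can be approximated with simple arithmetic. The sketch below assumes a linear projection from the month-to-date request count, which may differ from how Recuro calculates its own projection. The function names and sample numbers are hypothetical.

```python
import calendar
from datetime import date


def usage_percentage(requests_used: int, plan_limit: int) -> float:
    """Share of the plan limit consumed so far, as a percentage."""
    return 100 * requests_used / plan_limit


def projected_monthly_usage(requests_used: int, today: date) -> int:
    """Linearly extrapolate month-to-date usage to the end of the month.

    Assumption: today counts as a full elapsed day and usage is steady.
    """
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_rate = requests_used / today.day
    return round(daily_rate * days_in_month)


# Hypothetical figures: 4,500 requests used by 15 June on a 10,000-request plan.
used, limit = 4500, 10000
pct = usage_percentage(used, limit)
projected = projected_monthly_usage(used, date(2024, 6, 15))
will_exceed = projected > limit
```

In this example, 45% of the limit is used halfway through the month and the projection is 9,000 requests, so the plan limit would not be reached at the current rate.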

Setting up proactive monitoring

  1. Set alert thresholds on all critical crons (threshold 1 or 2)
  2. Enable queue alerts on queues that must succeed
  3. Configure notification channels (Slack for real-time alerts, email for digests)
  4. Create maintenance windows for planned deployments
  5. Review the DLQ weekly and replay or purge stale jobs
  6. Check the Usage page periodically to avoid hitting hard limits

Next steps