Monitoring and Observability
This guide covers how to use Recuro’s built-in monitoring features to track the health of your crons and queues.
Dashboard overview
The Recuro dashboard (Dashboard in the sidebar) provides an at-a-glance view of your scheduling infrastructure:
- Cron stats — Total crons, active vs. paused, recent execution counts
- Queue stats — Total queues, active vs. inactive, job counts by status
- Alert summary — Unread alerts count and recent alert activity
- Activity chart — Executions and job runs over the last 7, 14, or 30 days
- Response time stats — Average and p95 response times across crons and queues
- Usage stats — Current plan, requests used vs. limit, projected monthly usage
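To illustrate why the dashboard reports both an average and a p95, the sketch below computes each from a hypothetical list of response times. It uses the nearest-rank percentile method, which is an assumption: this guide does not state how Recuro calculates p95, and the sample durations are invented.

```python
import math

def average(values):
    """Arithmetic mean of a non-empty list of response times."""
    return sum(values) / len(values)

def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]

# Hypothetical response times in milliseconds (not real Recuro data).
durations_ms = [120, 130, 125, 140, 135, 128, 132, 138, 126, 2400]

avg_ms = average(durations_ms)
p95_ms = percentile_nearest_rank(durations_ms, 95)
```

In this sample a single slow request dominates the p95 while most requests stay near 130 ms, so the two numbers are best read together.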
Tracking success rates
Cron success rates
On the cron detail page, review the execution history:
- Look for patterns in failures (time of day, day of week)
- Check the `last_status` field, which is either `completed` or `failed`
- Monitor `consecutive_failures`; if it keeps climbing, something is persistently wrong
- Use the `alert_threshold` to get notified before the failure count gets out of hand
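The checks above can be sketched in a few lines if you export the execution history for your own analysis. The record shape here is an assumption: each execution is a dict with a `status` of `completed` or `failed` and a hypothetical `started_at` ISO timestamp, listed oldest first. Recuro's actual export format may differ.

```python
from collections import Counter
from datetime import datetime

def success_rate(executions):
    """Fraction of executions whose status is 'completed' (0.0 if there are none)."""
    if not executions:
        return 0.0
    completed = sum(1 for e in executions if e["status"] == "completed")
    return completed / len(executions)

def trailing_failures(executions):
    """Length of the failure streak at the end of the history (oldest first)."""
    streak = 0
    for e in reversed(executions):
        if e["status"] != "failed":
            break
        streak += 1
    return streak

def failures_by_hour(executions):
    """Count failed executions per hour of day to expose time-of-day patterns."""
    return Counter(
        datetime.fromisoformat(e["started_at"]).hour
        for e in executions
        if e["status"] == "failed"
    )

# Hypothetical history: every failure happens during the 02:00 run.
history = [
    {"status": "completed", "started_at": "2024-05-01T01:00:00"},
    {"status": "failed", "started_at": "2024-05-01T02:00:00"},
    {"status": "completed", "started_at": "2024-05-01T03:00:00"},
    {"status": "failed", "started_at": "2024-05-02T02:00:00"},
    {"status": "failed", "started_at": "2024-05-03T02:00:00"},
]
```

A cluster of failures in a single hour, as in this sample, usually points to something scheduled on the endpoint side (a backup, a deploy window) rather than a random fault.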
Queue success rates
On the queue detail page, review job statistics:
- Total jobs, completed, failed, pending, dead-lettered
- Response time trends (sparklines on the queue list page)
- DLQ depth — a growing DLQ indicates unresolved failures
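One way to make "a growing DLQ" concrete is to sample the depth at regular intervals and check whether it only ever goes up. This is a minimal sketch over a hypothetical list of depth readings that you record yourself. The function name and the `min_increase` parameter are illustrative and not part of Recuro.

```python
def dlq_is_growing(depth_samples, min_increase=1):
    """True if the depth never decreased and rose by at least min_increase overall."""
    if len(depth_samples) < 2:
        return False
    never_shrinks = all(
        later >= earlier
        for earlier, later in zip(depth_samples, depth_samples[1:])
    )
    return never_shrinks and depth_samples[-1] - depth_samples[0] >= min_increase

# Hypothetical daily DLQ depth readings.
growing = dlq_is_growing([3, 5, 5, 9])   # nothing was replayed or purged
draining = dlq_is_growing([9, 4, 6])     # depth dropped at least once
```

A depth that drops at some point means jobs were replayed or purged, so this check treats it as actively managed rather than growing.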
Using completion callbacks for external monitoring
If you use an external monitoring system (Datadog, Grafana, PagerDuty), send execution results there via completion callbacks:

    curl -X POST https://app.recurohq.com/api/crons \
      -H "Authorization: Bearer YOUR_API_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "name": "Payment Sync",
        "url": "https://api.yourapp.com/sync",
        "cron_expression": "0 * * * *",
        "callback_url": "https://monitoring.yourapp.com/recuro-webhook"
      }'

Your callback endpoint receives a POST with the execution status, duration, and failure reason. Parse this to populate your monitoring dashboards.
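As a rough sketch of the parsing step, the function below turns a callback body into a flat record that a metrics client could forward. The JSON field names (`status`, `duration_ms`, `failure_reason`) are assumptions chosen for illustration. Check a real callback payload for the actual keys before relying on them.

```python
import json

def parse_callback(body):
    """Turn a callback request body (bytes or str) into a flat metrics record."""
    payload = json.loads(body)
    status = payload.get("status", "unknown")  # assumed field name
    return {
        "ok": status == "completed",
        "status": status,
        "duration_ms": payload.get("duration_ms"),        # assumed field name
        "failure_reason": payload.get("failure_reason"),  # assumed field name
    }

# Hypothetical payload for a failed execution.
sample_body = b'{"status": "failed", "duration_ms": 5021, "failure_reason": "timeout"}'
record = parse_callback(sample_body)
```

Missing fields come back as `None` instead of raising, so a payload that omits the failure reason on success does not break the handler.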
Key metrics to watch
| Metric | Where to find it | What it tells you |
|---|---|---|
| Consecutive failures | Cron detail page | Persistent endpoint issues |
| DLQ depth | Dead Letter Queue page | Unresolved job failures |
| Response time (p95) | Dashboard, cron/queue detail | Endpoint performance degradation |
| Usage percentage | Usage page | How close you are to your plan limit |
| Unread alerts | Alerts page | Unaddressed issues |
| Projected usage | Usage page | Whether you will hit your limit this month |
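To show how usage percentage and projected usage relate, here is a simple linear projection: average daily requests so far, multiplied by the number of days in the month. This is an assumed method for illustration only. The guide does not describe how Recuro computes its projection, and the request counts and limit below are invented.

```python
import calendar
from datetime import date

def usage_percentage(requests_used, limit):
    """Share of the plan limit consumed so far, as a percentage."""
    return 100 * requests_used / limit

def projected_monthly_usage(requests_used, today):
    """Linear projection: average daily usage so far times days in the month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return requests_used / today.day * days_in_month

# Hypothetical figures: 6,000 requests used by 10 June on a 15,000-request plan.
used, limit = 6000, 15000
pct = usage_percentage(used, limit)
projected = projected_monthly_usage(used, date(2024, 6, 10))
will_exceed = projected > limit
```

In this example only 40% of the limit is used, yet the projection lands at 18,000 requests, which is why the projected figure is worth checking separately from the current percentage.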
Setting up proactive monitoring
- Set alert thresholds on all critical crons (threshold `1` or `2`)
- Enable queue alerts on queues that must succeed
- Configure notification channels (Slack for real-time, email for digest)
- Create maintenance windows for planned deployments
- Review the DLQ weekly and replay or purge stale jobs
- Check the Usage page periodically to avoid hitting hard limits
Next steps
- Setting Up Alerts — Configure alerting
- Viewing and Debugging Runs — Inspect execution details
- Alert Configuration — Avoid alert fatigue