Alert Configuration

This guide covers how to configure alerts effectively so you get notified about real problems without being overwhelmed by noise.

Set meaningful thresholds

For crons, `alert_threshold` sets how many consecutive failures must occur before a failure alert is sent:

Threshold	Best for
`1`	Critical health checks where any failure requires immediate attention
`2`	Important crons where a single transient failure is acceptable
`3`	Non-critical crons where intermittent failures are expected

With a threshold of 3, you are alerted only after three consecutive failures, which points to a persistent problem rather than a single request timing out because of a network blip.
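
The consecutive-failure rule can be illustrated with a short Python sketch. This is not Recuro's implementation: `first_alert_index` is a hypothetical helper, and the sketch assumes that a successful run resets the failure count.

```python
from typing import Optional, Sequence


def first_alert_index(results: Sequence[bool], alert_threshold: int) -> Optional[int]:
    """Return the index of the run that would trigger a failure alert.

    `results` holds one boolean per run (True = success, False = failure).
    Assumption: a success resets the consecutive-failure counter.
    """
    if alert_threshold < 1:
        raise ValueError("alert_threshold must be at least 1")
    consecutive_failures = 0
    for index, succeeded in enumerate(results):
        if succeeded:
            consecutive_failures = 0
            continue
        consecutive_failures += 1
        if consecutive_failures >= alert_threshold:
            return index
    return None


# One isolated failure, then three failures in a row.
runs = [True, False, True, False, False, False]
print(first_alert_index(runs, 1))  # 1: the isolated failure already alerts
print(first_alert_index(runs, 3))  # 5: only the third consecutive failure alerts
```

With the same run history, a threshold of 1 alerts on the isolated failure, while a threshold of 3 stays quiet until the failures persist.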

Avoid over-alerting

Common mistakes that lead to alert fatigue:

Threshold of 1 on every cron — You get alerted on every transient failure. Reserve threshold 1 for truly critical monitors.
Alerts enabled on every queue — If you have high-volume queues, individual job failures may be normal. Enable alerts only on queues where failures are unexpected.
No maintenance windows — If you deploy frequently, temporary failures during deployment trigger alerts. Create maintenance windows for planned downtime.

Use warning alerts as early signals

Recuro sends a warning alert on the first failure in a sequence (before the threshold is reached). Use these as early signals in your dashboard without configuring them to notify externally. Review the Alerts page periodically to spot emerging patterns.
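
To see how warning and failure alerts might interleave, here is a minimal Python sketch. The function name `alert_events` and its return format are hypothetical. It assumes that a success resets the sequence, that only one failure alert is sent per sequence, and that with a threshold of 1 the first failure produces a failure alert rather than a warning. None of these details are confirmed by this guide.

```python
from typing import List, Optional, Sequence


def alert_events(results: Sequence[bool], alert_threshold: int) -> List[Optional[str]]:
    """Label each run with the alert it would raise, or None for no alert.

    Assumptions (not confirmed behavior): a success resets the sequence,
    and a failure alert is raised once, when the threshold is reached.
    """
    events: List[Optional[str]] = []
    streak = 0  # consecutive failures so far
    for succeeded in results:
        if succeeded:
            streak = 0
            events.append(None)
            continue
        streak += 1
        if streak == alert_threshold:
            events.append("failure")
        elif streak == 1:
            events.append("warning")
        else:
            events.append(None)
    return events


runs = [True, False, False, False, True, False]
print(alert_events(runs, 3))
# [None, 'warning', None, 'failure', None, 'warning']
```

The trailing warning in the output is the kind of early signal worth noticing on the Alerts page: a new failure sequence has started but has not yet reached the threshold.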

Combine alerts with success assertions

Instead of alerting only on HTTP errors, add success assertions to catch subtle failures:

{
  "success_assertions": [
    { "type": "status_code_equals", "value": "200" },
    { "type": "body_contains", "value": "\"healthy\":true" },
    { "type": "response_time_under_ms", "value": "5000" }
  ],
  "alert_threshold": 2
}

This catches cases where your endpoint returns 200 but the body reports a degraded state or the response is too slow.
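
The following Python sketch shows one plausible reading of those three assertion types. The type names come from the configuration above. The evaluation logic and the `evaluate_assertions` helper are assumptions made for illustration, not Recuro's actual behavior.

```python
from typing import Dict, List


def evaluate_assertions(
    assertions: List[Dict[str, str]],
    status_code: int,
    body: str,
    response_time_ms: float,
) -> List[str]:
    """Return the types of the assertions that did not pass."""
    failed = []
    for assertion in assertions:
        kind, value = assertion["type"], assertion["value"]
        if kind == "status_code_equals":
            passed = status_code == int(value)
        elif kind == "body_contains":
            passed = value in body  # assumed: plain substring match
        elif kind == "response_time_under_ms":
            passed = response_time_ms < int(value)
        else:
            raise ValueError(f"unknown assertion type: {kind}")
        if not passed:
            failed.append(kind)
    return failed


assertions = [
    {"type": "status_code_equals", "value": "200"},
    {"type": "body_contains", "value": '"healthy":true'},
    {"type": "response_time_under_ms", "value": "5000"},
]

# A 200 response that reports a degraded state still counts as a failure.
print(evaluate_assertions(assertions, 200, '{"healthy":false}', 120))
# ['body_contains']
```

Under the substring reading used here, a body formatted as `"healthy": true` (with a space) would not match, so the asserted text needs to mirror the exact output of your endpoint.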

Separate notification channels by severity

If you have both critical and non-critical crons, consider splitting them across separate teams, each with its own notification channel:

Production-critical team → Slack alerts to #oncall-alerts
Background tasks team → Email alerts (checked during business hours)

This prevents non-critical alerts from drowning out critical ones.

Use maintenance windows

Before planned downtime:

Go to Maintenance Windows
Create a window covering the deployment or maintenance period
Scope it to the affected cron or queue (or leave unscoped for all resources)

This suppresses alerts during the window, preventing false alarms.
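
The scoping rule can be sketched in Python as follows. The `MaintenanceWindow` class, its field names, and the resource name `nightly-report` are hypothetical. Treating the end of the window as exclusive is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class MaintenanceWindow:
    starts_at: datetime
    ends_at: datetime
    resource: Optional[str] = None  # None means unscoped: all resources


def is_suppressed(window: MaintenanceWindow, resource: str, at: datetime) -> bool:
    """Return True if an alert for `resource` at time `at` falls inside the window."""
    in_window = window.starts_at <= at < window.ends_at  # end treated as exclusive
    in_scope = window.resource is None or window.resource == resource
    return in_window and in_scope


deploy = MaintenanceWindow(
    starts_at=datetime(2024, 1, 10, 2, 0, tzinfo=timezone.utc),
    ends_at=datetime(2024, 1, 10, 2, 30, tzinfo=timezone.utc),
    resource="nightly-report",
)
during = datetime(2024, 1, 10, 2, 15, tzinfo=timezone.utc)

print(is_suppressed(deploy, "nightly-report", during))  # True
print(is_suppressed(deploy, "billing-sync", during))    # False: outside the scope
```

Scoping the window to the affected resource keeps alerts flowing for everything else, which is usually what you want during a partial deployment.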

Next steps

Setting Up Alerts — Configuration guide
Alerts — Full concept reference
Monitoring and Observability — Dashboards and success rates