Librato Monitoring Settings
Variable | Value |
---|---|
load_avg_1m | >1 for 10 min |
router.service.median | >50ms for 10 min |
router.service.perc95 | >500ms for 10 min |
router.status.5xx | >10 for 10 min |
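
Each Librato threshold above only fires when the condition holds for the full 10 minutes. A minimal sketch of that "sustained breach" check, assuming one sample per minute (the function name `breached_for` and the sample readings are hypothetical, and Librato's own alert evaluation may differ in detail):

```python
from typing import Sequence


def breached_for(samples: Sequence[float], threshold: float, window: int) -> bool:
    """Return True if the last `window` samples all exceed `threshold`."""
    if window <= 0 or len(samples) < window:
        # Not enough data to say the breach was sustained.
        return False
    return all(value > threshold for value in samples[-window:])


# Hypothetical one-per-minute readings of router.status.5xx.
readings = [3, 12, 14, 11, 15, 13, 12, 18, 11, 16, 12]
print(breached_for(readings, threshold=10, window=10))
```

A single reading at or below the threshold inside the window resets the condition, which is what keeps short spikes from paging anyone.
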
New Relic Monitoring Settings
Application Alert Policies
Variable | Value |
---|---|
Apdex T (threshold) | 0.35 seconds (1.7 seconds in staging) |
Alert policy | <0.94 Apdex, >1% error rate for 10 minutes |
Downtime | When unresponsive for 5 minutes |
Changing the Apdex threshold: https://docs.newrelic.com/docs/apm/new-relic-apm/apdex/changing-your-apdex-settings
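
To relate the 0.35-second threshold to the 0.94 alert level, the standard Apdex formula can be sketched as follows. Requests at or under T count as satisfied, requests between T and 4T count as tolerating (half weight), and the rest are frustrated. The function name and sample timings are hypothetical, and this sketch ignores how New Relic treats errored requests:

```python
def apdex(response_times, t):
    """Compute an Apdex score in [0, 1] for response times given in seconds."""
    total = len(response_times)
    if total == 0:
        raise ValueError("need at least one response time")
    satisfied = sum(1 for rt in response_times if rt <= t)
    tolerating = sum(1 for rt in response_times if t < rt <= 4 * t)
    return (satisfied + tolerating / 2) / total


# Hypothetical sample: 3 satisfied, 2 tolerating, 1 frustrated at T = 0.35 s.
times = [0.1, 0.2, 0.3, 0.5, 1.0, 2.0]
score = apdex(times, t=0.35)
print(f"{score:.2f}", "ALERT" if score < 0.94 else "ok")
```
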
DynamoDB Monitoring
These thresholds are configured as CloudWatch alarms.
Variable | Value |
---|---|
Read throughput | 0.8 (80%) for 5 or more minutes |
Write throughput | 0.8 (80%) for 5 or more minutes |
Throttled reads | 50 or more in 5 minutes |
Throttled writes | 50 or more in 5 minutes |
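
As a rough illustration of the 80% throughput rule, the sketch below builds the parameters for one CloudWatch alarm. It assumes the table uses provisioned capacity and that the consumed-capacity metric is summed over a 5-minute period, so the threshold is 0.8 × provisioned units × 300 seconds. The helper name, table name, alarm naming scheme, and capacity figure are hypothetical:

```python
def throughput_alarm_params(table, metric, provisioned_units,
                            ratio=0.8, period_seconds=300):
    """Build CloudWatch alarm parameters for sustained high throughput."""
    return {
        "AlarmName": f"{table}-{metric}-high",  # hypothetical naming scheme
        "Namespace": "AWS/DynamoDB",
        "MetricName": metric,
        "Dimensions": [{"Name": "TableName", "Value": table}],
        "Statistic": "Sum",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        # Consumed units are summed over the period, so scale the
        # per-second provisioned capacity by the period length.
        "Threshold": ratio * provisioned_units * period_seconds,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }


params = throughput_alarm_params(
    "example-table", "ConsumedReadCapacityUnits", provisioned_units=100
)
print(params["Threshold"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`. The same helper works for writes with `ConsumedWriteCapacityUnits`.
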
Rediscloud
Variable | Value |
---|---|
Data store usage | >80% |
Logentries / Papertrail
Variable | Value |
---|---|
Papertrail 503 errors | 10/hr |
Logentries errors | 10/hr |
Logentries warnings | 100/hr |