Monitoring policies

Librato Monitoring Settings

| Variable | Value |
| --- | --- |
| load_avg_1m | >1 for 10 min |
| router.service.median | >50ms for 10 min |
| router.service.perc95 | >500ms for 10 min |
| router.status.5xx | >10 for 10 min |
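
Each Librato threshold above only fires when the value stays over the limit for the full 10 minutes. The sketch below shows one way to read that rule. It assumes samples arrive as `(timestamp_seconds, value)` pairs in time order. The function name `sustained_breach` is hypothetical, and Librato's own evaluation may differ in detail.

```python
def sustained_breach(samples, threshold, duration_s=600):
    """Return True if values stay above `threshold` for at least `duration_s`.

    `samples` is an iterable of (timestamp_seconds, value) pairs in
    ascending time order. This is an illustrative reading of
    "> threshold for 10 min", not Librato's actual algorithm.
    """
    run_start = None  # timestamp at which the current breach run began
    for ts, value in samples:
        if value > threshold:
            if run_start is None:
                run_start = ts
            if ts - run_start >= duration_s:
                return True
        else:
            # Any sample at or below the threshold resets the run.
            run_start = None
    return False


# Example: load_avg_1m sampled once a minute, always at 2.0.
load_samples = [(t, 2.0) for t in range(0, 601, 60)]
print(sustained_breach(load_samples, threshold=1.0))
```

A single sample at or below the threshold resets the window, so brief spikes do not trigger the alert.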

New Relic Monitoring Settings

Application Alert Policies

| Setting | Value |
| --- | --- |
| Apdex T threshold | 0.35 seconds (1.7 seconds in staging) |
| Alert policy | Apdex <0.94, error rate >1% for 10 minutes |
| Downtime | When unresponsive for 5 minutes |

Setting the Apdex T threshold: https://docs.newrelic.com/docs/apm/new-relic-apm/apdex/changing-your-apdex-settings
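
To show how the 0.35-second value relates to the 0.94 alert level, here is a minimal sketch of the standard Apdex formula: requests at or under T count as satisfied, requests up to 4T count as tolerating (half weight), and the rest count as frustrated. The function name `apdex` and the sample response times are illustrative assumptions. New Relic also counts errored requests as frustrated, which this sketch ignores.

```python
def apdex(response_times_s, t=0.35):
    """Compute an Apdex score in [0, 1] from response times in seconds.

    t is the Apdex T threshold (0.35 s in production, 1.7 s in staging
    per the settings above).
    """
    if not response_times_s:
        raise ValueError("need at least one response time")
    satisfied = sum(1 for r in response_times_s if r <= t)
    tolerating = sum(1 for r in response_times_s if t < r <= 4 * t)
    # Frustrated requests (> 4T) contribute nothing to the numerator.
    return (satisfied + tolerating / 2) / len(response_times_s)


# Hypothetical sample: 3 satisfied, 1 tolerating, 1 frustrated.
score = apdex([0.1, 0.2, 0.3, 1.0, 2.0])
print(f"apdex={score:.2f}, below alert level: {score < 0.94}")
```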

DynamoDB Monitoring

CloudWatch alarms.

| Variable | Value |
| --- | --- |
| Read throughput | 0.8 (80%) for 5 or more minutes |
| Write throughput | 0.8 (80%) for 5 or more minutes |
| Throttled reads | 50 or more in 5 minutes |
| Throttled writes | 50 or more in 5 minutes |
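
The 0.8 throughput figure is a ratio of consumed to provisioned capacity. CloudWatch reports consumed capacity as a sum over the period, so it has to be divided by the period length before comparing it with the per-second provisioned value. The sketch below works through that arithmetic. The function names and the example numbers are hypothetical, and treating the limit as "greater than or equal to" is an assumption.

```python
def capacity_utilization(consumed_sum, period_s, provisioned_units):
    """Fraction of provisioned capacity used over one CloudWatch period.

    consumed_sum:      Sum of consumed capacity units over the period
    period_s:          period length in seconds (300 for 5 minutes)
    provisioned_units: provisioned capacity units per second
    """
    if period_s <= 0 or provisioned_units <= 0:
        raise ValueError("period_s and provisioned_units must be positive")
    per_second = consumed_sum / period_s
    return per_second / provisioned_units


def breaches_throughput_limit(consumed_sum, period_s, provisioned_units,
                              limit=0.8):
    # Assumption: the alarm fires at or above the limit.
    return capacity_utilization(consumed_sum, period_s,
                                provisioned_units) >= limit


# Hypothetical table with 100 provisioned read units:
# 24,000 units consumed in 5 minutes is 80 units/s, i.e. 80% utilization.
print(capacity_utilization(24000, 300, 100))
```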

Rediscloud

| Variable | Value |
| --- | --- |
| Data store usage | >80% |

Logentries / Papertrail

| Variable | Value |
| --- | --- |
| Papertrail 503 errors | 10/hr |
| Logentries errors | 10/hr |
| Logentries warnings | 100/hr |
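
The per-hour limits above can be read as a count over a trailing 60-minute window. The following sketch keeps event timestamps in a deque and drops those older than the window. The class name `HourlyRateAlert` is hypothetical, and firing at "limit or more" events is an assumption, since the table does not say whether the limit is inclusive. Papertrail and Logentries implement their own alerting, so this only illustrates the rule.

```python
from collections import deque


class HourlyRateAlert:
    """Track matching log events and flag when the trailing-hour count
    reaches `limit` (assumed inclusive)."""

    def __init__(self, limit, window_s=3600):
        self.limit = limit
        self.window_s = window_s
        self._events = deque()  # timestamps (seconds) of recent events

    def record(self, ts):
        """Record an event at time `ts`; return True if the alert fires."""
        self._events.append(ts)
        # Drop events that have fallen out of the trailing window.
        while self._events and self._events[0] <= ts - self.window_s:
            self._events.popleft()
        return len(self._events) >= self.limit


# Example: Logentries errors, limit of 10 per hour, one error per minute.
errors = HourlyRateAlert(limit=10)
fired = [errors.record(minute * 60) for minute in range(10)]
print(fired[-1])
```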