
Logging and Monitoring of the Service Endpoints

Heroku logs

Heroku logs are a stream of events that Logplex collates and routes.  There are three types: API logs, system logs, and app logs.

API logs track actions performed on the stack through the Heroku API.  Example:

Sep 02 14:59:03 bridge-prod heroku/api: Deploy f158691 by xyz@abc.org

System logs record events from the Heroku stack itself.  Of particular interest is the router log:

Sep 03 14:24:20 bridge-prod heroku/router: at=info method=GET path="/?study=api" host=webservices.sagebridge.org
 request_id=4f839190-0943-8a41-9081-5b9faa862f31 fwd="173.194.33.105" dyno=web.5 connect=4ms service=7ms status=200 bytes=509
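
The router line is a flat list of key=value pairs, so it can be split into fields with a small regular expression. The following is a minimal sketch in Python, not part of Bridge itself; the function name parse_router_line is hypothetical, and the sample line is the router example above joined onto one line.

```python
import re

# Matches key=value pairs where the value is double-quoted or a bare token.
PAIR = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_router_line(line):
    """Return the key=value fields of a heroku/router log line as a dict."""
    # Everything after "heroku/router:" is the key=value payload.
    _, _, payload = line.partition("heroku/router:")
    return {key: value.strip('"') for key, value in PAIR.findall(payload)}


# The router example above, joined onto a single line.
line = ('Sep 03 14:24:20 bridge-prod heroku/router: at=info method=GET '
        'path="/?study=api" host=webservices.sagebridge.org '
        'request_id=4f839190-0943-8a41-9081-5b9faa862f31 fwd="173.194.33.105" '
        'dyno=web.5 connect=4ms service=7ms status=200 bytes=509')

fields = parse_router_line(line)
print(fields["request_id"], fields["status"], fields["service"])
```

All values come back as strings, so a caller that wants numeric comparisons would still need to convert fields such as status or bytes.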

App logs record events from the Bridge app.  Besides the free text written by the app logger, Bridge emits structured metrics in JSON format:

Sep 03 14:44:04.284 bridge-prod app/web.6: 2015-09-03 14:44:03,859 INFO [application-akka.actor.default-dispatcher-1727]
 org.sagebionetworks.bridge.play.interceptors.MetricsInterceptor -
{
  "version": 1,
  "start": "2015-09-03T21:44:02.488Z",
  "request_id": "3460b989-ba07-4333-ab38-dc687db1469c",
  "method": "POST",
  "uri": "\/v3\/auth\/signIn",
  "protocol": "HTTP\/1.1",
  "remote_address": "74.23.180.29",
  "user_agent": "Integration Tests (Linux\/3.13.0-36-generic) BridgeJavaSDK\/3",
  "status": 200,
  "end": "2015-09-03T21:44:03.992Z"
}
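
Because the metrics are plain JSON appended to the logger prefix, they can be recovered by decoding from the first opening brace. The sketch below assumes the prefix never contains a brace, which holds for the example above but is not guaranteed in general; the helper names parse_metrics and duration_ms are hypothetical.

```python
import json
from datetime import datetime, timedelta


def parse_metrics(entry):
    """Extract the JSON metrics object that trails an app log entry."""
    start = entry.index("{")  # assumes the logger prefix contains no brace
    metrics, _ = json.JSONDecoder().raw_decode(entry, start)
    return metrics


def duration_ms(metrics):
    """Whole milliseconds between the 'start' and 'end' timestamps."""
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ"
    start = datetime.strptime(metrics["start"], fmt)
    end = datetime.strptime(metrics["end"], fmt)
    return (end - start) // timedelta(milliseconds=1)


# The app log example above; a raw string keeps the JSON "\/" escapes intact.
entry = r'''Sep 03 14:44:04.284 bridge-prod app/web.6: 2015-09-03 14:44:03,859 INFO [application-akka.actor.default-dispatcher-1727]
 org.sagebionetworks.bridge.play.interceptors.MetricsInterceptor -
{
  "version": 1,
  "start": "2015-09-03T21:44:02.488Z",
  "request_id": "3460b989-ba07-4333-ab38-dc687db1469c",
  "method": "POST",
  "uri": "\/v3\/auth\/signIn",
  "protocol": "HTTP\/1.1",
  "remote_address": "74.23.180.29",
  "user_agent": "Integration Tests (Linux\/3.13.0-36-generic) BridgeJavaSDK\/3",
  "status": 200,
  "end": "2015-09-03T21:44:03.992Z"
}'''

metrics = parse_metrics(entry)
print(metrics["method"], metrics["uri"], metrics["status"], duration_ms(metrics))
```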

 

Notice that the request ID is recorded across different types of logs.  It can be used to trace the events of a particular request.
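
One way to do this kind of tracing is to group log entries by the request ID they carry. The sketch below is illustrative only: the function name group_by_request_id is hypothetical, the sample entries are shortened, and the app entry is made up to reuse the router's request ID so that the two entries match.

```python
import re
from collections import defaultdict

# Accepts both the router form (request_id=<uuid>) and the
# JSON form ("request_id": "<uuid>").
REQUEST_ID = re.compile(
    r'request_id"?\s*[=:]\s*"?'
    r'([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})'
)


def group_by_request_id(entries):
    """Map each request ID to the log entries that mention it."""
    grouped = defaultdict(list)
    for entry in entries:
        match = REQUEST_ID.search(entry)
        if match:  # entries without a request ID are skipped
            grouped[match.group(1)].append(entry)
    return dict(grouped)


# Shortened, illustrative entries; the app entry is hypothetical.
rid = "4f839190-0943-8a41-9081-5b9faa862f31"
entries = [
    'heroku/router: at=info method=GET path="/?study=api" request_id=' + rid + ' status=200',
    'app/web.5: {"version": 1, "request_id": "' + rid + '", "status": 200}',
    'heroku/api: Deploy f158691 by xyz@abc.org',
]

grouped = group_by_request_id(entries)
for entry in grouped[rid]:
    print(entry)
```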

Consumers of the Heroku logs

Logentries

Heroku logs are drained to the Heroku add-on Logentries.  Logentries archives the logs permanently to S3 and alerts us on high error rates.  For details on alert setup, see the page Monitoring policies.

Librato

Librato monitors stack performance by analyzing the Heroku API logs and system logs.

TODO:  BRIDGE-783

TODO:  BRIDGE-784

(Note: Librato was introduced as a partial alternative to New Relic.  New Relic has since been disabled because it does not support Play 2.4.x.)

Redshift

Currently, a manual process is run regularly to export the structured metrics to Redshift.  There the logs can be joined with other tables to answer business questions, perform audits, and, to a lesser extent, monitor the system.

TODO:  BRIDGE-500

Logging and monitoring on AWS resources

We have CloudWatch alerts set up on DynamoDB metrics.  For details, see the page Monitoring policies.

Access logging is enabled for the S3 buckets that store the upload data.  Currently the logs are captured but not consumed.

TODO:  BRIDGE-785

Another potentially useful service is CloudTrail, which captures access to the AWS resources within the account.

TODO:  BRIDGE-371