Logging and Monitoring of the Service Endpoints
Heroku logs
Heroku logs are a stream of events collated and routed by Logplex. There are three types: API logs, system logs, and app logs.
API logs track administrative actions taken on the stack via the Heroku API. Example:
Sep 02 14:59:03 bridge-prod heroku/api: Deploy f158691 by xyz@abc.org
System logs record events of the Heroku platform itself. Of particular interest is the router log:
Sep 03 14:24:20 bridge-prod heroku/router: at=info method=GET path="/?study=api" host=webservices.sagebridge.org request_id=4f839190-0943-8a41-9081-5b9faa862f31 fwd="173.194.33.105" dyno=web.5 connect=4ms service=7ms status=200 bytes=509
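Router log lines are a flat sequence of key=value pairs, with quoted values for fields that may contain spaces. As a minimal sketch (this parser is illustrative, not part of Bridge or Heroku tooling), the fields can be pulled into a dictionary like so:

```python
import re

# Hypothetical helper: parse the key=value pairs of a Heroku router
# log line into a dict. Quoted values take precedence so that an '='
# inside quotes (e.g. path="/?study=api") is not split.
def parse_router_line(line):
    fields = {}
    for key, quoted, bare in re.findall(r'(\w+)=(?:"([^"]*)"|(\S+))', line):
        fields[key] = quoted if quoted else bare
    return fields

line = ('at=info method=GET path="/?study=api" host=webservices.sagebridge.org '
        'request_id=4f839190-0943-8a41-9081-5b9faa862f31 fwd="173.194.33.105" '
        'dyno=web.5 connect=4ms service=7ms status=200 bytes=509')
fields = parse_router_line(line)
# fields["status"] == "200", fields["path"] == "/?study=api"
```

Note that the timing fields (`connect`, `service`) keep their `ms` suffix and would need to be stripped before numeric analysis.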
App logs record events of the Bridge app itself. Besides free-text output from the app logger, Bridge emits structured metrics in JSON format:
Sep 03 14:44:04.284 bridge-prod app/web.6: 2015-09-03 14:44:03,859 INFO [application-akka.actor.default-dispatcher-1727] org.sagebionetworks.bridge.play.interceptors.MetricsInterceptor - { "version": 1, "start": "2015-09-03T21:44:02.488Z", "request_id": "3460b989-ba07-4333-ab38-dc687db1469c", "method": "POST", "uri": "\/v3\/auth\/signIn", "protocol": "HTTP\/1.1", "remote_address": "74.23.180.29", "user_agent": "Integration Tests (Linux\/3.13.0-36-generic) BridgeJavaSDK\/3", "status": 200, "end": "2015-09-03T21:44:03.992Z" }
Notice that the request ID is recorded across the different types of logs. It can be used to trace the events of a particular request from the router through the app.
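As a sketch of such a trace (the helper below is illustrative only, not an existing tool), one can filter a mixed stream of log lines by request ID and decode the structured-metrics JSON payload where one is present:

```python
import json

# Hypothetical helper: collect all log lines mentioning a request ID.
# Where a line carries a structured-metrics JSON object (appended at
# the end by the metrics interceptor), decode it as well.
def trace_request(lines, request_id):
    events = []
    for line in lines:
        if request_id not in line:
            continue
        payload = None
        start = line.find('{')
        if start != -1:
            try:
                payload = json.loads(line[start:])
            except ValueError:
                pass  # free-text line that merely contains a brace
        events.append((line, payload))
    return events

rid = '3460b989-ba07-4333-ab38-dc687db1469c'
logs = [
    'Sep 03 14:24:20 bridge-prod heroku/router: at=info method=POST '
    'path="/v3/auth/signIn" request_id=' + rid + ' status=200',
    'Sep 03 14:44:04 bridge-prod app/web.6: ... MetricsInterceptor - '
    '{"request_id": "' + rid + '", "method": "POST", '
    '"uri": "/v3/auth/signIn", "status": 200}',
]
events = trace_request(logs, rid)
# two matching lines; only the app-log line yields a decoded payload
```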
Consumers of the Heroku logs
Logentries
Heroku logs are drained to the Heroku add-on Logentries, which archives the logs permanently to S3 and alerts us on high error rates. For details on alert setup, see the Monitoring policies page.
Librato
Librato monitors stack performance by analyzing the Heroku API logs and system logs.
TODO: BRIDGE-783
TODO: BRIDGE-784
(Note: Librato was introduced as a partial alternative to New Relic. New Relic has since been disabled because it does not support Play 2.4.x.)
Redshift
The structured metrics are currently exported to Redshift by a regularly run manual process. There the logs can be joined with other tables to answer business questions, perform audits, and, to a lesser extent, monitor the system.
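The export step amounts to flattening the JSON metric lines into a tabular format that Redshift can load. A minimal sketch follows; the column list is an assumption for illustration, not the actual Redshift schema:

```python
import csv
import io
import json

# Assumed column set for illustration only; the real table schema may differ.
COLUMNS = ["request_id", "start", "end", "method", "uri", "status", "remote_address"]

def metrics_to_csv(metric_lines):
    """Flatten JSON metric records into CSV rows for a Redshift COPY load."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(COLUMNS)
    for line in metric_lines:
        record = json.loads(line)
        # Missing keys become empty strings rather than failing the export.
        writer.writerow([record.get(col, "") for col in COLUMNS])
    return buf.getvalue()
```

The resulting file would then be uploaded to S3 and loaded with a `COPY ... CSV IGNOREHEADER 1` statement (table and bucket names being deployment-specific).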
TODO: BRIDGE-500
Logging and monitoring on AWS resources
We have CloudWatch alerts set up on DynamoDB metrics. For details, see the page Monitoring policies.
Access logging is enabled for the S3 buckets that store the upload data. Currently the logs are captured but not consumed.
TODO: BRIDGE-785
Another potentially useful service is CloudTrail, which captures access to the AWS resources within the account.
TODO: BRIDGE-371