Introduction
Auditing data for the Synapse REST API is captured by a Spring interceptor, AccessInterceptor, which works much like a web filter. The interceptor is configured to listen to all web service calls made to the repository services. For each call, the AccessInterceptor gathers the data needed to fill out an AccessRecord model object. The AccessRecord data is then written as gzipped CSV files directly to an S3 bucket. These CSV files are initially too small to process efficiently, so a worker process merges them by hour.
AccessRecord S3 Files
All AccessRecord CSV data for a single hour, from all EC2 instances of a stack, is merged into a single file. The following is an example of the resulting path:
https://s3.amazonaws.com/prod.access.record.sagebase.org/000000013/2013-09-17/14-39-03-055-75ed6416-438d-4688-8789-3df56f4e4670.csv.gz
The above path is composed of the following parts:
https://s3.amazonaws.com/prod.access.record.sagebase.org/<instance_number>/<year_month_day>/<hour_minutes_seconds_milliseconds>-<UUID>.csv.gz
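The path layout above can be split back into its parts with a regular expression. The following is a minimal Python sketch, not part of the Synapse code base: the names `PATH_PATTERN`, `AccessRecordKey` and `parse_access_record_path` are hypothetical, and the time zone of the path timestamp is not stated here, so the parsed value is left as a naive datetime.

```python
import re
from datetime import datetime
from typing import NamedTuple

# Matches the tail of the path:
# /<instance_number>/<year_month_day>/<hour_minutes_seconds_milliseconds>-<UUID>.csv.gz
PATH_PATTERN = re.compile(
    r"/(?P<instance>\d+)"
    r"/(?P<date>\d{4}-\d{2}-\d{2})"
    r"/(?P<time>\d{2}-\d{2}-\d{2}-\d{3})"
    r"-(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
    r"\.csv\.gz$"
)


class AccessRecordKey(NamedTuple):
    """Hypothetical container for the parts of an access record path."""
    instance: int       # stack instance number, e.g. 13
    created: datetime   # naive datetime; the time zone is an open assumption
    uuid: str           # unique suffix of the file name


def parse_access_record_path(path: str) -> AccessRecordKey:
    """Split an access record path or URL into instance, timestamp and UUID."""
    match = PATH_PATTERN.search(path)
    if match is None:
        raise ValueError(f"not an access record path: {path!r}")
    # %f accepts the three millisecond digits and scales them to microseconds.
    created = datetime.strptime(
        f"{match['date']} {match['time']}", "%Y-%m-%d %H-%M-%S-%f"
    )
    return AccessRecordKey(int(match["instance"]), created, match["uuid"])
```

Applied to the example path, this yields instance 13, the timestamp 2013-09-17 14:39:03.055, and the UUID 75ed6416-438d-4688-8789-3df56f4e4670.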
Here is some sample data from one of the access record files:
| returnObjectId | elapseMS | timestamp | via | host | threadId | userAgent | queryString | sessionId | xForwardedFor | requestURL | userId | origin | date | method | vmId | instance | stack | success |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 43 | 1379430033942 | | repo-prod-13.prod.sagebase.org | 659 | Synpase-Java-Client/2013-09-13-e70558e-662 Synapse-Web-Client/13.0 | | b4331f55-6c65-4f6b-a2c5-ee6830cf7641 | | /repo/v1/entity/header | 273978 | | 2013-09-17 | POST | eca4eb39c13ac98c:7461944:1412973c3ee:-7ffd | 13 | prod | true |
| | 30 | 1379430033943 | | repo-prod-13.prod.sagebase.org | 656 | Jakarta Commons-HttpClient/3.1 | query=select+id,name,nodeType+from+entity+where+parentId+==+%22syn2228808%22+limit+500+offset+1 | b1b9c385-4dba-4e3a-b49c-0f40f7c99ac5 | | /repo/v1/query | 273978 | | 2013-09-17 | GET | eca4eb39c13ac98c:7461944:1412973c3ee:-7ffd | 13 | prod | true |
| | 14 | 1379430034027 | | repo-prod-13.prod.sagebase.org | 1177 | Synpase-Java-Client/2013-09-13-e70558e-662 Synapse-Web-Client/13.0 | mask=64 | 597767ef-8ff2-40d0-a65d-b519f5b2f937 | | /repo/v1/entity/syn2228808/bundle | 273978 | | 2013-09-17 | GET | 9b5a47b65e8703f0:229cd7a3:1412973c18a:-7ffd | 13 | prod | true |
| syn2228807 | 35 | 1379430034057 | | repo-prod-13.prod.sagebase.org | 159 | Synpase-Java-Client/2013-09-13-e70558e-662 Synapse-Web-Client/13.0 | | e9b15054-dbc6-454b-a1bf-8bef3d5f0fbc | | /repo/v1/entity/syn2228808/benefactor | 273978 | | 2013-09-17 | GET | 9b5a47b65e8703f0:229cd7a3:1412973c18a:-7ffd | 13 | prod | true |
| syn2228807 | 19 | 1379430034107 | | repo-prod-13.prod.sagebase.org | 153 | Synpase-Java-Client/2013-09-13-e70558e-662 Synapse-Web-Client/13.0 | | 23216ee3-dade-43ac-8efe-fa1e6dc9877d | | /repo/v1/entity/syn2228807/acl | 273978 | | 2013-09-17 | GET | 9b5a47b65e8703f0:229cd7a3:1412973c18a:-7ffd | 13 | prod | true |
| 59638 | 39 | 1379430034123 | | repo-prod-13.prod.sagebase.org | 656 | Synpase-Java-Client/2013-09-13-e70558e-662 Synapse-Web-Client/13.0 | | d7ed19dd-2ed9-47d1-b345-be1aaca0d688 | | /repo/v1/entity/syn2228808/wiki | 273978 | | 2013-09-17 | GET | eca4eb39c13ac98c:7461944:1412973c3ee:-7ffd | 13 | prod | true |
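To work with such a file programmatically, the gzipped CSV can be streamed row by row. The sketch below is illustrative only and rests on assumptions not confirmed here: that each file is comma-separated, UTF-8 encoded, and begins with a header row using the column names shown above. The helper names `epoch_ms_to_utc` and `read_access_records` are hypothetical.

```python
import csv
import gzip
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_ms_to_utc(timestamp_ms: int) -> datetime:
    """Convert milliseconds since 1970-01-01 to an aware UTC datetime."""
    # Integer timedelta arithmetic avoids float rounding of the milliseconds.
    return EPOCH + timedelta(milliseconds=timestamp_ms)


def read_access_records(source):
    """Yield one dict per access record from a .csv.gz file.

    `source` is a file name or a binary file object. Assumes a header row
    naming the columns (an assumption; adjust if the files have none).
    """
    with gzip.open(source, mode="rt", newline="", encoding="utf-8") as text:
        for row in csv.DictReader(text):
            # Convert the columns whose types are evident from the sample.
            row["elapseMS"] = int(row["elapseMS"])
            row["timestamp"] = epoch_ms_to_utc(int(row["timestamp"]))
            row["success"] = row["success"] == "true"
            yield row
```

For example, `epoch_ms_to_utc(1379430033942)` gives 2013-09-17 15:00:33.942 UTC, which agrees with the date column of the first sample row.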
Column Description
- returnObjectId - For any method that returns an object with an ID, this column contains the returned ID. This is the only way to determine the ID of a newly created object from a POST.
- elapseMS - The elapsed time of the call in milliseconds.
- timestamp - The exact time the call was made in epoch time (milliseconds since 1/1/1970).
- via - The value of the "via" header (see: http://en.wikipedia.org/wiki/List_of_HTTP_header_fields)
- host - The value of the "host" header (see: http://en.wikipedia.org/wiki/List_of_HTTP_header_fields)
- threadId - The ID of the thread used to process the request.
- userAgent - The value of the "User-Agent" header (see: http://en.wikipedia.org/wiki/List_of_HTTP_header_fields)
- queryString - The value of the "queryString"