This is a sub-task of BRIDGE-2660.
Overview
We need to store non-JSON files that are accessible only to the participant who submitted them. One example use case is skin plaque photos from the Psorcast app.
This is similar to the File Service API in that both allow you to submit and retrieve blob data to Bridge, but this new API is scoped to a participant.
This is similar to Participant Reports (and is largely intended to supplement the Participant Reports), but is different in two key ways: First, it’s blob data instead of JSON. Second, it’s not keyed to a date or timestamp.
This API will use S3 as the backing store and provide an API to upload and download the files. (S3 Presigned URLs will be used for the actual upload and download.) DynamoDB will be used to index the files, with the participant healthcode as the hash key and the file ID as the range key.
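To make the proposed index layout concrete, here is a minimal in-memory sketch of a table keyed by health code (hash key) and file ID (range key). It is not the Bridge implementation: the class names, the `mime_type` field, and the `healthCode/fileId` S3 key layout are all assumptions made for illustration, and a plain dictionary stands in for DynamoDB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParticipantFile:
    health_code: str  # hash key: scopes the file to one participant
    file_id: str      # range key: identifier chosen by the app
    mime_type: str = "application/octet-stream"  # hypothetical metadata field


class InMemoryFileIndex:
    """Stand-in for the DynamoDB table: one partition per health code."""

    def __init__(self):
        self._partitions = {}

    def put(self, item):
        # Writing the same (health_code, file_id) again overwrites the entry.
        self._partitions.setdefault(item.health_code, {})[item.file_id] = item

    def get(self, health_code, file_id):
        return self._partitions.get(health_code, {}).get(file_id)

    def query(self, health_code):
        # Like a hash-key query: only this participant's items, ordered by range key.
        partition = self._partitions.get(health_code, {})
        return [partition[file_id] for file_id in sorted(partition)]


def s3_key(item):
    # Assumed object key layout; the document does not specify one.
    return f"{item.health_code}/{item.file_id}"
```

Because every lookup starts from the health code, one participant's query cannot return another participant's files, which is the scoping property described above.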
Other Considerations
This data may contain PHI. We’ll need to secure this endpoint accordingly.
We probably don’t need versioning for these files.
Do we need separate create and update APIs for participant files? App developers specifically asked for the ability to specify their own identifiers, and we don’t do versioning, so both APIs would do the same thing (i.e. write the file to the index and generate the pre-signed URL).
Since apps can upload arbitrary files, do we need a virus scanner for these files?
Do we need a researcher API to view participant files?
We may want a worker API to write these files, to enable Bridge Health Data Post-Processing to write these files.
What kind of indexing do we need on this API? Do we need to view all files for a participant? For a study? For an identifier? If so, consider using SQL to index. (However, if we only have specific indexing concerns, we may want to stick with DynamoDB, as that’s significantly easier to develop against for simple lookups.)
One possible optimization is to use a streaming API to stream the files through Bridge to/from S3. However, this has its own set of pros and cons.
Unlike Health Data, we don’t need an “upload complete” API, since we don’t do processing on Participant Files.
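The question above about separate create and update APIs can be illustrated with a small sketch. With client-chosen identifiers and no versioning, a single "save" operation covers both cases. Everything here is hypothetical: `save_file`, the dictionary index, and `fake_presign` (which stands in for real S3 pre-signed URL generation) are illustrative names, not Bridge code.

```python
def save_file(index, presign_upload, health_code, file_id, metadata):
    """Handle both create and update: upsert the index entry, return an upload URL."""
    # No versioning, so a second write with the same key replaces the first.
    index[(health_code, file_id)] = dict(metadata)
    return presign_upload(health_code, file_id)


def fake_presign(health_code, file_id):
    # Placeholder for S3 pre-signing; a real URL would carry a signature and expiry.
    return f"https://example.invalid/upload/{health_code}/{file_id}"
```

Calling `save_file` twice with the same file ID leaves one index entry holding the latest metadata, which is why a separate update endpoint would add nothing under these assumptions.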
REST API
Method | URL | Description | Permissions |
---|---|---|---|
GET | /v3/participants/self/files?start=<start>&offset=<offset> | Paginated API to return a list of file metadata. (Does not create pre-signed URLs.) | participant |
GET | /v3/participants/self/files/<file ID> | Returns the S3 pre-signed URL for download. | participant |
POST | /v3/participants/self/files/<file ID> | Writes the file to the index and returns the S3 pre-signed URL for upload. | participant |
DELETE | /v3/participants/self/files/<file ID> | Physically deletes the file in both the index and in S3. | participant |
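Since apps choose their own file IDs, a client needs to percent-encode the ID before placing it in the URL path. The helpers below are a sketch of how a client might build the paths in the table. The function names are hypothetical, and `start` and `offset` are passed through as opaque values because their exact semantics are not defined here.

```python
from urllib.parse import quote, urlencode

BASE_PATH = "/v3/participants/self/files"


def list_path(start, offset):
    # Path for the paginated metadata listing.
    return f"{BASE_PATH}?{urlencode({'start': start, 'offset': offset})}"


def file_path(file_id):
    # safe="" also encodes "/" so the ID stays within a single path segment.
    return f"{BASE_PATH}/{quote(file_id, safe='')}"
```

The same `file_path` result would be used with GET, POST, and DELETE. The upload or download itself then goes to the pre-signed S3 URL returned by the API rather than to Bridge.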