Performance of Single File Download
Currently, each file in Synapse must be downloaded individually. This works well for many use cases, but there are cases where it leads to performance issues. In order to download a file, a client must first request a pre-signed URL for the file. The file can then be downloaded using an HTTPS GET on the returned pre-signed URL. For large files, the time spent on the actual download far exceeds the time spent getting the URL. For a small file, the request for the pre-signed URL can take as long as the actual file download. This can be a significant bottleneck in some use cases. For example, a user may need to download all of the files from a table entity with one or more file columns. If all of the files are small, then requesting each file's pre-signed URL becomes the bottleneck.
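The trade-off between the URL request and the transfer itself can be made concrete with a rough cost model. The sketch below is illustrative only: the 200 ms URL-request latency and the 10 MB/s bandwidth are assumed values, not measurements of Synapse, and the function name is hypothetical.

```python
def url_overhead_fraction(file_size_bytes,
                          url_request_latency_s=0.2,          # assumed round trip for the pre-signed URL request
                          bandwidth_bytes_per_s=10_000_000):  # assumed download bandwidth (10 MB/s)
    """Fraction of the total per-file time spent requesting the pre-signed URL."""
    transfer_time_s = file_size_bytes / bandwidth_bytes_per_s
    total_time_s = url_request_latency_s + transfer_time_s
    return url_request_latency_s / total_time_s


# A 1 GB file: the URL request is a negligible share of the total time.
large = url_overhead_fraction(1_000_000_000)

# A 10 KB file: nearly all of the time goes to requesting the URL.
small = url_overhead_fraction(10_000)

print(f"large file overhead: {large:.1%}, small file overhead: {small:.1%}")
```

Under these assumptions the overhead is a fixed cost per file, so it grows linearly with the number of files regardless of how small they are.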
Download via Associated Object
There is also a secondary issue with the current implementation of file download in Synapse. Since files do not have any type of Access Control List (ACL) or other mechanism for controlling download permission, only the creator of the file handle can directly download the file using the file handle id. In order to make a file available for download to other users, the file's creator must first associate the file handle with an object that does have an ACL, such as a FileEntity, TableEntity, or WikiPage. Users are then expected to download the file via the associated object: a pre-signed URL can be requested for an associated file using a service call through the owning object. The following example illustrates how this currently works:
- User A uploads a file and creates a FileHandle with an id=1.
- User A then creates a FileEntity with an entityId=123 using the FileHandle.id=1.
- If user B has the download permission on entityId=123 then they can request a pre-signed URL for that file using: GET /entity/123/file
- The resulting pre-signed URL can then be used to download FileHandle.id=1.
If user B were to attempt to directly access the same file using GET /file/1 then Synapse would return an unauthorized result (403). The reason for this is that Synapse does not currently have a generic system for tracking which files have been associated with which object. Instead, when user B calls GET /entity/123/file, Synapse first verifies that user B has access to entityId=123 and then looks up the current file handle id associated with that entity in order to return a pre-signed URL for the associated file handle. This means we need to add special service calls for each type of file handle association.
The API would be simpler to use if clients could simply get pre-signed URLs using only the file handle ID. This would allow all clients to handle file downloads generically.
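The access rules described in this section can be modeled in a few lines. The following is a minimal in-memory sketch, not Synapse's implementation: the class names, the `ForbiddenError` exception (standing in for a 403 response), the `presign` helper, and the placeholder URL are all hypothetical and exist only to illustrate the two code paths.

```python
from dataclasses import dataclass, field


class ForbiddenError(Exception):
    """Stands in for an HTTP 403 response (hypothetical)."""


@dataclass
class FileHandle:
    id: int
    created_by: str


@dataclass
class FileEntity:
    entity_id: int
    file_handle_id: int
    download_acl: set = field(default_factory=set)  # users with download permission


# State from the example: user A created FileHandle 1 and FileEntity 123.
FILE_HANDLES = {1: FileHandle(id=1, created_by="userA")}
ENTITIES = {123: FileEntity(entity_id=123, file_handle_id=1,
                            download_acl={"userA", "userB"})}


def presign(file_handle_id):
    # Placeholder only; a real pre-signed URL would carry a signature and expiry.
    return f"https://example.invalid/presigned/{file_handle_id}"


def get_file_url(user, file_handle_id):
    """Models GET /file/{id}: file handles have no ACL, so only the creator may download."""
    handle = FILE_HANDLES[file_handle_id]
    if handle.created_by != user:
        raise ForbiddenError(f"{user} did not create file handle {file_handle_id}")
    return presign(handle.id)


def get_entity_file_url(user, entity_id):
    """Models GET /entity/{id}/file: check the entity ACL, then look up its file handle."""
    entity = ENTITIES[entity_id]
    if user not in entity.download_acl:
        raise ForbiddenError(f"{user} cannot download entity {entity_id}")
    return presign(entity.file_handle_id)
```

In this model, `get_entity_file_url("userB", 123)` succeeds while `get_file_url("userB", 1)` raises `ForbiddenError`, even though both calls refer to the same underlying file. Each additional object type (TableEntity, WikiPage) would need its own variant of `get_entity_file_url`, which is the duplication the section describes.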