FileHandle API

FileHandle 

A FileHandle is an object that represents a file that has either been uploaded to Synapse or resides external to Synapse.  The FileHandle provides basic metadata about the file:


| Field | Description |
| --- | --- |
| id | The unique identifier of a file handle.  This ID is used to reference a file handle. |
| etag | The etag of a file handle will change if the file handle changes.  For the most part FileHandles are immutable, with the only exception being assigning a preview FileHandle ID. |
| createdBy | The ID of the user that created this FileHandle.  Only the user that created a FileHandle can assign it to a file entity or wiki attachment. |
| createdOn | The date on which the file handle was created. |
| concreteType | FileHandle is an interface with at least three implementations.  This field is used to indicate which concrete implementation is used. |
| fileName | The name of the file. This field is required. |
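
As an illustration only, the common fields above could be read from a decoded JSON response roughly as follows. The class name `FileHandleInfo` and the function `parse_file_handle` are hypothetical, and the sketch assumes the JSON keys match the field names listed above; it is not the official client model.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class FileHandleInfo:
    """Hypothetical client-side view of the common FileHandle fields."""
    id: Optional[str]
    etag: Optional[str]
    created_by: Optional[str]
    created_on: Optional[str]  # kept as the raw string; the date format is not specified here
    concrete_type: Optional[str]
    file_name: str


def parse_file_handle(data: Dict[str, Any]) -> FileHandleInfo:
    """Build a FileHandleInfo from a decoded JSON object (keys assumed to match the field names)."""
    file_name = data.get("fileName")
    if not file_name:
        # fileName is the only field described as required.
        raise ValueError("fileName is required")
    return FileHandleInfo(
        id=data.get("id"),
        etag=data.get("etag"),
        created_by=data.get("createdBy"),
        created_on=data.get("createdOn"),
        concrete_type=data.get("concreteType"),
        file_name=file_name,
    )
```

The dataclass is frozen to mirror the note that FileHandles are, for the most part, immutable.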

There are currently three concrete implementations of FileHandle:

| Object name | concreteType |
| --- | --- |
| ExternalFileHandle | org.sagebionetworks.repo.model.file.ExternalFileHandle |
| S3FileHandle | org.sagebionetworks.repo.model.file.S3FileHandle |
| PreviewFileHandle | org.sagebionetworks.repo.model.file.PreviewFileHandle |

ExternalFileHandle

An external file handle is used to represent an external URL.  Note that ExternalFileHandle implements HasPreviewId.  Synapse will try to automatically generate a preview for any external URL that can be publicly read.  The resulting preview file will be stored in Synapse and represented with a PreviewFileHandle.  The creator of the ExternalFileHandle will be listed as the creator of the preview.

S3FileHandle

When a file is stored in Synapse, by default it is stored in Amazon's S3.  The S3FileHandle captures the extra information about the S3 file.  Just like ExternalFileHandles, Synapse will attempt to automatically create a preview of all S3FileHandles.

PreviewFileHandle

When Synapse creates a preview file for either an ExternalFileHandle or an S3FileHandle, the resulting preview file will be stored in S3 and be assigned a PreviewFileHandle.  Currently, Synapse will generate previews based on the original file's contentType. See Internet Media Type.
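
The relationship between the three concrete types can be sketched as a small dispatch on the concreteType field. The concreteType strings come from the list above; the helper names `short_type` and `synapse_attempts_preview` are hypothetical and exist only for this example.

```python
from typing import Any, Dict

PREFIX = "org.sagebionetworks.repo.model.file."
EXTERNAL = PREFIX + "ExternalFileHandle"
S3 = PREFIX + "S3FileHandle"
PREVIEW = PREFIX + "PreviewFileHandle"
KNOWN_TYPES = (EXTERNAL, S3, PREVIEW)


def short_type(handle: Dict[str, Any]) -> str:
    """Return the object name (e.g. 'S3FileHandle') for a decoded file handle."""
    concrete_type = handle.get("concreteType")
    if concrete_type not in KNOWN_TYPES:
        raise ValueError(f"unknown concreteType: {concrete_type!r}")
    return concrete_type.rsplit(".", 1)[-1]


def synapse_attempts_preview(handle: Dict[str, Any]) -> bool:
    """True for the two types Synapse tries to preview; a preview is not itself previewed."""
    return handle.get("concreteType") in (EXTERNAL, S3)
```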

HTTP Types

For any web service where a file is sent with a POST, the content-type must be 'multipart/form-data' (see: HTTP Multipart).  The content-type of all service responses will be 'application/json'.

Note: Unless otherwise specified, all FileHandle services use a new endpoint:  https://file-prod.sagebase.org/file/v1.  Also, the standard Synapse 'sessionToken' must be included in all requests.
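
A minimal sketch of how a client might prepare a request against this endpoint, using only the Python standard library. This page does not say how the session token is transmitted; the sketch assumes an HTTP header named sessionToken, and the helper name `build_request` is hypothetical. The request is only constructed here, not sent.

```python
import urllib.request
from typing import Optional

FILE_ENDPOINT = "https://file-prod.sagebase.org/file/v1"


def build_request(path: str, session_token: str, method: str = "GET",
                  body: Optional[bytes] = None) -> urllib.request.Request:
    """Prepare (but do not send) a request to the file services endpoint."""
    headers = {
        # Assumption: the token travels in a header called 'sessionToken'.
        "sessionToken": session_token,
        "Accept": "application/json",
    }
    if body is not None:
        # Assumption: non-file request bodies are JSON.
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        FILE_ENDPOINT + path, data=body, headers=headers, method=method
    )
```

For example, `build_request("/fileHandle/123", token)` prepares a GET for a hypothetical handle ID 123; passing the result to `urllib.request.urlopen` would actually send it.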

API

| URL | HTTP Type | Description |
| --- | --- | --- |
| /externalFileHandle | POST | Post an external FileHandle.  Note: The body of the request is an ExternalFileHandle object, wrapping the URL to be stored. |
| /fileHandle/{handleId} | GET | Get the FileHandle for a given FileHandle ID.  Only the original creator of the FileHandle is authorized to get a FileHandle or assign a FileHandle to a Synapse object such as a WikiPage attachment or FileEntity. |
| /fileHandle/{handleId} | DELETE | Delete a FileHandle by its ID.  This will also trigger deletion of the corresponding file in S3 (when relevant) and of any preview automatically generated for the FileHandle. |
| /fileHandle/{handleId} (Proposed) | PUT | Updates an existing file handle.  Only the owner of a file handle or an administrator may update a file handle.  The following fields may be modified by a file handle owner: fileName and contentType.  The following fields may be modified by an administrator only: bucketName (S3 FileHandles only).  If bucketName is modified, the underlying file will be copied to the new location.  The key may also be modified.  bucketName must be changed to an existing bucket that Synapse has access to. |
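
As a rough illustration of the /externalFileHandle call, the request body could be assembled as below. The concreteType value and the fileName field come from this page; the key name externalURL for the wrapped URL is an assumption, as is sending the object as JSON, and the helper name `external_file_handle_body` is hypothetical.

```python
import json

EXTERNAL_CONCRETE_TYPE = "org.sagebionetworks.repo.model.file.ExternalFileHandle"


def external_file_handle_body(url: str, file_name: str) -> bytes:
    """Serialize an ExternalFileHandle wrapping `url` for POST /externalFileHandle."""
    if not url:
        raise ValueError("url must not be empty")
    if not file_name:
        # fileName is described as a required field of every FileHandle.
        raise ValueError("fileName is required")
    payload = {
        "concreteType": EXTERNAL_CONCRETE_TYPE,
        "externalURL": url,  # assumed key name; not confirmed by this page
        "fileName": file_name,
    }
    return json.dumps(payload).encode("utf-8")
```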

Chunked File Upload API

While it is possible to upload very large files with a single HTTP request, it is not recommended.  If anything goes wrong, the only option is to start over from the beginning, and the longer a file upload takes, the less acceptable a restart will be to users.  To address this, Synapse provides 'chunked' file upload as the recommended method for uploading all files.  The client-side software divides larger files into chunks and sends each chunk separately.  The server then reassembles all of the chunks into a single file once the upload is complete.  Any file that is less than or equal to 5 MB should be uploaded as a single chunk.  All larger files should be divided into 5 MB chunks, each sent separately.  If any chunk fails, simply resend the failed chunk.  While this puts an extra burden on client-side developers, the result is more robust and responsive code.  The following table shows the four web-service calls used for chunked file upload.  For these calls the request and response objects are not the same, so both are shown:


| Step | Response (type) | URL | HTTP Type | Request (type) | Description |
| --- | --- | --- | --- | --- | --- |
| 1 | ChunkedFileToken (application/json) | /createChunkedFileUploadToken | POST | CreateChunkedFileTokenRequest (application/json) | Create a ChunkedFileToken.  This token must be provided in all subsequent requests. |
| 2 | URL (text/plain) | /createChunkedFileUploadChunkURL | POST | ChunkRequest (application/json) | Create a pre-signed URL that will be used to PUT a single file chunk.  This step is repeated for each chunk. |
| 3 | UploadDaemonStatus (application/json) | /startCompleteUploadDaemon | POST | CompleteAllChunksRequest (application/json) | After all of the chunks have been PUT to the pre-signed URLs, a daemon is started to reassemble the parts into a single large file.  The call starts the daemon and returns an UploadDaemonStatus object.  The caller must then poll for the daemon status using the next call (/completeUploadDaemonStatus) and wait for the daemon to transition from state=PROCESSING to state=COMPLETE, at which time the status will contain the newly created FileHandle ID. |
| 4 | UploadDaemonStatus (application/json) | /completeUploadDaemonStatus/{daemonId} | GET | none | Get the status of the daemon started in the previous call.  The client should poll this status until the state changes to either COMPLETE or FAILED.  Once the state changes to COMPLETE, the status object will contain the resulting FileHandle ID.  If the daemon fails (state=FAILED), the status object will contain an errorMessage that provides some information about what went wrong.  While the state is PROCESSING, the percentComplete field of the status reports the progress being made. |
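
The two client-side pieces of this flow, splitting a file into chunks and polling the daemon, can be sketched as follows. This is an illustration under stated assumptions rather than the official client: "5 MB" is read as 5 * 1024 * 1024 bytes, chunks are numbered from 1, the status JSON is assumed to use the keys state, errorMessage, and fileHandleId, and `get_status` is a hypothetical caller-supplied function that performs the GET on /completeUploadDaemonStatus/{daemonId} and returns the decoded status.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

CHUNK_SIZE = 5 * 1024 * 1024  # assumption: "5 MB" read as 5 MiB


def plan_chunks(file_size: int, chunk_size: int = CHUNK_SIZE) -> List[Tuple[int, int, int]]:
    """Return (chunk_number, offset, length) tuples covering the whole file."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if file_size <= chunk_size:
        # Files up to the chunk size are uploaded as a single chunk.
        return [(1, 0, file_size)]
    chunks = []
    offset, number = 0, 1
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        chunks.append((number, offset, length))
        offset += length
        number += 1
    return chunks


def wait_for_daemon(get_status: Callable[[], Dict[str, Any]],
                    poll_interval: float = 1.0,
                    max_polls: int = 600,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the daemon reports COMPLETE or FAILED; return the FileHandle ID."""
    for _ in range(max_polls):
        status = get_status()
        state = status.get("state")
        if state == "COMPLETE":
            return status["fileHandleId"]  # assumed key name
        if state == "FAILED":
            raise RuntimeError(status.get("errorMessage", "upload daemon failed"))
        # Still PROCESSING: wait before asking again.
        sleep(poll_interval)
    raise TimeoutError("upload daemon did not finish within max_polls")
```

Because each chunk carries its own offset and length, a failed chunk can be re-read from the file and resent on its own without touching the others.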

Associating FileHandles with Synapse objects

FileEntity

See: Entities, Files, and Folders Oh My!

Wiki pages

See: Wiki API (Alpha)