Bulk File/Table Download via Web Client REST API

For use cases see: Bulk File/Table Download via Web Client

Introduction

The bulk file download via the web client is a new feature that will allow users to select files, review/refine the file selection, and then download all of the files as a single zip file.  The workflow consists of four basic phases: file selection, selection review, download, and download review.  While a typical workflow might involve the user moving through all of the phases in order, the user is free to move between phases at will.

File selection

The user's goal for this phase will simply be to select files they wish to download from various sources in Synapse. This phase is similar to the product selection phase of an online shopping experience. In this phase the user will be able to select files from the following sources:

  • Add all files from a folder.  Note: This operation is not recursive, so files within sub-folders will not be added.
  • Add a single file from a folder.
  • Add all of the files listed in a view query.  We do not plan to support adding individual files from a view; instead, the user is expected to refine their query to sub-select files from views.  Note: This operation will add all files from the view query, not just the single page of files shown in the UI.

Selection review

The user's goal for this phase is to review and refine the files they selected prior to actually starting the download.  This phase is similar to the review of a shopping cart/basket in an online shopping experience.  All of the files the user selected in the file selection phase will be consolidated in the user's private download list.  The download list will include the following information about each file:

  • Link to the original file
  • File size
  • File availability:
    • Are there any unmet access restrictions on the file?
    • Is the file type available for bulk download?  For example, external file links and SFTP files will not be available for bulk download.

For each file in the download list the user will have the option to perform the following actions:

  • Request access for unmet access restrictions
  • Remove the file from the list.

The user will have the option to perform the following actions on the entire download list:

  • Clear the list
  • Download the list

Download

The user's goal for this phase will be to start the actual bulk download of their files.  This phase is similar to the checkout phase of an online shopping experience.  In this phase the user will provide a name for their zip file and will be presented with the sub-set of files that will actually be included in the download (unavailable files will be excluded).  When the user chooses to proceed with the download, the download transaction will be started (see below for details on the download transaction).

Download Review

When a user starts a file download, they will enter the last phase: download review.  In this phase the user will be able to monitor the download progress.  After the requested zip file is prepared, the user will then be able to download the zip file to their machine.

Limitations

Managing file selection across a paginated list of results creates an awkward user experience.  Therefore, the entire download list must be presented to the user without pagination (scrollbars are allowed).  This means there must be a limit on the number of files allowed in the download list.  The download list must be small enough to be fetched as a single web-service request.  A download list will have a limit of 100 files.

There is also a limit to how long a user will wait for their file download to be created.  Currently, it can take 10 minutes to prepare a 2 GB zip file for download.  A download list will have a maximum size of 2 GB (the sum of the unzipped file sizes must be less than 2 GB).
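
The two limits above can be sketched as a single check run before files are added to a download list.  This is only an illustration: the function name, the byte-based sizes, and the reading of "2 GB" as 2 * 1024^3 bytes are assumptions, not part of the design.

```python
# Limits taken from this section; the byte interpretation of "2 GB" is an assumption.
MAX_FILES = 100
MAX_TOTAL_BYTES = 2 * 1024 ** 3


def can_add(current_sizes, new_sizes):
    """Return (ok, reason) for adding files of new_sizes (bytes) to the list.

    current_sizes holds the sizes (bytes) of files already in the download list.
    """
    if len(current_sizes) + len(new_sizes) > MAX_FILES:
        return False, "download list is limited to %d files" % MAX_FILES
    # The sum of the unzipped sizes must be strictly less than the maximum.
    if sum(current_sizes) + sum(new_sizes) >= MAX_TOTAL_BYTES:
        return False, "download list is limited to 2 GB"
    return True, ""
```

A client could call such a check before sending an add request, so that the user sees why a folder or view query cannot be added in full.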

File sizes

One of the implied requirements from Ljubomir's design is the availability of the total size of all files in both view query results and folder navigation.  The file sizes will be used to estimate the download time based on the user's current network speeds.  The sizes will also be used to help the user keep their download list under the maximum size.
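
As a rough illustration of the download time estimate, the calculation could look like the sketch below.  The function name and the choice of units (megabytes for size, megabits per second for speed) are assumptions; the design does not specify how the network speed is measured.

```python
def estimate_download_seconds(total_size_mb, speed_megabits_per_second):
    """Estimate the transfer time in seconds for total_size_mb megabytes."""
    if speed_megabits_per_second <= 0:
        raise ValueError("network speed must be positive")
    # 1 byte = 8 bits, so megabytes * 8 gives megabits.
    return total_size_mb * 8 / speed_megabits_per_second
```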

View Query Results

Since the user will only have the option to add all of the files from a given view result, and not just the currently shown page, the file size results will need to include the size of all files for a given query.  This is similar to the query count already available in table query results.  The proposal is to add a new mask to the existing QueryBundleRequest.partMask called 'fileSizes' with a value of 0x20.  When the 'fileSizes' mask is included, the resulting QueryResultBundle will include a numeric value called 'sizeOfAllFilesMB'.
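
A minimal sketch of how a client might combine the proposed mask with other parts is shown below.  Only the 0x20 value comes from this proposal; the other two constants and both helper functions are hypothetical and exist only to make the example self-contained.

```python
# Proposed in this document.
FILE_SIZES = 0x20

# Assumed values for illustration only; check the QueryBundleRequest
# documentation for the real part mask values.
QUERY_RESULTS = 0x01
QUERY_COUNT = 0x02


def build_part_mask(*parts):
    """OR the requested parts together into a single partMask integer."""
    mask = 0
    for part in parts:
        mask |= part
    return mask


def includes(mask, part):
    """Return True if every bit of part is set in mask."""
    return mask & part == part
```

For example, build_part_mask(QUERY_COUNT, FILE_SIZES) would request both the count and the total file size in one call.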

Folder Navigation

Unlike view query results, users will have the option to add one file at a time from the folder navigation.  This implies that we will need to show the size of each individual file in the folder navigation.  We should be able to use the existing POST /fileHandle/batch to get the file handles for a single page of files shown in the folder navigation.

To support adding all of the files in a folder (non-recursive) (use case 1a) we will need to return the total number of files in a folder and the total size of all files in the folder from POST /entity/children.  The proposal is to add a 'partMask' (similar to QueryBundleRequest) with 0x01=count and 0x02=totalFileSizeMB.
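
The proposed folder 'partMask' could be interpreted on the service side roughly as sketched below.  The mask values and the 'count' and 'totalFileSizeMB' names come from the proposal; the class, the function, and the conversion of one MB to 1024 * 1024 bytes are assumptions made for illustration.

```python
from enum import IntFlag


class ChildrenPart(IntFlag):
    COUNT = 0x01
    TOTAL_FILE_SIZE_MB = 0x02


def summarize_children(file_sizes_bytes, part_mask):
    """Build only the summary fields requested by part_mask.

    file_sizes_bytes lists the sizes of the files directly inside the
    folder (non-recursive).
    """
    parts = ChildrenPart(part_mask)
    summary = {}
    if ChildrenPart.COUNT in parts:
        summary["count"] = len(file_sizes_bytes)
    if ChildrenPart.TOTAL_FILE_SIZE_MB in parts:
        # Assumes 1 MB = 1024 * 1024 bytes.
        summary["totalFileSizeMB"] = sum(file_sizes_bytes) / (1024 * 1024)
    return summary
```

Leaving a part out of the mask keeps the response small, which matters when the size calculation is more expensive than the count.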

Download Transaction

When a user chooses to start the download process the following operations will occur in a transaction:

  1. User's download list will be locked.
  2. A download delivery will be created that includes all of the files from a user's download list excluding unmet access restrictions and non-downloadable files.
  3. An asynchronous download request job will be started for the download delivery created in step 2.
  4. All files in the download delivery will be removed from the user's download list.
  5. The lock on the user's download list will be released.

If any errors occur during this transaction all changes will be rolled back.  The user will be blocked from making changes to their download list during this transaction.
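
The five steps and the rollback behavior can be sketched in memory as follows.  This is not the service implementation: the class names, field names, and the start_job callback are hypothetical, and a real implementation would rely on a database transaction rather than a thread lock and a list snapshot.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ListedFile:
    file_id: str
    has_unmet_access_restrictions: bool = False
    is_bulk_downloadable: bool = True


@dataclass
class DownloadList:
    files: list = field(default_factory=list)
    lock: threading.Lock = field(default_factory=threading.Lock)


def start_download(download_list, start_job):
    """Run the download transaction and return the id given by start_job."""
    with download_list.lock:  # steps 1 and 5: lock, then always release
        snapshot = list(download_list.files)
        try:
            # Step 2: the delivery excludes restricted and non-downloadable files.
            delivery = [
                f for f in download_list.files
                if not f.has_unmet_access_restrictions and f.is_bulk_downloadable
            ]
            # Step 3: start the asynchronous job for the delivery.
            job_id = start_job(delivery)
            # Step 4: remove the delivered files from the download list.
            delivered = {f.file_id for f in delivery}
            download_list.files = [
                f for f in download_list.files if f.file_id not in delivered
            ]
            return job_id
        except Exception:
            # Roll back any change to the list, then re-raise the error.
            download_list.files = snapshot
            raise
```

Files that were excluded from the delivery stay in the download list, so the user can still request access to them or remove them after the download starts.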
