Bulk File/Table Download via Web Client
Goals
- Simplify and streamline the process of accessing data in Synapse by allowing web-client users to download more than one file at a time.
- One-pager: https://docs.google.com/document/d/16KU7vhBcwBl9xI8U7DtRsVsIlp_CwuSPdJVSLm11y2Q/edit#
Assumptions
- Users of this feature are primarily web-client users as comparable functionality already exists in the programmatic clients.
- Users of this feature will accept some tradeoffs (limitations) in exchange for this functionality, e.g. limits on the number of files and/or the total size of the download.
- Users of this feature are primarily external (non-Sage) users, such as consortia members or the general public.
Use Cases
Background: Alice is a researcher, or "data consumer", who wants to download data relevant to her work. She's interested in finding and accessing datasets in Synapse that have been published by the AMP-AD consortium. Bob is an employee at Sage, or "data curator", who helps curate (organize, annotate) data for the AMP-AD consortium. Bob wants to make sure that Alice is able to discover and access (download) the files she needs, and Alice wants to be able to download them efficiently through her web browser. Alice does not know how to use the programmatic clients.
Use Case 1: Download from Directory/Folder
Alice wants to download the contents of a Folder in Synapse. Bob has organized the relevant Files into meaningful Folders so that Alice can easily download datasets.
Workflow
- In the Synapse web client, Alice navigates to the Folder she wants to download
- Alice selects "Download Options" and chooses the relevant options to select all* (see: Limitations) the contents of the folder to add them to her download list.
- Alice navigates to her download list and creates a .zip package of the files she's selected for export.
1a: Download entire Folder
1b: Download partial contents of Folder
Note: Use Case 1b (Download partial contents of Folder) is not currently deemed a high priority so this use case will be supported through either a combination of Use Case 1a (Download entire Folder) with Use Case 4 (Refining the download list) OR via Use Case 3 w/ Alice selecting the things she wants from the Folder.
Use Case 2: Download from View
Alice wants to download the files/tables represented in a View (e.g. a FileView) that's been curated by Bob. Bob has created this FileView to leverage relevant annotations to present different kinds of datasets to Alice; Alice wants to peruse the FileView and choose the files she wants to download.
Workflow
- In the Synapse web client, Alice navigates to the FileView she wants to download files from.
- Alice selects "Download Options" and ...
- Alice navigates to her download list and creates a .zip package of the files she's selected for export.
2a: Download entire set of things represented by a View
2b: Facet/subset View prior to downloading
Use Case 3: Download Individual Selections
Alice wants to "shop" for individual files as she browses through various projects. Bob has no involvement in this case.
Workflow
- In the Synapse web client, Alice navigates to various files and folders.
- Alice clicks on the download icon next to each item and is presented with the option to download the file immediately or add it to her download list.
- If Alice chooses to download each selection individually, no download list is created.
- Items are only added to the download list IF Alice chooses them.
- When Alice is done choosing her selections to be added to the download list, she visits her download list and begins downloading.
Use Case 4: Refining the Download List
Alice has added items to her download list but realized that she doesn't need all of them right now. She wants to edit the list to remove items prior to creating her download package.
Workflow
- In the Synapse web client, Alice navigates to her download list (either via her profile page or via the notifications in the header)
- Alice uses the available columns to sort the items in her list, and can "uncheck" items that she wants to omit from her package, or delete items from the list entirely.
- Alice then proceeds to create and download her package.
4a: We presume Alice has "download" access to all files included in the list.
4b: Alice has access to download some or none of the files and must request access. The "Request Access" button in the list design should go to the Access Requirements page for that entity, similar to (we think) how the "Unmet Conditions" button works today. The download package should include only files that Alice has permission to download.
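The split described in 4a/4b can be sketched as follows. This is a minimal illustration only, not the Synapse implementation: the `DownloadListItem` type, its field names, and the `partition_for_package` helper are all hypothetical, and how access is actually determined is outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DownloadListItem:
    """Hypothetical download-list row; field names are assumptions."""
    entity_id: str
    file_name: str
    selected: bool = True       # False when Alice has "unchecked" the item
    can_download: bool = False  # True when Alice has download access


def partition_for_package(
    items: List[DownloadListItem],
) -> Tuple[List[DownloadListItem], List[DownloadListItem]]:
    """Split the selected items into those that can go into the package
    and those that would need a "Request Access" action first."""
    packageable = [i for i in items if i.selected and i.can_download]
    needs_access = [i for i in items if i.selected and not i.can_download]
    return packageable, needs_access
```

Unchecked items fall into neither group, so they stay on the list but are left out of the package.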
Use Case 5: Exporting the Download List for Programmatic Usage
Alice realizes that she doesn't want to download through the web after all, but would rather wait until she gets back to her office to download using one of the programmatic clients (let's pretend that in between the background and use case #5 Alice has learned how to do this).
Workflow
- From her download list, Alice chooses the "Export List" option from the menu.
- Alice downloads a file-manifest-type-thing instead of the individual files and uses this list to download files another way.
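One possible shape for the exported list is a small CSV manifest. The sketch below is an assumption for illustration: the column names (`entity_id`, `file_name`) and the `export_manifest` helper are hypothetical, and the real manifest format has not been decided in this document.

```python
import csv
import io
from typing import Iterable, Tuple


def export_manifest(items: Iterable[Tuple[str, str]]) -> str:
    """Render (entity_id, file_name) pairs as CSV text.

    The two columns are placeholders; the actual manifest may need
    more fields (for example, a version) for the programmatic clients.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["entity_id", "file_name"])  # header row
    for entity_id, file_name in items:
        writer.writerow([entity_id, file_name])
    return buffer.getvalue()
```

A plain-text format like this keeps the list readable by any client or script, which matches the goal of bridging web and programmatic users.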
Use Case 6: Download History
Alice accidentally wiped her local storage and wants to re-download the data from Use Case B.
Workflow
- In the Synapse web client, Alice navigates to her download list and clicks on "Download History".
- Alice remembers that it was the most recent package that she needs and sorts the history by package date.
- Alice clicks on "Download Package" to get the same .zip package again.
Limitations and Errors
In order to deliver a maximally functional initial feature, we need to set some limits to ensure the feature will be performant and users will understand the expected behavior.
- Limit of total number of files in a single download package: up to 100
- Limit of total package size: TBD
- Access Restrictions - how to handle?
- Too many files in the list - how to handle?
- Successful download - clear the list and/or retain the list elsewhere?
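The limits above could be checked before a package is created, roughly as sketched below. The 100-file limit comes from this document; the total size limit is still TBD, so it is an optional parameter here. The function name and messages are hypothetical, and the sketch only reports problems. It does not decide how the web client should handle them, which remains an open question.

```python
from typing import List, Optional

MAX_FILES_PER_PACKAGE = 100  # initial-release limit stated in this document


def check_package_limits(
    file_sizes_bytes: List[int],
    max_total_bytes: Optional[int] = None,  # size limit is TBD; None skips the check
) -> List[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if len(file_sizes_bytes) > MAX_FILES_PER_PACKAGE:
        problems.append(
            f"Too many files: {len(file_sizes_bytes)} > {MAX_FILES_PER_PACKAGE}"
        )
    total = sum(file_sizes_bytes)
    if max_total_bytes is not None and total > max_total_bytes:
        problems.append(f"Package too large: {total} > {max_total_bytes} bytes")
    return problems
```

Returning every violation at once, rather than failing on the first, would let the interface tell the user everything that needs fixing in a single message.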
Priorities
P0: For initial release, it is critical that we support #1a, #2a, #2b, #3, and #4 above. It is acceptable to phase in these features as the team deems best, over a relatively short period of time.
P1: I expect that users will ask for #5 in short order and it's a way to bridge the gap between web users and programmatic users, as well as set us up for #6, so it's a medium priority but may save us time downstream if we do it up front.
P2: Use Cases #1b and #6 do not yet have a clear user need, so we will de-prioritize them until we have more data.
User Interaction and Design
https://www.figma.com/proto/cjRT94oMAbvuSC8nwwoddDEk/multi-file-download
Open Questions
Question | Outcome |
---|---|
Should we support the case where a user starts their download list in one session and then resumes and completes the download in another? | No documented user need for this (yet) but we imagine it might become a case. |
What is a sensible limit on the number of files and/or the total package size? | 100 files for initial release; need to determine sensible max package size. |
Should we describe failure cases separately or incorporate those into each use case above? | TBD. For example, user attempts to add a folder to the download list but folder contains too many items. What is the expected interaction there? |
Not Doing (Out of Scope)
- Bulk download of fileHandles within Tables, a.k.a. the mobile BRIDGE data use case.
- Bulk download of submissions from evaluation queues, a.k.a. the challenge use case.