Working with a large number of files on the web can be tedious, especially if you want to download, upload, or set annotations and provenance. The command line, Python client, and R client have convenient functions for bulk upload and download. Uploading requires a tab-delimited manifest in which each row specifies a file to be uploaded and, optionally, annotations to be applied. Downloading in bulk requires identifying a container (folder, project, table, or view) that contains your files of interest.

In this article, you will learn how to:

  • Create a manifest

  • Upload files in bulk

  • Modify files in bulk using a manifest

  • Download files in bulk

Uploading data in bulk

Creating a manifest

Files to be uploaded are specified in a tab-separated (.tsv) manifest. The manifest has columns that contain information about each file to be uploaded, along with annotations that will be associated with the file in Synapse.

The required columns in the manifest are:

  • path: the current location (file path) of the file to be uploaded

  • parent: the Synapse ID of the folder where files will be uploaded

You can also create provenance for a file during bulk upload. The used column indicates files that were used to create the one being uploaded, and the executed column can indicate code (in Synapse or on the web) that was used to generate the file. Here is an example manifest that uploads a single file:

| path | parent | name | used | executed | emotion | species |
| --- | --- | --- | --- | --- | --- | --- |
| /path/to/file.csv | syn1234 | Tardar Sauce | syn654 | https://github.com/your/code/repo | grumpy | cat |

The above manifest describes a file.csv that will be uploaded to the Synapse folder syn1234 and named “Tardar Sauce”. The manifest describes the provenance of the file, indicating that it was generated using code deposited in GitHub (https://github.com/your/code/repo) from the data in syn654. Additionally, the file has been annotated with emotion: grumpy and species: cat. More annotations could be associated with the file by adding more columns.

To review:

  • The path and parent columns are required

  • The name is only necessary if the displayed name in Synapse should be different than the name of the uploaded file

  • Used and executed are optional for provenance (but helpful!)

  • Emotion and species are optional annotations (but also helpful!)
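The example manifest above could be written out programmatically. The following is a minimal sketch using only Python's standard csv module; the function name write_manifest, the constant names, and the output file name are illustrative assumptions, not part of any Synapse client.

```python
import csv

# Column order mirrors the example manifest: path and parent are required,
# name/used/executed are optional, emotion and species are annotations.
COLUMNS = ["path", "parent", "name", "used", "executed", "emotion", "species"]

# The single row from the example manifest (placeholder values).
EXAMPLE_ROW = {
    "path": "/path/to/file.csv",
    "parent": "syn1234",
    "name": "Tardar Sauce",
    "used": "syn654",
    "executed": "https://github.com/your/code/repo",
    "emotion": "grumpy",
    "species": "cat",
}


def write_manifest(rows, manifest_path):
    """Write a list of row dicts to a tab-separated manifest file."""
    with open(manifest_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS, delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    # Hypothetical output name, matching the file name used later in this article.
    write_manifest([EXAMPLE_ROW], "filesToUpload.tsv")
```

Adding another annotation would mean adding its column name to COLUMNS and a matching key to each row.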

Download the template.

Validate the manifest and upload files

The format of the manifest file (called filesToUpload.tsv in this example) can be validated prior to upload by using the dryRun parameter of syncToSynapse. With dryRun set, the client does not upload the data specified in the manifest file. Instead, it checks that the manifest file format is correct, all file paths exist, all files are unique, provenance can be set (optional), and the parent synID exists. The dryRun output also summarizes the number of files and the total upload size. This helps ensure your data upload does not end prematurely due to a typo in a file path or parent synID.
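As a rough illustration of this step in Python, the sketch below pairs a local pre-check with a dry run. The helper names precheck_manifest and dry_run are hypothetical, and the pre-check covers only a subset of what the client checks (required columns, existing paths, duplicate paths); it is not a replacement for dryRun. The dry_run function assumes the synapseclient package is installed and that login credentials are already configured.

```python
import csv
import os

REQUIRED_COLUMNS = ("path", "parent")


def precheck_manifest(manifest_path):
    """Return a list of problems found locally; an empty list means none were found."""
    problems = []
    with open(manifest_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        header = reader.fieldnames or []
        missing = [col for col in REQUIRED_COLUMNS if col not in header]
        if missing:
            return ["missing required column(s): " + ", ".join(missing)]
        seen = set()
        # The header is line 1, so data rows start at line 2.
        for line_no, row in enumerate(reader, start=2):
            path = row["path"] or ""
            if not os.path.isfile(path):
                problems.append(f"line {line_no}: file not found: {path}")
            if path in seen:
                problems.append(f"line {line_no}: duplicate path: {path}")
            seen.add(path)
    return problems


def dry_run(manifest_path):
    """Ask the Synapse Python client to validate the manifest without uploading."""
    # Imported lazily so the local pre-check works without the client installed.
    import synapseclient
    import synapseutils

    syn = synapseclient.login()  # assumption: credentials are already configured
    synapseutils.syncToSynapse(syn, manifest_path, dryRun=True)
```

Running precheck_manifest("filesToUpload.tsv") first catches simple typos offline; dry_run("filesToUpload.tsv") then lets the client perform its own validation, including the checks that need Synapse access.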

...

After validating the manifest, you can now upload the files to Synapse by removing the dryRun parameter. Once the upload is complete, you will receive an email notification. This notification will also show any errors from the upload.

Downloading data in bulk

Files can be downloaded in bulk using the syncFromSynapse function. This function downloads all the files in a folder or project, along with the annotations and provenance on those files. A manifest file called SYNAPSE_METADATA_MANIFEST.tsv that contains the metadata is also added to the download path.
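A bulk download with the Python client might look like the sketch below. The helper names download_container and read_downloaded_manifest are hypothetical; the sketch assumes the synapseclient package is installed, that credentials are already configured, and that the manifest sits at the top level of the download directory.

```python
import csv
import os

MANIFEST_NAME = "SYNAPSE_METADATA_MANIFEST.tsv"


def download_container(synapse_id, download_dir):
    """Download every file in a folder or project into download_dir."""
    # Imported lazily so the manifest reader below works without the client.
    import synapseclient
    import synapseutils

    syn = synapseclient.login()  # assumption: credentials are already configured
    return synapseutils.syncFromSynapse(syn, synapse_id, path=download_dir)


def read_downloaded_manifest(download_dir):
    """Load the metadata manifest from download_dir as a list of row dicts."""
    manifest_path = os.path.join(download_dir, MANIFEST_NAME)
    with open(manifest_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

For example, download_container("syn1234", "downloads") followed by read_downloaded_manifest("downloads") would give one dictionary per downloaded file, keyed by the manifest's column names.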

Editing in bulk

You can modify values in the manifest and re-upload them to Synapse using syncToSynapse to edit files in bulk. The manifest allows you to modify everything: file path, provenance, annotations, and versions. If the files have not changed and you only want to update the file annotations, add a column called forceVersion to the manifest with the value False for each row. This will stop syncToSynapse from uploading new versions of the files.
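For an annotation-only update, the forceVersion column could be added to an existing manifest with a short script. This is a minimal sketch using Python's standard csv module; the function name and the file names in the usage note are illustrative assumptions.

```python
import csv


def add_force_version_false(src_manifest, dst_manifest):
    """Copy a tab-separated manifest, setting forceVersion to False on every row."""
    with open(src_manifest, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src, delimiter="\t")
        rows = list(reader)
        fieldnames = list(reader.fieldnames or [])

    # Append the column only if the manifest does not already have it.
    if "forceVersion" not in fieldnames:
        fieldnames.append("forceVersion")
    for row in rows:
        row["forceVersion"] = "False"

    with open(dst_manifest, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(
            dst, fieldnames=fieldnames, delimiter="\t", extrasaction="ignore"
        )
        writer.writeheader()
        writer.writerows(rows)
```

Calling add_force_version_false("SYNAPSE_METADATA_MANIFEST.tsv", "annotationsOnly.tsv") would produce a manifest whose edited annotation values can then be passed to syncToSynapse.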

You can also update annotations using File Views.

Note: You cannot move items in Synapse with a manifest. Changing the parent synID in a manifest creates a copy of the file, so the file will exist in two different locations. It does not move it.

See Also

Downloading data, Provenance, Annotations and queries, File Views, Files and Versioning