This use case combines concepts from the following pages: Annotations and queries; Annotating Data With Metadata; /wiki/spaces/DOCS/pages/2011070739; Uploading and downloading data in bulk; and Organizing Data Into Projects, Files, and Folders. You will learn how to:

  • Create a manifest

  • Upload 100 files

  • Edit annotations on these files using the Synapse programmatic clients

Annotation Dictionaries

Sage Bionetworks maintains annotation dictionaries in GitHub. You can use the terms in this repository as a starting point, or you can create your own annotation dictionary.

...

Batch Upload Files with Annotations

To batch upload files, create a tab-delimited manifest that contains, at minimum, the columns path and parent. You can also add annotations as extra columns in your manifest. For example, your manifest might have the following headers: path, parent, specimenID, assay, species, platform, sex, and fileFormat. See Creating a Manifest in Uploading and downloading data in bulk for additional details.

  • path: the local path to your file 

  • parent: the Synapse ID (in the format syn123456) of the folder or project where your files will be uploaded

  • specimenID: the unique identifier for each of your specimens 

  • assay: the technology used to generate the data in this file (for example, RNASeq, ChIPSeq, wholeGenomeSeq) 

  • species: the species of your sample (for example, Mouse, Rat, Human, Triceratops) 

  • platform: the hardware used to generate the data (for example, HiSeq2500, Affy6.0, HoodDNASequencer) 

  • sex: a label assigned at birth based on biological attributes (for example, male or female)

  • fileFormat: the type of file (for example, fastq, R script)

path                                 parent  specimenID  assay           species                    platform          sex     fileFormat
/local/path/to/velociraptor_b.fastq  syn123  blue_1      wholeGenomeSeq  Velociraptor mongoliensis  HoodDNASequencer  female  fastq
/local/path/to/velociraptor_d.fastq  syn123  delta_1     wholeGenomeSeq  Velociraptor mongoliensis  HoodDNASequencer  female  fastq

Save this file in a tab-delimited format called velociraptor_manifest.tsv.
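If you prefer to generate the manifest with a script rather than a spreadsheet, the following is a minimal sketch using only the Python standard library. The helper name write_manifest and the constant EXAMPLE_ROWS are hypothetical names chosen for illustration; the column names and values are the ones from the example above.

```python
import csv

# Column order for the example manifest; path and parent are the required ones.
COLUMNS = ["path", "parent", "specimenID", "assay",
           "species", "platform", "sex", "fileFormat"]

# The two example rows shown above (syn123 is a placeholder Synapse ID).
EXAMPLE_ROWS = [
    {"path": "/local/path/to/velociraptor_b.fastq", "parent": "syn123",
     "specimenID": "blue_1", "assay": "wholeGenomeSeq",
     "species": "Velociraptor mongoliensis", "platform": "HoodDNASequencer",
     "sex": "female", "fileFormat": "fastq"},
    {"path": "/local/path/to/velociraptor_d.fastq", "parent": "syn123",
     "specimenID": "delta_1", "assay": "wholeGenomeSeq",
     "species": "Velociraptor mongoliensis", "platform": "HoodDNASequencer",
     "sex": "female", "fileFormat": "fastq"},
]


def write_manifest(rows, out_path="velociraptor_manifest.tsv"):
    """Write rows (dicts keyed by COLUMNS) to a tab-delimited manifest file."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS, delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

Calling write_manifest(EXAMPLE_ROWS) writes velociraptor_manifest.tsv to the current directory. For 100 files you would build the list of rows in a loop instead of typing them out.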

...

  • validate the manifest and upload files in the Python client.

  • validate the manifest and upload files in the R client.
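For the Python route, the sketch below shows one way these two steps might look. The function check_manifest is a hypothetical local pre-check written for this example, not the client's own validator: it only confirms that the required columns exist and that each parent looks like a Synapse ID. The function validate_and_upload assumes the synapseclient package is installed, that syn is an already logged-in client object, and that synapseutils.syncToSynapse with its dryRun flag behaves as in the client version you have; check the client documentation for your release.

```python
import csv
import re

REQUIRED_COLUMNS = ("path", "parent")
SYN_ID = re.compile(r"^syn\d+$")  # e.g. syn123456


def check_manifest(manifest_path):
    """Return a list of problems found in a tab-delimited manifest (empty if none)."""
    problems = []
    with open(manifest_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        header = reader.fieldnames or []
        missing = [name for name in REQUIRED_COLUMNS if name not in header]
        if missing:
            return ["missing required column(s): " + ", ".join(missing)]
        for line_no, row in enumerate(reader, start=2):  # line 1 is the header
            if not row["path"]:
                problems.append(f"line {line_no}: empty path")
            if not SYN_ID.match(row["parent"] or ""):
                problems.append(
                    f"line {line_no}: parent {row['parent']!r} is not a Synapse ID"
                )
    return problems


def validate_and_upload(syn, manifest_path):
    """Dry-run the manifest with the Python client, then upload the files."""
    import synapseutils  # imported lazily; requires the synapseclient package

    synapseutils.syncToSynapse(syn, manifest_path, dryRun=True)   # validate only
    synapseutils.syncToSynapse(syn, manifest_path, dryRun=False)  # upload
```

Running the local check first catches simple formatting mistakes without a network round trip; the dry run then lets the client verify the manifest before anything is uploaded.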

Create a File View (web)

Once the files have been uploaded with annotations, you can use a file view to query, facet, and bulk manipulate the files and metadata.

...

  1. Navigate to your project.

  2. Go to the tables tab, select Tables Tools in the upper right corner, and click Add File View.

  3. In the resulting pop-up, give the new file view a name.

  4. Select the container (Synapse project or folder) that holds the files, and click Next. In this case, use the Synapse ID from the parent column of the manifest.

  5. Select the columns you would like to keep. Since we are going to edit the annotations later, make sure you have the column etag listed as one of your columns.

  6. To include all existing annotations as columns, click Add All Annotations at the bottom of the window.

  7. Click Finish to create the file view.

For more information on file views, see /wiki/spaces/DOCS/pages/2011070739.

Perform a One-Time Annotation Update or Deletion (web)

An annotation for a single file can be modified in the web client view. For example, you can update specimenID:delta_1 to specimenID:echo_1.

  1. From your view, select the pencil icon to edit query results.

  2. In the pop-up window, find the single value you want to change and edit the field.

  3. Scroll down to the bottom and click Save to update the file view.

Perform a Bulk Annotation Update or Deletion

A bulk annotation update is needed when the same change applies to many files, for example when species:Velociraptor mongoliensis must be changed to Utahraptor ostrommaysorum in all 100 files.

...

  1. Query the file view with synTableQuery() in the R client or syn.tableQuery() in the Python client. To delete all the annotations for a key, keep the column in the file view but remove its values.

  2. Update and then store the annotations in the R client or Python client.
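The two steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: build_view_query and bulk_update are hypothetical helper names, syn456 in the usage note stands in for your file view's Synapse ID, and syn is assumed to be an already logged-in client object. The calls syn.tableQuery(), asDataFrame(), and syn.store() with synapseclient.Table follow the table interface of the Python client; newer client releases may offer a different interface, so check the documentation for your version.

```python
def build_view_query(view_id, key, old_value):
    """Return SQL that selects only the rows whose annotation needs changing."""
    escaped = old_value.replace("'", "''")  # escape single quotes for SQL
    return f"SELECT * FROM {view_id} WHERE {key} = '{escaped}'"


def bulk_update(syn, view_id, key, old_value, new_value):
    """Replace old_value with new_value for one annotation key across a file view."""
    from synapseclient import Table  # imported lazily; requires synapseclient

    # Step 1: query the file view. The returned data frame keeps the row
    # identifiers that Synapse needs to match each row back to a file.
    results = syn.tableQuery(build_view_query(view_id, key, old_value))
    frame = results.asDataFrame()

    # Step 2: edit the column locally, then store the changed rows back.
    frame[key] = new_value
    return syn.store(Table(view_id, frame))
```

With these helpers, the species change from the example would be bulk_update(syn, "syn456", "species", "Velociraptor mongoliensis", "Utahraptor ostrommaysorum"). Filtering in the query keeps the update limited to the rows that actually change.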

...