It turns out that Alice's paper is a hit, and she now has lots of biologists asking for help running similar analyses on different data sets. She converges on a particular structure to capture the results of the various intermediate stages of her analysis. For example, to help out a new collaborator (Diane) on a new project:

  f <- Folder(parent='DianesProject', name='Stage 1 Results')

Annotate the folder to describe the results, for example with a couple of integer values holding key statistics of the result:

  f$annotations$Pval <- 1  # or, ideally, just f$Pval <- 1
  f$annotations$Fval <- 4
  f$annotations$status <- 'valid'  # a text annotation

Above, I am assuming this is syntactic sugar for things like:

  f <- setAnnotation(f, key='Pval', value=1, type='int')
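As a rough illustration of how this kind of sugar could work, here is a minimal Python sketch (Python being one of the other environments this use case anticipates). Plain attribute assignment is routed to a `set_annotation` call, and the annotation type is inferred from the value. The `Folder` class, `set_annotation`, and the type names are hypothetical stand-ins that live only in memory; they are not an existing client API.

```python
class Folder:
    """Hypothetical in-memory stand-in for a Synapse folder entity."""

    # Assumed mapping from Python types to annotation type names.
    _TYPE_NAMES = {bool: "bool", int: "int", float: "double", str: "string"}

    def __init__(self, parent, name):
        # Bypass __setattr__ so the entity's own fields are not
        # recorded as annotations.
        object.__setattr__(self, "parent", parent)
        object.__setattr__(self, "name", name)
        object.__setattr__(self, "annotations", {})

    def set_annotation(self, key, value, type_name):
        """Explicit form that the attribute sugar expands to."""
        self.annotations[key] = {"value": value, "type": type_name}
        return self

    def __setattr__(self, key, value):
        # f.Pval = 1 is sugar for f.set_annotation("Pval", 1, "int")
        type_name = self._TYPE_NAMES.get(type(value))
        if type_name is None:
            raise TypeError(
                f"no primitive annotation type for {type(value).__name__}"
            )
        self.set_annotation(key, value, type_name)


f = Folder(parent="DianesProject", name="Stage 1 Results")
f.Pval = 1
f.Fval = 4
f.status = "valid"
```

In this sketch only primitive values are accepted; anything else raises a `TypeError`, which is where the file-backed annotations described next would have to hook in.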

Assume we have a handle to some text file:

  f$annotations$result1 <- someFileHandle

This is syntactic sugar for:

  f <- addFile(f, someFileHandle, location='local')
  f <- setAnnotation(f, key='result1', value=referenceToFile, type='file')

We could do the same thing with other objects that get serialized to files:

  f$annotations$result2 <- anotherRobject  # Save as serialized R binary?
  f$annotations$image <- plot  # Save as image
  f$annotations$vectorData <- c(3, 4, 5, 6)  # In principle could be very large; store as a file?
  f$annotations$matrix <- rbind(c(2, 3, 4), c(3, 4, 5), c(4, 5, 6))  # In principle could be very large; store as a file?
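One way to picture the decision the client has to make for each assigned value is the Python sketch below. It is only an illustration: `plan_storage`, the file extensions, and the rule that every non-primitive value becomes a file are assumptions, and the question of whether small vectors should stay inline is left open here just as it is above.

```python
from pathlib import Path

# Values of these types are assumed to be stored directly as annotations.
PRIMITIVES = (bool, int, float, str)


def plan_storage(key, value):
    """Decide how a value assigned to an annotation key would be stored.

    Returns ("annotation", value) for primitives, or
    ("file", suggested_file_name) for everything else.
    """
    if isinstance(value, PRIMITIVES):
        return ("annotation", value)
    if isinstance(value, Path):
        # An existing file keeps its own extension.
        return ("file", f"{key}{value.suffix}")
    if isinstance(value, (list, tuple)):
        # Vectors and matrices (lists of rows) go to a delimited text file.
        return ("file", f"{key}.csv")
    # Any other object is assumed to be serialized to a binary file.
    return ("file", f"{key}.bin")


plans = {
    "Pval": plan_storage("Pval", 1),
    "result1": plan_storage("result1", Path("results/summary.txt")),
    "vectorData": plan_storage("vectorData", [3, 4, 5, 6]),
    "matrix": plan_storage("matrix", [[2, 3, 4], [3, 4, 5], [4, 5, 6]]),
}
```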

Push everything up to Synapse:

  storeEntity(f, recurse=TRUE)

Behind the scenes, the client must do the following:

0. Start a transaction to upload a bundle of entities.

1. Create someFileHandle as a child File of f of type .txt. Synapse generates a preview of it.

2. Create anotherRobject as a child File of f of type .Rbin. Preview?

3. Always (or only if the vector / matrix are large), create them as additional child Files (.csv?).

4. Create another File to store the plot.

5. Update the annotations on f to include the three primitive values and the five additional annotations that reference the child Files.

6. End the transaction to upload the bundle of entities.

  bundleId <- startBundle()
  tempFile1 <- File(parentId=f$id, bundle=bundleId)
  tempFile1 <- storeEntity(tempFile1)

.... upload more pieces

  endBundle(bundleId)
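Since the bundle is described as a transaction, one natural shape for it in a client is a scoped block that always closes the bundle, and discards it if any upload fails. The Python sketch below shows that pattern with a context manager. `FakeClient`, `start_bundle`, `store_entity`, `end_bundle`, and the `commit` flag are all hypothetical; the client records calls in memory and does not talk to Synapse, and the rollback behaviour is an assumption rather than something specified above.

```python
from contextlib import contextmanager
import itertools


class FakeClient:
    """Hypothetical in-memory client; records calls instead of uploading."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.open_bundles = set()
        self._pending = {}   # bundle id -> entity names staged so far
        self.committed = {}  # bundle id -> entity names made visible

    def start_bundle(self):
        bundle_id = next(self._ids)
        self.open_bundles.add(bundle_id)
        self._pending[bundle_id] = []
        return bundle_id

    def store_entity(self, name, bundle_id):
        if bundle_id not in self.open_bundles:
            raise ValueError("bundle is not open")
        self._pending[bundle_id].append(name)

    def end_bundle(self, bundle_id, commit=True):
        self.open_bundles.discard(bundle_id)
        staged = self._pending.pop(bundle_id)
        if commit:
            self.committed[bundle_id] = staged


@contextmanager
def bundle(client):
    """Open a bundle, and close it whether or not the body succeeds."""
    bundle_id = client.start_bundle()
    try:
        yield bundle_id
    except Exception:
        # Assumed behaviour: a failed upload discards the whole bundle.
        client.end_bundle(bundle_id, commit=False)
        raise
    else:
        client.end_bundle(bundle_id)


client = FakeClient()
with bundle(client) as bundle_id:
    client.store_entity("result1.txt", bundle_id)
    client.store_entity("result2.Rbin", bundle_id)
```

The point of the scoped form is that the caller cannot forget the closing call, which matters if an unclosed bundle would leave partially uploaded entities behind.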

Another user must be able to do this to get back the same data:

  f <- synGet(path='DianesProject/Stage 1 Results', recurse=TRUE)
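To make the intended semantics of a path lookup with and without recursion concrete, here is a small Python sketch over a nested dictionary standing in for the server. The `syn_get` function, the dictionary layout, and the choice to return an entity without its children when `recurse` is false are assumptions made for illustration only.

```python
import copy


def syn_get(store, path, recurse=False):
    """Resolve a slash-separated path in a nested dict of hypothetical entities."""
    node = {"children": store}
    for part in path.split("/"):
        try:
            node = node["children"][part]
        except KeyError:
            raise KeyError(f"no entity named {part!r} in path {path!r}") from None
    if recurse:
        # Return the entity together with everything below it.
        return copy.deepcopy(node)
    # Without recursion, return only the entity's own annotations.
    return {"annotations": dict(node["annotations"]), "children": {}}


store = {
    "DianesProject": {
        "annotations": {},
        "children": {
            "Stage 1 Results": {
                "annotations": {"Pval": 1, "Fval": 4, "status": "valid"},
                "children": {
                    "result1.txt": {"annotations": {}, "children": {}},
                },
            }
        },
    }
}

f = syn_get(store, "DianesProject/Stage 1 Results", recurse=True)
```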

Alice turns her set of scripts into a publicly hosted R package. This includes the development of R objects specific to her analysis that encapsulate some of the key steps and data structures that are handed off between different steps. She also includes helper functions that store and retrieve the pieces of each object in Synapse as a set of folders, files, and annotations that follow a particular convention. She then develops a widget for the Synapse UI that presents a visualization of this data in a way her collaborators can understand. This gives her and other analysts an object-centric view in R of the data structures relevant to this analysis, and the ability to easily load these objects from, and save them to, Synapse. Other analysts can do the same thing in other environments (e.g. Python) by defining similar objects and helper functions.
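For the Python case, a pair of such helper functions might look roughly like the sketch below. The `StageResult` class, its fields, the `to_entity` / `from_entity` names, and the layout convention (statistics as annotations, the vector as a `vectorData.csv` child file of integers) are all hypothetical; the entity is a plain dictionary rather than anything stored in Synapse.

```python
from dataclasses import dataclass, field


@dataclass
class StageResult:
    """Hypothetical analysis object handed off between steps."""
    name: str
    pval: int
    fval: int
    status: str
    vector_data: list = field(default_factory=list)  # assumed to hold integers


def to_entity(result):
    """Lay the object out as a folder with annotations and child files."""
    return {
        "name": result.name,
        "annotations": {
            "Pval": result.pval,
            "Fval": result.fval,
            "status": result.status,
        },
        "files": {
            "vectorData.csv": ",".join(str(v) for v in result.vector_data),
        },
    }


def from_entity(entity):
    """Rebuild the object from a folder that follows the same convention."""
    ann = entity["annotations"]
    raw = entity["files"]["vectorData.csv"]
    vector = [int(v) for v in raw.split(",")] if raw else []
    return StageResult(entity["name"], ann["Pval"], ann["Fval"], ann["status"], vector)


original = StageResult("Stage 1 Results", 1, 4, "valid", [3, 4, 5, 6])
restored = from_entity(to_entity(original))
```

Because both helpers agree on one convention, an object saved by one analyst can be rebuilt by another without either of them handling folders, files, or annotations directly.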

...

If we have many of these sorts of objects, an extension to this use case is for Synapse to provide central storage and retrieval of these object definitions, and/or ways to autogenerate the objects and their helper functions from existing Synapse data structures used as prototype instances.
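As a very small illustration of the autogeneration idea, the Python sketch below derives a class definition from the annotations of a prototype instance using the standard library's `dataclasses.make_dataclass`. The `class_from_prototype` helper and the prototype dictionary are hypothetical, and the sketch assumes annotation keys are valid Python identifiers and that each field's type can be taken from the prototype's value.

```python
from dataclasses import fields, make_dataclass


def class_from_prototype(class_name, prototype_annotations):
    """Build a dataclass whose fields mirror a prototype's annotations."""
    # Sort keys so the generated field order is deterministic.
    spec = [
        (key, type(value))
        for key, value in sorted(prototype_annotations.items())
    ]
    return make_dataclass(class_name, spec)


# Annotations read from an existing folder used as the prototype instance.
prototype = {"Pval": 1, "Fval": 4, "status": "valid"}

StageResult = class_from_prototype("StageResult", prototype)
restored = StageResult(**prototype)
```

Generating the matching load and save helpers would follow the same approach, by iterating over the generated fields instead of hand-writing one function per object type.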