...

At this point her Synapse project is populated with a mirror of her local folder, although all the files still live exclusively on her local file system.  Synapse holds only some metadata on the files and folders (e.g. SHA1, the timestamp and user recorded when they were created, and possibly file size).
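
As a rough illustration of the kind of metadata described above, the following Python sketch walks a local folder and records a SHA-1 digest, a timestamp, a user name and a size for each file, without copying any file contents. The names `sha1_of_file` and `collect_metadata` and the record fields are hypothetical and are not part of any Synapse client; the file's modification time stands in for its creation time, and the user name is simply passed in by the caller.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect_metadata(root, user):
    """Hypothetical helper: map each relative file path under root to its metadata."""
    records = {}
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(folder, name)
            info = os.stat(full_path)
            records[os.path.relpath(full_path, root)] = {
                "sha1": sha1_of_file(full_path),
                "modified": info.st_mtime,  # assumption: mtime approximates creation time
                "size": info.st_size,
                "user": user,               # assumption: supplied by the caller
            }
    return records
```

Only the resulting records would need to be sent anywhere; the files themselves stay where they are.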

...

At local OS command line

  syn updateget . -recurse = true

This pulls the two new plot files down to her local machine.  The data frame could be stored as either a .csv file or an R binary file.

At this point she switches over to the Synapse web client and uses previews of the two new results files, together with the wiki features, to write up a summary of her findings in the project wiki. Then she adds Bob to the project and emails him a link to view the results.

Bob is able to review Alice's findings and comment on the wiki pages.  He has some new data he wants to share with Alice, so he uploads it to the project from the web client.  Alice is able to pull the files down using an analytical client and continue working.

Later, Alice would like her analyst friend Carl at another institution to check her analysis (or she would like a backup of her work, or access to it from another machine...).

  syn put . -recurse = true -location = SynapseStorage

This pushes the files up to Synapse's native S3 storage.  Carl can now move the project over to his own computer or to his Amazon account.  (Why not just sync files using Git, Dropbox, or any number of other solutions?  Assume some of the files are large, e.g. raw genomics data.  In that case the files always remain local, and if Carl wants to access them he will get an account on Alice's system.  Different folders of the project might be stored in different places.)

The project could evolve for some time in this fashion, relying mainly on the file-folder API, wiki, and collaboration features.  Possible extensions would be to let users manage multiple storage locations (e.g. their own S3 buckets), or to provide clients that automatically sync content in the background.
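
A background sync client of the kind mentioned above would first need to decide which files to transfer. The following is a minimal sketch, assuming each side keeps a manifest that maps relative file paths to SHA-1 digests. The manifest format and the `diff_manifests` name are hypothetical and do not describe an existing Synapse feature.

```python
def diff_manifests(previous, current):
    """Compare two {relative_path: sha1} mappings and report what changed."""
    added = sorted(path for path in current if path not in previous)
    removed = sorted(path for path in previous if path not in current)
    changed = sorted(
        path
        for path in current
        if path in previous and current[path] != previous[path]
    )
    return {"added": added, "changed": changed, "removed": removed}
```

A client could run this comparison periodically and push or pull only the paths listed under "added" and "changed", which matters when some of the files are large.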

Reproducible Ad hoc analysis