...

Provenance describes the origin of something. In Synapse, it describes the connections between the workflow steps that derive a particular results file. Data analysis often involves multiple steps to go from a raw data file to a finished analysis. Synapse’s provenance tools let users keep track of each step involved in an analysis and share those steps with other users.

...

Overview of Synapse Provenance

Synapse’s provenance model is based on the W3C provenance specification (PROV): an item is derived from an activity, and that activity has components that were used and components that were executed. Think of used items as input files and executed items as software or code. Both used and executed items can either reside in Synapse or be referenced by URL, such as a link to a GitHub commit or to a specific version of a software tool.
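To make the used/executed distinction concrete, here is a minimal sketch of the model as plain Python data classes. The class names `Activity` and `DerivedItem` and their fields are illustrative assumptions for this example only; they are not the Synapse client's own classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Activity:
    """Illustrative stand-in for a provenance activity (not the Synapse client class)."""
    name: str
    # Inputs to the step: Synapse IDs or URLs.
    used: List[str] = field(default_factory=list)
    # Code or software that ran: Synapse IDs or URLs.
    executed: List[str] = field(default_factory=list)


@dataclass
class DerivedItem:
    """A result file together with the activity that generated it."""
    name: str
    generated_by: Activity


# A script that takes no input files: nothing is "used", one item is "executed".
activity = Activity(name="Generate random numbers", executed=["syn7205215"])
result = DerivedItem(name="random_numbers.txt", generated_by=activity)
```

In this sketch the results file points back to a single activity, and the activity lists its inputs and its code separately, which mirrors the distinction described above.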

...

Let’s begin with a script that generates a list of normally distributed random numbers and saves the output to a file. Suppose you have an R script called generate_random_data.R and have saved its output to a data file called random_numbers.txt. We’ll upload the files to Synapse and then set their provenance.
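The contents of generate_random_data.R are not shown here. As a rough stand-in, the following Python sketch does the same kind of work: it draws normally distributed values and writes them one per line to random_numbers.txt. The function name, the sample size, and the seed are assumptions made for illustration.

```python
import random


def generate_random_data(path, n=100, mean=0.0, sd=1.0, seed=None):
    """Write n normally distributed values to path, one per line, and return them."""
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    values = [rng.gauss(mean, sd) for _ in range(n)]
    with open(path, "w") as handle:
        for value in values:
            handle.write(f"{value}\n")
    return values


# Assumed parameters: 100 draws from a standard normal distribution.
values = generate_random_data("random_numbers.txt", n=100, seed=42)
```

Fixing the seed is optional, but it means anyone who reruns the recorded code gets the same file, which makes the provenance record more useful.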

Upload a File and Add Provenance

For this example, we’ll use a Project that already exists (Wondrous Research Example: syn1901847). The code file is already saved in Synapse with synID syn7205215, so we only need to upload the data file to this Project. In Synapse terminology, the Project will be the parent of the new entities.
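One way to do this with the Synapse Python client is sketched below. It assumes the `synapseclient` package is installed, that login credentials are already configured locally, and that random_numbers.txt is in the working directory. The wrapper function `upload_with_provenance` is a hypothetical helper introduced for this example.

```python
PROJECT_ID = "syn1901847"  # Wondrous Research Example (parent Project)
CODE_ID = "syn7205215"     # code file already stored in Synapse


def upload_with_provenance(data_path, parent_id=PROJECT_ID, executed_id=CODE_ID):
    """Upload data_path under parent_id and record executed_id as the code that produced it."""
    # Imported inside the function so this module loads even without the client installed.
    import synapseclient
    from synapseclient import File

    syn = synapseclient.Synapse()
    syn.login()  # assumes credentials are available from local configuration

    # The Project becomes the parent of the new File entity.
    data_file = File(path=data_path, parent=parent_id)

    # Passing `executed` attaches provenance at upload time.
    return syn.store(data_file, executed=executed_id)
```

Calling `upload_with_provenance("random_numbers.txt")` would return the stored File entity, whose `id` attribute holds the new synID. Input files, if there were any, could be passed to `syn.store` through its `used` argument in the same way.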

...