...

  • Raw Data or Data: This can be further partitioned to house different types of raw data. For raw data types and formats commonly seen in this location, see How to Format Your Data.

  • Milestone Reports or Reporting: This should house the summary reports that link data files to specific award milestones.

  • Analysis: This can house the protocols, code, and derived results that comprise an analysis performed on raw data.

Info

To make analysis code more reproducible, Docker images can recreate the environment, including the software dependencies and configurations needed for the analysis. Each Synapse project has its own Docker registry to store and distribute its Docker images. See https://help.synapse.org/docs/Synapse-Docker-Registry.2011037752.html.
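As a rough illustration of how an analysis image might reach a project's registry, the sketch below only assembles the Docker CLI commands and does not run them. The registry host `docker.synapse.org`, the `host/project ID/repository:tag` layout, and the helper name `docker_push_commands` are assumptions here and should be checked against the linked Synapse documentation; `syn12345` is a placeholder project ID.

```python
# Assumption: the Synapse Docker registry is reached at this host and images
# are addressed as <host>/<project ID>/<repository>:<tag>. Verify against the
# Synapse Docker Registry documentation before use.
REGISTRY_HOST = "docker.synapse.org"


def docker_push_commands(local_image, project_id, repo_name, tag="latest"):
    """Return the docker CLI commands (as argument lists) to tag and push an image.

    Hypothetical helper: it builds the commands but does not execute them.
    """
    remote = f"{REGISTRY_HOST}/{project_id}/{repo_name}:{tag}"
    return [
        ["docker", "login", REGISTRY_HOST],
        ["docker", "tag", local_image, remote],
        ["docker", "push", remote],
    ]


# Placeholder image name, project ID, and repository name.
for command in docker_push_commands("my-analysis", "syn12345", "my-analysis", "v1"):
    print(" ".join(command))
```

Each list could be passed to `subprocess.run` once the values are confirmed for the project in question.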

New NF community contributions should go into these core containers as delineated. Some older projects or independent projects (not sponsored by one of our funders) may not have this exact top-level scheme. Sometimes a project may create an additional folder to house materials that fall outside the scope of these containers, which is usually not an issue.

More details and examples are provided for each in the following subsections.
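The top-level scheme can be staged locally before upload. The following is a minimal sketch, assuming the first name listed for each container ("Raw Data", "Milestone Reports", "Analysis"); the helper name `create_core_folders` and the staging directory are hypothetical, and the alternate names ("Data", "Reporting") would work equally well.

```python
from pathlib import Path

# Folder names taken from the list of core containers; a project may use the
# alternate names ("Data", "Reporting") instead.
CORE_FOLDERS = ("Raw Data", "Milestone Reports", "Analysis")


def create_core_folders(project_root):
    """Create the core top-level folders under project_root and return their paths."""
    root = Path(project_root)
    created = []
    for name in CORE_FOLDERS:
        folder = root / name
        # exist_ok=True makes the call safe to repeat on an existing layout.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling `create_core_folders("my_project")` creates the three folders under a local `my_project` directory, which can then be mirrored in the Synapse project.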

...

The Synodos NF2 project provides a good working example of organizing multiple raw data types within the Data folder. Here, data are largely grouped by type and release year, illustrating several guidelines:

  • Data type is the first and most important grouping factor. Create separate folders for each data type, e.g. an “RNA-seq” folder that will have .fastq files.

Info

A metadata schema can be applied at the folder level for describing all files within that folder. Since metadata are specific to data types, this is simplest when the files are of the same type.

  • For each data type, the data can be further grouped however makes the most sense for the study. The example above further groups RNA-seq data by release year, but cohort is another reasonable grouping factor when a study has several cohorts.

  • Separate original raw data from processed data. A folder can be created to store the processed versions.
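The guidelines above can be sketched as a small path-building helper: data type first, then an optional study-specific grouping, with processed files kept apart from the originals. The helper name `data_folder_path`, the `processed` subfolder name and its position in the hierarchy, and the year used in the example are all illustrative assumptions, not a prescribed layout.

```python
from pathlib import PurePosixPath


def data_folder_path(data_type, group=None, processed=False):
    """Return the folder path, relative to the project root, for one file.

    data_type: first grouping factor, e.g. "RNA-seq".
    group: optional second factor such as a release year or cohort name.
    processed: if True, place the file under a separate "processed" subfolder
               (hypothetical name) so that raw originals stay untouched.
    """
    path = PurePosixPath("Data") / data_type
    if processed:
        path = path / "processed"
    if group is not None:
        path = path / str(group)
    return path


# Illustrative values only.
print(data_folder_path("RNA-seq", 2016))
print(data_folder_path("RNA-seq", 2016, processed=True))
```

Keeping the data type as the first level means every file under one folder shares a type, which fits the folder-level metadata approach described above.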

Milestone Reports

Analysis

Supplemental folders