When we set up your Synapse project, we add a basic folder structure that data contributors can build upon. This page documents best practices for organizing data and other materials within your NF project. Following these recommendations will make annotating your data easier.

Project Folders

A new Synapse Project is initialized using a default structure with these three folders:

...

What if I have something that is not raw data, milestone report, or analysis?

You can create an additional top-level folder to house materials that fall outside the scope of the pre-generated folders.

Raw Data or Data

This folder is intended to be further partitioned for different types of raw data. We recommend (https://sagebionetworks.jira.com/wiki/spaces/NPD/pages/2137326583/How+to+Upload+Data#3.-Create-a-folder-for-your-data) that you create a new folder within the “Raw Data” folder for each data type.

For raw data types and format recommendations, see How to Format Your Data.
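
As a minimal sketch of the one-folder-per-data-type recommendation, the Python below creates a subfolder for each data type under the Raw Data folder. It assumes the `synapseclient` package with stored credentials; the function names, the Synapse ID, and the data type names are hypothetical placeholders, not part of this guide.

```python
from typing import Callable, Dict, Iterable


def create_data_type_folders(
    data_types: Iterable[str],
    raw_data_id: str,
    make_folder: Callable[[str, str], str],
) -> Dict[str, str]:
    """Create one subfolder per data type and return {folder name: folder ID}.

    `make_folder(name, parent_id)` must create the folder and return its ID.
    """
    created: Dict[str, str] = {}
    for name in data_types:
        name = name.strip()
        if not name or name in created:
            continue  # skip blank names and duplicates
        created[name] = make_folder(name, raw_data_id)
    return created


def synapse_folder_maker() -> Callable[[str, str], str]:
    """Return a folder-creating function backed by Synapse.

    Assumption: synapseclient is installed and login() can find
    stored credentials. The import is deferred so the planning
    function above can be used without it.
    """
    import synapseclient
    from synapseclient import Folder

    syn = synapseclient.login()

    def make(name: str, parent_id: str) -> str:
        # store() creates the folder under the parent and returns the entity
        return syn.store(Folder(name=name, parent=parent_id)).id

    return make
```

For example, `create_data_type_folders(["RNA-seq", "Whole Exome Sequencing"], "syn00000000", synapse_folder_maker())` would create two folders under the folder whose (placeholder) ID is `syn00000000`.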

Working Example

The Synodos NF2 project provides a good working example of how to organize multiple raw data types within Data. It demonstrates several guidelines:

...

Info

A metadata schema can be applied at the folder level to describe all files within that folder (and any sub-folders). Since metadata are specific to data types, having the same type within a folder helps keep metadata valid and consistent.

  • For each data type, the data can be further grouped however makes the most sense for the study. The example above further groups RNA-seq data by release year, but other reasonable factors could be used, e.g., by data type and cohort if there were multiple different cohorts.

  • Original raw data are separated from processed data. A folder can be created to store the processed versions.
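
One way to check the "same data type within a folder" guideline before annotating is sketched below. It assumes you already have a listing of files as (folder path, data type) pairs; the function name, paths, and data type labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def folders_with_mixed_types(
    files: Iterable[Tuple[str, str]],
) -> Dict[str, Set[str]]:
    """Return the folders that hold more than one data type.

    `files` yields (folder_path, data_type) pairs, one per file.
    """
    types_by_folder: Dict[str, Set[str]] = defaultdict(set)
    for folder, data_type in files:
        types_by_folder[folder].add(data_type)
    # Keep only folders where a single folder-level schema would not fit
    return {f: t for f, t in types_by_folder.items() if len(t) > 1}
```

An empty result suggests that each folder can be described by one data-type-specific metadata schema; any folder that is returned is a candidate for splitting.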

...

This can house the protocols, code, and derived results that comprise an analysis performed on raw data.

Info

In addition to the Analysis folder, each project has its own Docker Registry to store and distribute analysis code as Docker images. To make analysis code more reproducible, Docker images can include both the code and the software dependencies and configurations needed to run the analysis. See https://help.synapse.org/docs/Synapse-Docker-Registry.2011037752.html.
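
As a small illustration, the helper below composes the image reference for a project's registry. It assumes the registry host is `docker.synapse.org` and that repositories are addressed by project Synapse ID, as described in the linked documentation; the function name, project ID, and repository name are hypothetical.

```python
def synapse_image_reference(project_id: str, repo_name: str, tag: str = "latest") -> str:
    """Build an image reference for a project's Synapse Docker registry.

    Assumption: references take the form
    docker.synapse.org/<project Synapse ID>/<repository name>:<tag>.
    """
    if not (project_id.startswith("syn") and project_id[3:].isdigit()):
        raise ValueError(f"not a Synapse ID: {project_id!r}")
    return f"docker.synapse.org/{project_id}/{repo_name}:{tag}"
```

The resulting string is what you would pass to `docker tag` and `docker push` after running `docker login docker.synapse.org`.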

...