...

As a user of the portal, you can engage with as many of these stages as fits your needs. Each stage of the data lifecycle manifests on the portal in different ways.

  1. During the data generation phase, data are uploaded to the data storage platform, Synapse. During this phase, the data are typically not available for download on the portal, but some information is exposed, such as study title, study description, and metadata.

  2. Data curation mostly occurs behind the scenes of the portal on our data storage platform, Synapse, but this stage enables data discovery, powering search and exploration on the portal.

  3. Data analysis surfaces on the portal through biological and computational tools and is boosted through information available on the portal such as metadata and provenance.

  4. Data interpretation is enabled through various Synapse features, including wikis and discussion forums, but can also be explored on the portal via published data, associated publications, and tools.

  5. Data dissemination happens through the NF Data Portal, journal publications, and other means of data distribution.

For an in-depth review of the NF Data Portal’s community engagement and structure, please see our article in Scientific Data.

...

At this point, you know what the NF Data Portal is, and you have likely come across the term Synapse. But how do they fit together? Let’s break this down:

Sage Bionetworks

First, there’s Sage Bionetworks, a name you may or may not have come across. While Sage is not a tool you’ll be using, you should know who we are: a non-profit organization based in Seattle, Washington. Sage is dedicated to promoting and advancing open science, as well as engaging patients in the research process. Sage acts as the Data Coordinating Center (DCC) for several different portals, including the NF Data Portal. The scientists, developers, and designers who built the tools you’re using are all employed by Sage. You can learn more about Sage Bionetworks and its initiatives here.

Synapse

In line with its advocacy for open science, Sage developed a software platform called Synapse. This platform enables collaborative data curation and analysis, computational modelling, and more. It lets users upload, store, analyze, and track data in a private space before releasing it to the public-facing NF Data Portal. Think of Synapse as the back end where all the data lives.

NF Data Portal

If Synapse is the back end for data, the NF Data Portal is the front end. It’s essentially the user interface (UI), or entry point, where you view data and other shared content. Data is uploaded into Synapse, where it is processed into a readable form that you can access in the portal.

NF Data Standards

Data standards underpin data sharing and make it possible to successfully explore, access, analyze, and reuse data. Data standards involve:

...

To support data standards, we control the terminology used for values through (meta)data dictionaries and other tools. Using controlled vocabularies and other data standards allows you to find what you’re looking for on the portal, so that you don’t have to search through multiple terms for the same thing. For example, instead of ribonucleic acid sequencing, or RNA-Seq, we use the value RNA-seq.
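To make the idea of a controlled vocabulary concrete, here is a minimal sketch in Python of how free-text terms could be mapped onto a single controlled value. The synonym table and the `normalize_assay` function are hypothetical and written only for illustration; they are not the portal's actual implementation, and the real data dictionary covers far more terms. The sketch assumes that RNA-seq is the controlled value.

```python
from typing import Optional

# Hypothetical synonym table for illustration only; the real data
# dictionary is much larger and is maintained separately.
CONTROLLED_TERMS = {
    "RNA-seq": {"rna-seq", "rnaseq", "rna seq", "ribonucleic acid sequencing"},
}


def normalize_assay(term: str) -> Optional[str]:
    """Return the controlled value for a free-text term, or None if unknown."""
    # Lowercase and collapse whitespace so spelling variants compare equal.
    key = " ".join(term.strip().lower().split())
    for controlled_value, synonyms in CONTROLLED_TERMS.items():
        if key in synonyms:
            return controlled_value
    return None
```

With a table like this, a search for "RNA-Seq" or "ribonucleic acid sequencing" resolves to the same value, which is why one controlled term is enough to find all matching data.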

You can find our full data dictionary here and as regular releases of a JSON-LD file at our open-source GitHub repository.
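Because JSON-LD is ordinary JSON, a released dictionary file can be read with standard tooling. The sketch below lists term labels from a small inline sample. The sample's layout (a top-level `@graph` list whose entries carry an `rdfs:label`) and the `list_labels` helper are assumptions made for illustration; check the structure of the actual release before relying on these keys.

```python
import json

# Hypothetical, heavily simplified stand-in for a released dictionary file.
# The "@graph" / "rdfs:label" layout is an assumption, not a documented format.
SAMPLE_JSONLD = """
{
  "@context": {"rdfs": "http://www.w3.org/2000/01/rdf-schema#"},
  "@graph": [
    {"@id": "ex:RNA-seq", "rdfs:label": "RNA-seq"},
    {"@id": "ex:assay", "rdfs:label": "assay"}
  ]
}
"""


def list_labels(jsonld_text: str) -> list:
    """Return the sorted labels of all graph nodes that define one."""
    doc = json.loads(jsonld_text)
    return sorted(
        node["rdfs:label"]
        for node in doc.get("@graph", [])
        if "rdfs:label" in node
    )
```

To use this with a downloaded release, read the file's contents into a string and pass it to `list_labels`.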