Glossary

There are a variety of terms that you’ll come across throughout the NF Data Portal. Refer back to this page if you’re ever unsure about what something means, or the difference between terms.

Looking for descriptions of metadata and/or annotations? Visit the metadata dictionary instead. And if you need assistance using the metadata dictionary, we have a dedicated page for that here.


Controlled Use Data: While all data uploaded to Synapse is open (see entry for Open Science), some data requires the submission of a Data Use Certificate (DUC). You can find more detailed information on this here.


Controlled Value: a pre-formatted value that must be used exactly as defined. Example: True instead of yes; female instead of woman.
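
As a rough illustration of the idea, the sketch below maps free-text entries onto controlled values. The mapping table and the function name `to_controlled_value` are hypothetical and only mirror the two examples in this entry; the actual allowed values are defined in the metadata dictionary.

```python
# Hypothetical mapping from free-text entries to controlled values.
# The two pairs mirror the examples in this entry; real vocabularies
# are defined in the metadata dictionary.
CONTROLLED_VALUES = {
    "yes": "True",
    "woman": "female",
}
ALLOWED = set(CONTROLLED_VALUES.values())


def to_controlled_value(raw: str) -> str:
    """Return the controlled value for a raw entry, or raise ValueError."""
    text = raw.strip()
    if text in ALLOWED:
        # Already a controlled value: use it exactly as defined.
        return text
    key = text.lower()
    if key in CONTROLLED_VALUES:
        return CONTROLLED_VALUES[key]
    raise ValueError(f"{raw!r} is not a controlled value")
```

Rejecting unknown entries instead of guessing keeps the stored values consistent, which is what makes searching and filtering reliable.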


Data Access: Data access refers to a user's ability to access or retrieve data stored within a database or other repository. Users who have data access can store, retrieve, move or manipulate stored data, which can be stored on a wide range of hard drives and external devices. (Techopedia, https://www.techopedia.com/definition/26929/data-access)


Data Model: a model that organizes data elements and describes their relationships to one another, usually in a graph-based form such as a flowchart diagram. The structure of the data is dictated by the data model. In other words, a data model structures how information is organized and related in a particular context. At Sage Bionetworks, the data model typically refers to CSV or JSON-LD files used by SCHEMATIC (Schema Engine for Manifest Ingress and Curation).


Individual ID: An individual ID is the identifier for a specific individual (human subject or single animal).


File annotations: File annotations are a set of controlled-vocabulary terms associated with data files that describe properties of the data to allow for queries. Also known as metadata, these annotations are essentially extra information about the data so that you can properly search and filter through it.


Governance: Due to the open-access nature of the platform, Synapse operates under comprehensive governance policies that define the rights and responsibilities of Synapse users. This includes our standard operating procedures (SOPs), privacy policy, code of conduct, community standards, and more.


Grant: A grant is represented by a contract number assigned to an NIH-funded project.


Metadata: Metadata provides information about the data in the portal. Metadata can be associated with individual subjects and/or with files. Understanding how metadata works is essential to the successful use of Synapse and the NF Data Portal.

Essentially, metadata is extra information included with actual data that tells the software how and where to store data in the intended way. It’s referred to as “structured” data, since it follows a specific format in the form of a table, which acts like a set of instructions so the software (Synapse) knows what to do with it.

Metadata can be structured or unstructured. While some metadata is part of a file annotation (see file annotation entry for more information), additional structured metadata—including clinical and demographic information on study participants—can be found within a set of .csv files that describe the individuals, specimens, and assays.

There is a whole dictionary dedicated to metadata and annotations. Find it here.


Metadata validation: the act of checking metadata for correct values and formatting.
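
A minimal sketch of what such a check might look like is shown below. The attribute names and rules in `RULES` are hypothetical, chosen only for illustration, and this is not the validation that Synapse or SCHEMATIC actually performs.

```python
# Hypothetical rules: attribute -> (required?, allowed values or None for free text).
RULES = {
    "individualID": (True, None),
    "sex": (True, {"female", "male"}),
    "isCellLine": (False, {"True", "False"}),
}


def validate_row(row: dict) -> list:
    """Return a list of human-readable problems found in one metadata row."""
    problems = []
    for attribute, (required, allowed) in RULES.items():
        value = (row.get(attribute) or "").strip()
        if not value:
            # Empty or absent values are only a problem for required attributes.
            if required:
                problems.append(f"{attribute}: missing required value")
            continue
        if allowed is not None and value not in allowed:
            problems.append(f"{attribute}: {value!r} is not an allowed value")
    return problems
```

An empty list means the row passed every check; otherwise each message names the attribute that needs to be corrected.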


Open Data/Open Science (also referred to as Open Access): Open data represents transparent and accessible knowledge that is shared and developed through collaborative networks, based on the principles of open science. The goal of open science is to make scientific research—including publications, data, physical samples, and software—and its dissemination accessible to all levels of an inquiring society, whether amateur or professional.

The general driving idea behind open science and data is that scientific research can and should be accessible to anyone—because, well, why not? This system benefits all parties involved—the researchers gain wider-reaching recognition and appreciation for their work, the study subjects get to witness the palpable value of providing their personal data, scientists and other professionals are able to use properly funded research to aid in their own research/work, and the general public gains helpful information and knowledge from trusted sources. This is truly a win-win—collective consciousness is a global good!


Program: In our context, a program represents a group of scientists working together towards a common research goal. In this way, “consortium” is often used interchangeably with “program.” An exception is the Community Data Contribution Program, where researchers outside of the funded programs contribute data and other content to the portal.


Project: In our context, a project is typically associated with an NIH grant. So, the terms project and grant are often used interchangeably. However, some projects span multiple grants, or are led by program partners that are not grant-funded.


Schema: A concept that overlaps with the data model, a metadata schema adds further rules and standardization to a data model. It governs how metadata is managed through constraints such as whether an attribute is optional and which values are valid for it.


Specimen ID: A specimen ID is the identifier for a sample from a specific individual – for example, a brain sample from a specific region or a blood sample.


Study: A study is the primary unit of data organization in the portal. Essentially, each study represents an individual research project with specific objectives and focus (one project can operate multiple studies). A study can represent data generated from a specific human cohort, data from experiments on a model system, cross-consortium data processing and analysis efforts, or data associated with a specific publication.


Template: A manifest template is a template, usually an Excel spreadsheet, that lists the metadata attributes to be filled in for a given data type. The columns of the template are the metadata attributes to be collected for the corresponding set of data. In other words, it describes a set of key/value pairs that can be assigned to one or more data files of the same data type.
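
To make the column and key/value relationship concrete, here is a small sketch that uses CSV text in place of a spreadsheet. The column names in `TEMPLATE_COLUMNS` and both helper functions are hypothetical; real templates define their own attributes per data type.

```python
import csv
import io

# Hypothetical attribute list for one data type; real templates define their own columns.
TEMPLATE_COLUMNS = ["fileName", "individualID", "specimenID", "assay"]


def empty_template() -> str:
    """Return CSV text containing only the header row of the template."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(TEMPLATE_COLUMNS)
    return buffer.getvalue()


def rows_as_key_value_pairs(filled_csv: str) -> list:
    """Read a filled-in template and return one attribute/value mapping per file."""
    return list(csv.DictReader(io.StringIO(filled_csv)))
```

Each row of a filled-in template then reads back as one set of key/value pairs, with the column headers as the keys.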

