Metadata is standardized information about your data and is used to annotate the data files in Synapse. Four metadata files are required for each study:

Individual

...

Biospecimen

...

Assay

...

Manifest

UPDATE - May 2023

The following approach will be deprecated by Q3 2023. Stay tuned for updates to the workflow.

The climb database variables that are exported for Synapse are mapped here: https://www.synapse.org/#!Synapse:syn26137185

Metadata Templates

Metadata templates provide guidance for allowed variable keys and values on the dictionary and values worksheets. The AD Metadata Dictionary includes the latest information about allowed values. The latest templates are linked below.

For your metadata to validate successfully, download the latest Metadata Templates and populate them with the relevant information. Once you have populated the metadata fields, export the template worksheet from each file as a plain-text comma-separated values (CSV) file and follow the validation instructions below.

...

Individual Animal

This file contains metadata about each individual animal that is part of a study. Each animal is described in one row, with information that is true of the animal as a whole (e.g., individualID, genotype).

Template: template_individual_animal_model-ad.xlsx
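
Because each animal is described in exactly one row, a quick local check for repeated individualID values can catch mistakes before validation. The following is a minimal sketch in Python, assuming the exported CSV has an individualID column as described above. The function name find_duplicate_ids and the example rows are hypothetical, and this is not the check the dccvalidator itself performs.

```python
import csv
import io
from collections import Counter


def find_duplicate_ids(csv_text, id_column="individualID"):
    """Return the sorted IDs that appear on more than one row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or id_column not in reader.fieldnames:
        raise ValueError(f"missing required column: {id_column}")
    counts = Counter(row[id_column] for row in reader)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical example rows; real files follow the individual animal template.
example = (
    "individualID,genotype\n"
    "mouse-001,genotypeA\n"
    "mouse-002,genotypeA\n"
    "mouse-001,genotypeB\n"
)
duplicates = find_duplicate_ids(example)
print(duplicates)  # ['mouse-001']
```

To check a real export, the file contents can be read with open(path, newline="") and passed to the same function.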

...

Biospecimen

A biospecimen is a sample of cells, tissue, RNA, DNA, etc. This metadata file contains information about each biospecimen that is part of a study, including details like what organ and tissue the specimen is from.

...

Templates:

Multiomics refers to the study of multiple biological systems or processes simultaneously using various techniques that generate different types of data. Here is a summary of some common omics data types, their assays, analyses, and file formats:

Genomics: This involves studying DNA sequencing and analysis, as well as gene expression
profiling through RNA sequencing.

Common assays: Sanger sequencing, whole-genome sequencing (WGS), RNA sequencing (RNA-seq)
Analyses: Variant calling, assembly, alignment to reference genome, differential expression
analysis
File formats: FASTQ, BAM, VCF, TSV
Resources: <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1287035/>,
<https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4694403/>

Proteomics: This refers to the study of proteins and their interactions in biological systems.

Common assays: Mass spectrometry (MS), microarray analysis, immunoprecipitation (IP)
Analyses: Peptide identification, protein quantification, network analysis, pathway enrichment
File formats: MGF, MSG, TAB, PDB
Resources: <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2564307/>,
<https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1893633/>

Metabolomics: This involves studying metabolic pathways and biochemical processes in living
organisms.

Common assays: GC-MS, LC-MS, NMR spectroscopy, mass spectrometry (MS) ionization
Analyses: Metabolite identification, quantification, pathway analysis, network analysis
File formats: CSV, MATLAB, SIMCA, JUPYTER notebooks
Resources: <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1592706/>,
<https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3604083/>

Transcriptomics: This refers to the study of RNA expression patterns in biological systems.

Common assays: RNA sequencing (RNA-seq)
Analyses: Differential expression analysis, gene set enrichment analysis, pathway analysis,
network analysis
File formats: FASTQ, BAM, VCF, TSV
Resources: <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2571094/>,
<https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2600283/>

Epigenomics: This involves studying changes in DNA methylation patterns or other epigenetic
marks that regulate gene expression without changing the underlying DNA sequence.

Common assays: Whole-genome bisulfite sequencing (WGBS), ChIP-seq, ATAC-seq
Analyses: Methylation profiling, ChIP-binding site analysis, histone modification analysis,
network analysis
File formats: BAM, VCF, TSV, MATLAB
Resources: <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2918536/>,
<https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4078701/>

Immunomics: This refers to the study of the immune system and its interactions with other
biological systems.

Common assays: Flow cytometry, mass cytometry, single-cell RNA sequencing (scRNA-seq)
Analyses: Cellular composition analysis, functional annotation, network analysis, clustering
File formats: FCS, MATLAB, R, Seurat
Resources: <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6072154/>,
<https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5803139/>

Metagenomics: This involves studying genetic material from microbial communities in various
environments.

Common assays: Shotgun sequencing
Analyses: Taxonomic classification, functional annotation, network analysis, phylogenetic
reconstruction
File formats: FASTQ, QIIME, MG-RAST, Metagenome Assembly Tool (MAT)
Resources: <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3189520/>,
<https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4790965/>

Each of these omics data types has its own specific requirements for analysis and interpretation,
which may involve specialized software tools or expertise in specific analytical techniques.
However, integrating multiomics data can provide a more comprehensive understanding of biological
systems and mechanisms, leading to new insights into disease and potential therapeutic targets.

Manifest

A tab-delimited manifest file allows you to upload and download many data files, and set their annotations, at once via a Synapse client (the Python, R, or command line client). Each row in the manifest specifies a file to be uploaded and the annotations to be applied to it. To begin, download the manifest template and populate the relevant fields.

Template: template_manifest.xlsx

Specify:

path – the current path of the file to upload
parent – the Curator will provide a Synapse ID of the Staging/ folder where the file will be uploaded.

...

(local, server, cloud)
parentID – each file will have a SynapseID for its staging location
Annotations are key-value pairs that associate metadata with a file and help users find and query data. For more information, see the Synapse Annotation documentation.
Provenance is a means of describing the relationship between raw and processed data. For more information, see the Synapse Provenance documentation. If you are uploading the results of an analysis, you may add a Used column to the manifest to give the Synapse ID(s) of the raw files that went into the analysis. If multiple Synapse IDs should be associated with a processed file, separate them with a semicolon.

Once you have populated the manifest fields, export the template worksheet as a plain-text tab-separated values (TSV) file. With the three metadata CSVs and the manifest TSV complete, you are now ready to validate the four metadata files for the study.
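
The manifest layout described above can also be sketched in code. The following Python example is a minimal illustration, assuming a manifest with only path, parent, and used columns. The exact column names and any required annotation columns should be taken from the manifest template. The function name build_manifest, the file paths, and the syn00000000-style IDs are hypothetical placeholders.

```python
import csv
import io

COLUMNS = ["path", "parent", "used"]


def build_manifest(rows):
    """Serialize manifest rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        record = dict(row)
        # Multiple provenance Synapse IDs are joined with a semicolon.
        record["used"] = ";".join(record.get("used", []))
        writer.writerow(record)
    return buffer.getvalue()


# Hypothetical files and placeholder Synapse IDs, for illustration only.
rows = [
    {"path": "/data/sample1.fastq.gz", "parent": "syn00000000", "used": []},
    {
        "path": "/data/counts.tsv",
        "parent": "syn00000000",
        "used": ["syn00000001", "syn00000002"],
    },
]
manifest_text = build_manifest(rows)
print(manifest_text)
```

The returned text can be written to a .tsv file with open(path, "w", newline=""). Building the file this way is only an alternative to exporting the template worksheet, and the result still needs to pass validation.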

Metadata Validation

Use the dccvalidator to validate your metadata files.

Once your metadata have been validated, upload them to the parentID that your Curator provides by email.

Validating Metadata and Manifest Files

To standardize data submissions and quality control, we’ve built a metadata validation tool, dccvalidator, that performs several data quality checks on metadata templates and manifest files.

Under Species, select the ‘MODEL-AD mouse model’ button.
Under Assay type, select the assay appropriate for your data. If an appropriate assay is unavailable, leave the default selection of rnaSeq.
Use the ‘Browse’ buttons to select the individual animal metadata, biospecimen metadata, assay metadata file, and/or manifest file.

Info
Files that are uploaded to the Metadata Validator will be placed into a private folder on Synapse so that they can be reviewed by the MODEL-AD Data Curation team. They will only be visible to the Data Curation team and will not be shared.

Uploading Study and Assay Descriptions

These documents should be submitted to the dccvalidator to validate the variable keys and controlled vocabulary.

Start by selecting the ‘Study Documentation’ tab on the left side of the window.
Please note that you will be unable to see files after you’ve uploaded them.

The data curation team always checks file versions and date/time stamps and takes the most recent version of a file, so if in doubt, feel free to upload a new version.

Validated metadata can be uploaded to the staging location provided by the DCC Curator. See more information about uploading data.

climbDB

https://www.synapse.org/#!Synapse:syn26137185
