Metadata

...

Metadata is standardized information about your data and will be used to annotate the data files in Synapse. Contributors must submit four metadata files for each study: Individual, Biospecimen, Assay, and Manifest. These are described below with links to the latest templates.

Download the metadata templates and populate the fields relevant for your individual subjects, biospecimens, and assays. The metadata templates provide guidance for allowed variable keys and values on the dictionary and values worksheets. The AD Metadata Dictionary includes the latest information about allowed values. 

In order for your metadata to validate successfully, start by downloading the latest Metadata Templates. Once you have populated the metadata fields, export the template worksheet from each file as a plain-text comma-separated file (CSV) and follow the validation instructions below.

Metadata Templates

Individual Animal

This file contains metadata about each individual animal that is part of a study.  Each animal is described in one row with information that is true of the animal as a whole (e.g., individualID, genotype).

Template: template_individual_animal_model-ad.xlsx

...

A biospecimen is a sample of cells, tissue, RNA, DNA, etc.  This metadata file contains information about each biospecimen that is part of a study, including details like what organ and tissue the specimen is from, its mass, etc.

The biospecimen and individual animal metadata files are linked by the individualID variable. Verify these values are consistent across the two files.  Each individualID may have more than one associated specimenID.  Not all data will have an associated biospecimen – for instance, behavioral or imaging studies may only have records in the individual animal metadata file.

...

Template: template_biospecimen.xlsx

...

Assays

Each assay metadata file contains information about the assay. There are multiple templates, since the information collected will vary by assay.

...

Not all assays have related assay metadata templates, but let the DCC know if you would like to collaborate on the development of new templates.

...

Here’s a graphical representation of the relationship between individualID and specimenID across the different metadata files.  IndividualID should be consistent across the individual animal and biospecimen metadata templates.  SpecimenID should be consistent between the biospecimen and assay metadata files.

...
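The ID relationships described above can be checked before validation with a few lines of Python. This is only a minimal sketch, not part of the dccvalidator: it assumes the exported CSVs carry `individualID` and `specimenID` columns as described, and the helper names (`read_ids`, `missing_ids`) and the sample rows are made up for illustration.

```python
import csv
import io


def read_ids(csv_text, column):
    """Return the set of non-empty values found in one column of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        (row.get(column) or "").strip()
        for row in reader
        if (row.get(column) or "").strip()
    }


def missing_ids(child_csv, parent_csv, column):
    """IDs used in the child file that do not appear in the parent file."""
    return sorted(read_ids(child_csv, column) - read_ids(parent_csv, column))


# Made-up rows standing in for the exported metadata CSVs (hypothetical data).
individual = "individualID,genotype\nmouse1,genotypeA\nmouse2,genotypeB\n"
biospecimen = (
    "individualID,specimenID,organ\n"
    "mouse1,spec1,brain\n"
    "mouse1,spec2,brain\n"
    "mouse3,spec3,brain\n"
)
assay = "specimenID,platform\nspec1,platformX\nspec4,platformX\n"

# individualID must agree between the individual and biospecimen files.
unknown_individuals = missing_ids(biospecimen, individual, "individualID")
# specimenID must agree between the biospecimen and assay files.
unknown_specimens = missing_ids(assay, biospecimen, "specimenID")

print(unknown_individuals)  # ['mouse3']
print(unknown_specimens)    # ['spec4']
```

With real files, the text would come from reading each exported CSV rather than from inline strings. An empty list for both checks means every ID in the downstream file has a match upstream.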

Manifest

A tab-delimited manifest file allows you to upload and download many data files, and set annotations, at once via the Python, R, or command line clients. Each row in the manifest specifies the file to be uploaded and the annotations to be applied. To begin, download the manifest template and populate the relevant fields.

For instance, you must specify:

  • path – the current path of the file to upload

  • parent – the Synapse ID of the Staging/ folder where the file will be uploaded, provided by the Curator

You may also specify other annotations:

  • Annotations are key-value pairs that associate metadata with a file and help users find and query data.  For more info, see the Synapse documentation for annotations.

  • Provenance is a means of describing a relationship between raw and processed data.  For more info, see the Synapse documentation on provenance.  If you are uploading the results of an analysis, you may add a Used column to a manifest to give the Synapse ID(s) of the raw files that went into the analysis.  If multiple Synapse IDs should be associated with a processed file, separate them with a semicolon.

Once you have populated the manifest fields, export the template worksheet as a plain-text tab-separated file (TSV).
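For contributors who assemble the manifest in a script rather than in the template worksheet, the sketch below shows one way to write the same tab-separated layout. It is an illustration under assumptions: the `path` and `parent` columns come from the description above, the lowercase `used` column name is assumed for the Used provenance column, and the file paths and Synapse IDs (`syn00000000` and so on) are placeholders, not real entities.

```python
import csv
import io

# "path" and "parent" are the required columns described above; the exact
# casing of the provenance column ("used") is an assumption in this sketch.
FIELDS = ["path", "parent", "used"]


def write_manifest(rows, handle):
    """Write manifest rows as tab-separated text, one uploaded file per row."""
    writer = csv.DictWriter(
        handle, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        if not row.get("path") or not row.get("parent"):
            raise ValueError("each row needs both 'path' and 'parent'")
        writer.writerow(
            {
                "path": row["path"],
                "parent": row["parent"],
                # Multiple raw-file Synapse IDs are joined with a semicolon.
                "used": ";".join(row.get("used", [])),
            }
        )


# Placeholder paths and Synapse IDs (hypothetical values).
rows = [
    {
        "path": "data/sample1_counts.txt",
        "parent": "syn00000000",
        "used": ["syn00000001", "syn00000002"],
    },
    {"path": "data/sample2_counts.txt", "parent": "syn00000000"},
]

buffer = io.StringIO()
write_manifest(rows, buffer)
manifest_text = buffer.getvalue()
print(manifest_text)
```

To produce a file on disk, pass an open file handle (for example one opened with `newline=""`) instead of the in-memory buffer. Any additional annotation columns from the manifest template would need to be added to `FIELDS`.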

With the three metadata CSVs and manifest TSV complete, you are ready to validate the study metadata.

Metadata Validation

Use the dccvalidator to validate your metadata files.

Once your metadata files have been validated, upload them to the parentID provided by your Curator via email.

Validating Metadata and Manifest Files

To standardize data submissions and quality control, we’ve built a metadata validation tool (dccvalidator) that will perform several data quality checks on metadata templates and manifest files. 

  • Under Species, select the ‘MODEL-AD mouse model’ button.

  • Under Assay type, select the assay appropriate for your data. If an appropriate assay is unavailable, leave the default selection of rnaSeq.

  • Use the ‘Browse’ buttons to select the individual animal metadata, biospecimen metadata, assay metadata file, and/or manifest file.

Info

Files that are uploaded to the Metadata Validator will be placed into a private folder on Synapse so that they can be reviewed by the MODEL-AD Data Curation team.  They will only be visible to the Data Curation team and will not be shared.

Uploading Study and Assay Descriptions

These documents should be submitted to the dccvalidator to validate the variable keys and controlled vocabulary.

  • Start by selecting the ‘Study Documentation’ tab on the left side of the window. 

  • Please note, you will be unable to see files after you’ve uploaded them.

The data curation team will always look at file versions and date/time stamps and take the most recent version of the file, so if in doubt, please feel free to upload a new version.