Requirement

Levels (a)

Format

Notes

DNA

whole genome sequencing

required

raw OR semi-processed

raw: FASTQ, unaligned BAM, CRAM | semi-processed: aligned BAM

whole exome sequencing

required

raw OR semi-processed

raw: FASTQ, unaligned BAM, CRAM | semi-processed: aligned BAM

SNP microarray

required

raw AND processed

raw: CEL, IDAT, tsv (raw values per SNP)

processed: tsv (genotypes per SNP)

immunosequencing

required

raw OR semi-processed

vendor-dependent, e.g. ImmunoSEQ and 10x Genomics formats

Sanger sequencing

optional

processed

RNA expression

RNA sequencing (bulk)

required

(raw OR semi-processed) AND processed

raw: FASTQ, unaligned BAM, CRAM | semi-processed: aligned BAM

processed: counts matrices or quantification files

quantification files: for example, the quant.sf files generated by Salmon-based RNA-seq workflows

RNA sequencing (single-cell)

required

raw AND processed

raw: FASTQ

processed: h5ad/HDF5 format following the cellxgene required format

FASTQ files should be created from BCL files with a program such as cellranger mkfastq.

More documentation on formatting h5ad files can be found here. The h5ad format is a type of HDF5 file.

gene expression microarray

required

raw AND processed

raw: CEL, IDAT, tsv (raw values per SNP, copy number, and loss of heterozygosity)

processed: tsv (normalized values and purity/ploidy)

qPCR

optional

processed

csv/tsv (according to template)

methylation

ATAC sequencing

required

raw OR semi-processed

raw: FASTQ, unaligned BAM, CRAM | semi-processed: aligned BAM

methylation array

required

raw OR semi-processed

raw: FASTQ, unaligned BAM, CRAM | semi-processed: aligned BAM

bisulfite sequencing

required

raw OR semi-processed

raw: FASTQ, unaligned BAM, CRAM | semi-processed: aligned BAM

protein

LC-MS

required

raw AND processed

raw: mzML

processed: protein intensities (csv/tsv)

https://www.psidev.info/mzML

western blot

optional

processed

densitometry output (csv/tsv)

plate-based ELISA

optional

raw

plate reader output (csv/tsv)

protein/peptide microarrays

required

processed

label-free quantification matrix (csv/tsv)

metabolomics

LC-MS

required

raw AND processed

raw: mzML or vendor-dependent format & processed: metabolite intensities (csv/tsv)

clinical

structured clinical data

required

processed

csv/tsv or XML with metadata for each variable

key primary and secondary endpoints only

EEG

required

raw

pending additional comments

clinical/imaging

MRI or other radiological image

required

raw

DICOM, NIfTI, MINC

imaging

immunohistochemistry

required

raw

OME-TIFF (preferred); at minimum, a Bio-Formats-compatible file format

immunofluorescence

required

raw

OME-TIFF (preferred); at minimum, a Bio-Formats-compatible file format

gross morphology photos (mice)

optional

raw

tiff, png, jpg

in vitro drug screening

plate-based cell viability assay

required

processed

csv/tsv (according to template)

other

flow cytometry

optional

raw

FCS with gating parameters

in vivo tumor growth experiments

optional

raw OR processed

csv/tsv (according to template); raw: tumor dimensions or other raw measurements | processed: calculated tumor volume/size

(a) Level nomenclature can be cross-referenced with https://gdc.cancer.gov/resources-tcga-users/tcga-code-tables/data-levels, where 'raw' corresponds to Level 1 and 'semi-processed' most closely corresponds to Level 2.
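
The Levels column combines data levels with AND and OR, which can be checked mechanically when a submission is reviewed. The following is a minimal sketch, not part of any existing tooling: the names LEVEL_RULES, missing_levels, and is_complete are hypothetical, only a handful of rows from the table are encoded, and the bulk RNA sequencing rule assumes the reading "(raw OR semi-processed) AND processed".

```python
# Hypothetical encoding of a few rows of the table (illustrative only).
# Each rule is a list of groups. Every group must be satisfied (AND);
# a group is satisfied by any one of the levels it contains (OR).
LEVEL_RULES = {
    "whole genome sequencing": [{"raw", "semi-processed"}],
    "SNP microarray": [{"raw"}, {"processed"}],
    # Assumption: read as (raw OR semi-processed) AND processed.
    "RNA sequencing (bulk)": [{"raw", "semi-processed"}, {"processed"}],
    "RNA sequencing (single-cell)": [{"raw"}, {"processed"}],
    "in vivo tumor growth experiments": [{"raw", "processed"}],
}


def missing_levels(assay, submitted):
    """Return the level groups not covered by the submitted levels."""
    submitted = set(submitted)
    return [
        sorted(group)
        for group in LEVEL_RULES[assay]
        if not (group & submitted)
    ]


def is_complete(assay, submitted):
    """True when every level group for the assay is covered."""
    return not missing_levels(assay, submitted)


if __name__ == "__main__":
    # A SNP microarray submission with only raw files lacks processed data.
    print(missing_levels("SNP microarray", ["raw"]))
    # An aligned BAM alone satisfies the whole genome sequencing rule.
    print(is_complete("whole genome sequencing", ["semi-processed"]))
```

Representing each rule as a list of sets keeps "raw OR semi-processed" and "raw AND processed" in one structure, so new rows can be added without changing the checking logic.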
