Experimental data is often shared in a variety of formats. Carefully choosing a data format is a great way to extend the impact of your research, by ensuring others can use it in the future.
On this page, we begin by defining the difference between raw data and results. Then, we provide a reference table for different types of data that you might be sharing, followed by a breakdown of the information that we require when uploading your data.
Raw data vs. results
From a reusability perspective, raw data is the most useful to future users. Both results and data can be shared, but data matters more for reproducibility and reuse.
We consider data to be raw or partially processed information from a single sample, depending on the type of experiment.
Results are generally post-analysis information from an aggregate of samples or manuscript figures.
For example, if you are sharing gene expression information, raw data would be the gzip-compressed FASTQ files (.fastq.gz), while differential expression analysis and volcano plots would be considered results. This distinction is well defined for many types of data, but it may be less clear for assays that we encounter less often. "Results" might also be acceptable for assays that do not lend themselves to re-analysis, such as western blotting. We can work with you to help figure this out.
Data reference table
Assay | Preferred file formats |
RNA-seq | .fastq.gz, .bam, .cram |
microarray | platform dependent (e.g. .cel, .idat) |
whole-genome sequencing | .fastq.gz, .bam, .cram |
whole-exome sequencing | .fastq.gz, .bam, .cram (.vcf, .maf acceptable with justification) |
methylation data | Platform dependent: .idat for Illumina arrays, .fastq.gz for sequencing |
western blotting | Individual images or final figure (e.g. samp1.png or samp1.tif) |
microscopy | Microscope’s native imaging format (e.g. nd2, abi), OME-TIFF, lossless image files (e.g. .tif, .png) separated by channel |
In vitro drug screening data | A .csv/.tsv file following this template (see instructions) |
In vivo tumor growth experiments | A .csv/.tsv file following this template (see instructions) |
PK/PD data | .csv/.tsv file - no template exists |
proteomics | platform dependent |
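As a rough illustration of how the table might be applied before upload, the sketch below checks a filename against the preferred extensions for a few of the sequencing assays listed above. The dictionary and the function name `is_preferred_format` are hypothetical and not part of any portal tooling. Only rows with an explicit extension list are included, since platform-dependent assays need case-by-case judgment.

```python
# Preferred extensions for the sequencing assays, copied from the table above.
# Platform-dependent assays are deliberately left out of this sketch.
PREFERRED_FORMATS = {
    "RNA-seq": (".fastq.gz", ".bam", ".cram"),
    "whole-genome sequencing": (".fastq.gz", ".bam", ".cram"),
    "whole-exome sequencing": (".fastq.gz", ".bam", ".cram"),
}


def is_preferred_format(assay, filename):
    """Return True if the filename ends with a preferred extension for the assay."""
    extensions = PREFERRED_FORMATS.get(assay)
    if extensions is None:
        raise KeyError(f"No preferred formats recorded for assay: {assay!r}")
    # str.endswith accepts a tuple of suffixes; compare case-insensitively.
    return filename.lower().endswith(extensions)
```

A call such as `is_preferred_format("RNA-seq", "sample1.fastq.gz")` would pass, while a processed counts table such as `counts.csv` would not and would need to be shared as results instead.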
Data upload requirements
To share your data on the NF Data Portal, we require the following information:
For individual files:
Annotations as defined in the manifest. Only files annotated with the resourceType marked as experimentalData are visible in the portal.
For any multi-file datasets:
Dataset Name: <100 characters
Summary: <1000 characters summarizing the dataset
Dataset ID: Synapse ID of the dataset
Files: number of files in the dataset
Size: size of the dataset in bytes
Disease Focus: NF1, NF2 or Schwannomatosis
Manifestation: disease manifestation(s) under study - for example, the type of tumor
Funding Agency: Funding organization for the dataset
Study Name: <100 characters - same as the Study Name of the associated study
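Before sending dataset information, it may help to check the length limits and allowed values locally. The following is a minimal sketch, not an official validator: the field keys (`name`, `summary`, `disease_focus`) are hypothetical, and it assumes a single Disease Focus value per dataset. The limits and allowed values are taken from the list above.

```python
# Field keys ("name", "summary", "disease_focus") are hypothetical names chosen
# for this sketch; the limits and allowed values come from the list above.
ALLOWED_DISEASE_FOCUS = {"NF1", "NF2", "Schwannomatosis"}
MAX_NAME_CHARS = 100      # Dataset Name: <100 characters
MAX_SUMMARY_CHARS = 1000  # Summary: <1000 characters


def validate_dataset_metadata(meta):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    name = meta.get("name", "")
    if not name:
        problems.append("name is missing")
    elif len(name) >= MAX_NAME_CHARS:
        problems.append(f"name must be fewer than {MAX_NAME_CHARS} characters")

    summary = meta.get("summary", "")
    if not summary:
        problems.append("summary is missing")
    elif len(summary) >= MAX_SUMMARY_CHARS:
        problems.append(f"summary must be fewer than {MAX_SUMMARY_CHARS} characters")

    focus = meta.get("disease_focus")
    if focus not in ALLOWED_DISEASE_FOCUS:
        problems.append(f"disease_focus {focus!r} is not an allowed value")

    return problems
```

Returning a list of problems rather than raising on the first one lets a submitter fix everything in a single pass.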
For any publications or preprints that are linked to this study:
PubMed ID:
DOI (if available):
Disease Focus: NF1, NF2 or Schwannomatosis
Manifestation: disease manifestation(s) under study - for example, the type of tumor
Funding Agency: Funding organization for the publication
For your study:
Note: these are already listed on the portal. Please provide the following information only if you'd like to modify what is already there.
Study Name: <100 characters
Study Status: "Active" or "Completed"
Data Status: "None," "Under Embargo," "Partially Released," or "Published"
Funding Agency: Funding organization for the study
Summary: <1000 characters summarizing the study
Study Leads: PIs and/or key personnel for the study
Institutions: institution(s) at which the work was done
Manifestation: disease manifestation(s) under study - for example, the type of tumor
Disease Focus: NF1, NF2 or Schwannomatosis
Please send this information to nf-osi@sagebionetworks.org