The NF-OSI is piloting a new initiative to uniformly process genomic and transcriptomic data shared on the NF Data Portal. The initiative covers all studies funded by the Neurofibromatosis Therapeutic Acceleration Program that were uploaded between 2018 and 2023. These high-dimensional genomic and transcriptomic data will be processed using standardized data processing pipelines.

The data types in scope for processing include:

  • Whole Exome Sequencing

  • Whole Genome Sequencing

  • Bulk RNA Sequencing

  • Single Cell RNA Sequencing

The uniformly processed data will be shared on the NF Data Portal and will also support the use of NF Data Portal data in other data analysis and exploration platforms.

Thanks to the generosity of the NF-OSI funders, a subset of the NF Data Portal data is currently eligible for reprocessing. If you have shared a dataset on the NF Data Portal and would like to use the NF-OSI processing pipelines, please reach out to us at nf-osi@sagebionetworks.org. We would be happy to assist you on a case-by-case basis.

Eligibility criteria for data files to be staged for processing

For each of the following assays, data files must be annotated with the terms listed below to be staged for processing.

Bulk and Single Cell RNA Sequencing

| # | Annotation term | Additional details |
|---|---|---|
| 1 | fileFormat | Accepted file formats include "fastq", "bam", and "cram". If the raw data are provided in "bam" or "cram" format, only files that have not undergone any additional filtering (i.e., files that retain unmapped reads, have not been trimmed, etc.) are eligible for processing. |
| 2 | individualID | Individual IDs are necessary to create the sample sheets. |
| 3 | specimenID | Specimen IDs are necessary to interpret the analysis. |
| 4 | Assay | Choose between Bulk RNA Seq or Single Cell RNA Seq. |
| 5 | Species | Knowledge of the species is required to select the corresponding genome. |
| 6 | libraryPreparationMethod | The name of the library preparation method, such as KAPA Hyper PCR 3. |
| 7 | Platform | The name of the platform used, for example, Illumina. |
| 8 | readPair | Specify whether the read pair is 1 or 2. |
| 9 | specimenPreparationMethod | Minimize RNA degradation with methods such as flash freezing or RNALater. FFPE is not recommended. |
| 10 | tumorType | If the tissue is normal, indicate "not applicable." Otherwise, specify the tumor type. |
| 11 | isStranded* | This answer should be either "yes" or "no." |
| 12 | readPairOrientation* | Indicate the read pair orientation, such as forward or reverse. |

* optional but recommended
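
The annotation requirements above lend themselves to an automated pre-check before files are staged. The following is a minimal sketch only, not the NF-OSI's actual validation code: it assumes each file's annotations are available as a plain Python dictionary keyed by the terms listed above, and the function name `check_annotations` and the example values are hypothetical.

```python
# Required and recommended terms, copied from the annotation list above.
REQUIRED_TERMS = [
    "fileFormat", "individualID", "specimenID", "Assay", "Species",
    "libraryPreparationMethod", "Platform", "readPair",
    "specimenPreparationMethod", "tumorType",
]
RECOMMENDED_TERMS = ["isStranded", "readPairOrientation"]
ACCEPTED_FORMATS = {"fastq", "bam", "cram"}


def check_annotations(annotations):
    """Return (problems, warnings) for one file's annotation dictionary.

    Hypothetical helper: the dictionary layout is an assumption.
    """
    problems = []
    warnings = []
    # Every required term must be present and non-empty.
    for term in REQUIRED_TERMS:
        value = annotations.get(term)
        if value is None or str(value).strip() == "":
            problems.append(f"missing required annotation: {term}")
    # Only fastq, bam, and cram are accepted file formats.
    file_format = annotations.get("fileFormat")
    if file_format and str(file_format).lower() not in ACCEPTED_FORMATS:
        problems.append(f"unsupported fileFormat: {file_format}")
    # Optional-but-recommended terms produce warnings, not failures.
    for term in RECOMMENDED_TERMS:
        if not annotations.get(term):
            warnings.append(f"recommended annotation not set: {term}")
    return problems, warnings


# Illustrative values only; they are not taken from a real dataset.
example = {
    "fileFormat": "fastq",
    "individualID": "patient_01",
    "specimenID": "patient_01_tumor",
    "Assay": "Bulk RNA Seq",
    "Species": "Human",
    "libraryPreparationMethod": "KAPA Hyper PCR 3",
    "Platform": "Illumina",
    "readPair": "1",
    "specimenPreparationMethod": "flash freezing",
    "tumorType": "not applicable",
}
problems, warnings = check_annotations(example)
print("problems:", problems)
print("warnings:", warnings)
```

Note that this check cannot confirm that a "bam" or "cram" file is unfiltered; that condition still has to be verified against the file itself.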


Whole Genome Sequencing

| # | Annotation term | Additional details |
|---|---|---|
| 1 | fileFormat | Accepted file formats include "fastq", "bam", and "cram". If the raw data are provided in "bam" or "cram" format, only files that have not undergone any additional filtering (i.e., files that retain unmapped reads, have not been trimmed, etc.) are eligible for processing. |
| 2 | individualID | Individual IDs are necessary to create the sample sheets. |
| 3 | specimenID | Specimen IDs are necessary to interpret the analysis. |
| 4 | Assay | Whole Genome Sequencing |
| 5 | Species | Knowledge of the species is required to select the corresponding genome. |
| 6 | libraryPreparationMethod | The name of the library preparation method, such as KAPA Hyper PCR 3. |
| 7 | Platform | The name of the platform used, for example, Illumina. |
| 8 | readPair | Specify whether the read pair is 1 or 2. |
| 9 | specimenPreparationMethod | Minimize RNA degradation with methods such as flash freezing or RNALater. FFPE is not recommended. |
| 10 | tumorType | If the tissue is normal, indicate "not applicable." Otherwise, specify the tumor type. NOTE: Files from samples lacking tumor-normal pairs will not be eligible for somatic variant calling or for microsatellite instability processing. |
| 11 | isStranded* | This answer should be either "yes" or "no." |
| 12 | readPairOrientation* | Indicate the read pair orientation, such as forward or reverse. |

* optional but recommended
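
The tumor-normal requirement noted under tumorType can be checked from the annotations alone. The sketch below is one possible interpretation, not the NF-OSI's method: it assumes that a pair is defined per individualID, that normal tissue is annotated with tumorType "not applicable" as described above, and that the records are plain dictionaries. The function name is hypothetical.

```python
from collections import defaultdict


def individuals_with_tumor_normal_pairs(records):
    """Return the set of individualIDs that have both tumor and normal files.

    Assumption: a tumor-normal pair means the same individualID has at
    least one file with tumorType "not applicable" (normal tissue) and at
    least one file with any other tumorType.
    """
    tumor_types_by_individual = defaultdict(set)
    for record in records:
        tumor_type = record["tumorType"].strip().lower()
        tumor_types_by_individual[record["individualID"]].add(tumor_type)

    paired = set()
    for individual, tumor_types in tumor_types_by_individual.items():
        has_normal = "not applicable" in tumor_types
        has_tumor = any(t != "not applicable" for t in tumor_types)
        if has_normal and has_tumor:
            paired.add(individual)
    return paired


# Illustrative records only; IDs and tumor types are made up.
records = [
    {"individualID": "ind1", "tumorType": "Plexiform Neurofibroma"},
    {"individualID": "ind1", "tumorType": "not applicable"},
    {"individualID": "ind2", "tumorType": "Plexiform Neurofibroma"},
]
print(individuals_with_tumor_normal_pairs(records))
```

Under this assumption, files from individuals outside the returned set would not qualify for somatic variant calling or microsatellite instability processing.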


Whole Exome Sequencing

| # | Annotation term | Additional details |
|---|---|---|
| 1 | fileFormat | Accepted file formats include "fastq", "bam", and "cram". If the raw data are provided in "bam" or "cram" format, only files that have not undergone any additional filtering (i.e., files that retain unmapped reads, have not been trimmed, etc.) are eligible for processing. Note: WES files are not eligible for variant calling if a BED file is not available. |
| 2 | individualID | Individual IDs are necessary to create the sample sheets. |
| 3 | specimenID | Specimen IDs are necessary to interpret the analysis. |
| 4 | Assay | Whole Exome Sequencing |
| 5 | Species | Knowledge of the species is required to select the corresponding genome. |
| 6 | libraryPreparationMethod | The name of the library preparation method, such as KAPA Hyper PCR 3. |
| 7 | Platform | The name of the platform used, for example, Illumina. |
| 8 | readPair | Specify whether the read pair is 1 or 2. |
| 9 | specimenPreparationMethod | Minimize RNA degradation with methods such as flash freezing or RNALater. FFPE is not recommended. |
| 10 | tumorType | If the tissue is normal, indicate "not applicable." Otherwise, specify the tumor type. NOTE: Files from samples lacking tumor-normal pairs will not be eligible for somatic variant calling or for microsatellite instability processing. |
| 11 | isStranded* | This answer should be either "yes" or "no." |
| 12 | readPairOrientation* | Indicate the read pair orientation, such as forward or reverse. |

* optional but recommended
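
The individualID, specimenID, and readPair annotations are described above as inputs for building sample sheets. The sketch below illustrates how paired files might be grouped into sample sheet rows. The column names (`individual`, `specimen`, `read_1`, `read_2`), the `path` key, and the function name are hypothetical; the actual sample sheet layout used by the NF-OSI pipelines is not specified here.

```python
from collections import defaultdict


def build_sample_sheet(files):
    """Group annotated files into one row per (individualID, specimenID).

    Assumption: each file is a dictionary with individualID, specimenID,
    readPair (1 or 2), and a hypothetical "path" key.
    """
    grouped = defaultdict(dict)
    for f in files:
        key = (f["individualID"], f["specimenID"])
        grouped[key][str(f["readPair"])] = f["path"]

    rows = []
    for (individual, specimen), reads in sorted(grouped.items()):
        rows.append({
            "individual": individual,
            "specimen": specimen,
            "read_1": reads.get("1", ""),  # empty if read 1 is missing
            "read_2": reads.get("2", ""),  # empty if read 2 is missing
        })
    return rows


# Illustrative file records only; IDs and paths are made up.
files = [
    {"individualID": "ind1", "specimenID": "ind1_tumor", "readPair": 2,
     "path": "ind1_tumor_R2.fastq.gz"},
    {"individualID": "ind1", "specimenID": "ind1_tumor", "readPair": 1,
     "path": "ind1_tumor_R1.fastq.gz"},
]
for row in build_sample_sheet(files):
    print(row)
```

A row with an empty `read_1` or `read_2` field signals a missing or mis-annotated readPair, which is worth resolving before staging.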


Note

Additional requirement:

The

...

| Assay | Germline SNV | Somatic SNV | Copy Number Variation (CNV) | Structural variants (SV) | Microsatellite Instability (MSI) | Raw counts |
|---|---|---|---|---|---|---|
| WES | ... | ... | ... | ... | ... | NA |
| WGS | ... | ... | ... | ... | ... | NA |
| Bulk RNAseq | NA | NA | NA | NA | NA | ... |
| Single Cell RNAseq | NA | NA | NA | NA | NA | ... |

✅ The workflow is available for this data type.

❌ The workflow is available for this data type, but the NF-OSI will not provide this processing. This decision follows the recommendation of scientists and engineers at Sage who have worked with these data modalities and have noted various problems interpreting processed data from these workflows during downstream analysis.


For a WES dataset to be eligible for processing, the BED file associated with its library preparation method must be uploaded and made available to the NF-OSI Sage team.
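
The BED file requirement can be expressed as a simple lookup from library preparation method to BED file. This is a minimal sketch under stated assumptions: the dataset dictionaries, the `name` key, the mapping `bed_files_by_library_prep`, and the function name are all hypothetical and do not reflect how the NF-OSI tracks BED files.

```python
def wes_datasets_missing_bed(datasets, bed_files_by_library_prep):
    """Return the names of WES datasets with no BED file on record.

    Assumption: each dataset is a dictionary with a hypothetical "name"
    key and a libraryPreparationMethod annotation, and
    bed_files_by_library_prep maps a library preparation method to the
    location of its BED file.
    """
    missing = []
    for dataset in datasets:
        method = dataset["libraryPreparationMethod"]
        if not bed_files_by_library_prep.get(method):
            missing.append(dataset["name"])
    return missing


# Illustrative inputs only; names and paths are made up.
datasets = [
    {"name": "study_A_wes", "libraryPreparationMethod": "prep_kit_1"},
    {"name": "study_B_wes", "libraryPreparationMethod": "prep_kit_2"},
]
bed_files = {"prep_kit_1": "prep_kit_1_targets.bed"}
print(wes_datasets_missing_bed(datasets, bed_files))
```

Datasets returned by this check would need a BED file uploaded before they could be staged for WES processing.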