Note |
---|
Our pipeline configuration is still in development, and the contents of this document are subject to change. |
Summary of Processing
Data Type | Method | Output | Tried yet? |
---|---|---|---|
WES or WGS | DeepVariant | | Yes |
WES or WGS | Strelka, Mutect2, Freebayes | | Yes |
WES or WGS | TBD | Germline and Somatic Structural Variants | No |
WES or WGS | TBD | Germline and Somatic CNV | No |
WES or WGS | TBD | Tumor MSI | No |
SNV, INDEL variants | TBD | No |
Data Type | Method | Output | Tried yet? |
---|---|---|---|
RNA-Seq | STAR with Salmon in alignment-based mode | | Yes |
...
bam/cram to fastq conversion
When fastq files are not available, cram/bam files are converted to fastq using this pipeline: https://github.com/qbic-pipelines/bamtofastq (v1.2.0).
If unaligned bam (u-bam) files are available instead of fastq files, we recommend providing them directly as input to sarek 3.0.
WES and WGS Variant Calling (SNV & INDEL)
...
Raw fastq files are uploaded to Synapse by the researcher in a folder named using the format experiment_name_rnaseq_fastq_date. Filenames must not contain whitespace (use _ in place of any whitespace).
All experiment- and sample-related annotations must be added on Synapse before processing can start. This step is required so that a sample sheet can be generated to trigger the processing workflow.
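The filename rule above can be checked before upload with a short script. This is a minimal sketch, not part of the pipeline: the function names and example filenames are hypothetical, and collapsing each run of whitespace into a single underscore is an assumption about how the rule should be applied.

```python
import re


def has_whitespace(name: str) -> bool:
    """Return True if the filename contains any whitespace character."""
    return re.search(r"\s", name) is not None


def sanitize_filename(name: str) -> str:
    """Drop leading/trailing whitespace, then replace each remaining
    run of whitespace with a single underscore (assumed convention)."""
    return re.sub(r"\s+", "_", name.strip())


# Hypothetical filenames, for illustration only.
examples = ["sample 1_R1.fastq.gz", "sample_2_R1.fastq.gz"]
for original in examples:
    if has_whitespace(original):
        print(f"{original!r} should be renamed to {sanitize_filename(original)!r}")
```

Running a check like this on a local folder before upload avoids having to rename files on Synapse afterwards.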
The sample sheet should be a comma-separated file (.csv) with at least 3 columns and a header row, as shown below. (More information here)
sample | subject | status | sex | file_1 | file_2 | lane | parentId | bed_file | output_parent_Id |
---|---|---|---|---|---|---|---|---|---|
Synapse specimenID | Synapse individualID | 1 (Tumor = 1, Normal = 0) | XX or XY | | | Lane information | Synapse ID of parent folder | Synapse ID of BED file (if WES data) | Synapse ID of folder where all processed files will be indexed |
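As an illustration of the sample sheet layout described above, the following sketch writes a sheet with those column names and runs a few basic checks on each row. It is not the pipeline's own tooling: the helper names, the placeholder values, and the choice of which columns must be non-empty are assumptions made for this example.

```python
import csv
import io

# Column names taken from the sample sheet header described above.
COLUMNS = ["sample", "subject", "status", "sex", "file_1", "file_2",
           "lane", "parentId", "bed_file", "output_parent_Id"]


def write_sample_sheet(rows, handle):
    """Write the header row followed by one CSV line per sample."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


def validate_row(row):
    """Return a list of problems found in one row (empty if none)."""
    problems = []
    if row.get("status") not in {"0", "1"}:
        problems.append("status must be 0 (Normal) or 1 (Tumor)")
    if row.get("sex") not in {"XX", "XY"}:
        problems.append("sex must be XX or XY")
    # Assumption: these three columns are the required minimum.
    for column in ("sample", "subject", "file_1"):
        if not row.get(column):
            problems.append(f"{column} is empty")
    return problems


# Placeholder values only; real sheets use actual Synapse annotations and IDs.
example_row = {
    "sample": "SPECIMEN_ID_PLACEHOLDER",
    "subject": "INDIVIDUAL_ID_PLACEHOLDER",
    "status": "1",
    "sex": "XX",
    "file_1": "FILE_1_PLACEHOLDER",
    "file_2": "FILE_2_PLACEHOLDER",
    "lane": "LANE_PLACEHOLDER",
    "parentId": "PARENT_FOLDER_ID_PLACEHOLDER",
    "bed_file": "BED_FILE_ID_PLACEHOLDER",
    "output_parent_Id": "OUTPUT_FOLDER_ID_PLACEHOLDER",
}

buffer = io.StringIO()
write_sample_sheet([example_row], buffer)
print(buffer.getvalue())
print(validate_row(example_row))
```

Validating rows before the sheet is submitted catches a mistyped status or sex value before it can trigger a failed run.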
...
Currently, germline variant calls in VCF format are processed manually using VEP and vcf2maf.
...
RNA
...
Sequencing Data Quantification
Processing RNA-seq files involves transforming raw data (fastq files) into transcript counts (quant.sf files).
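To show what the transcript-count output looks like downstream, here is a minimal sketch that reads a Salmon quantification file. It assumes Salmon's standard tab-separated layout with the columns Name, Length, EffectiveLength, TPM, and NumReads. The function name and the two-transcript example are made up for illustration.

```python
import csv
import io


def read_quant(handle):
    """Map each transcript name to (TPM, NumReads) from a Salmon
    quantification file opened as a text handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["Name"]: (float(row["TPM"]), float(row["NumReads"]))
        for row in reader
    }


# Made-up values, for illustration only.
example = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1000\t800.0\t12.5\t40.0\n"
    "tx2\t500\t300.0\t0.0\t0.0\n"
)

quant = read_quant(io.StringIO(example))

# Keep only transcripts with non-zero abundance.
expressed = [name for name, (tpm, _) in quant.items() if tpm > 0]
print(expressed)
```

For a real file, pass `open(path, newline="")` in place of the in-memory example.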
...