WGS VARIANT CALLS:

TBD


WES VARIANT CALLS:

A. Germ-line Variant Calling using DeepVariant+Sarek on Nextflow:

This involves transformation of WES fastq or cram files to variant call files in VCF format (.vcf files).

As of Jan 2022, the reference genome used is GRCh38.

The processing steps include the following:

sample	subject	status	sex	file_1	file_2	lane	parentId
Synapse specimenID	Synapse individualID	1 (Tumor = 1, Normal = 0)	XX or XY	synID	synID	Lane information	Synapse folder information
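
As an illustration of the sample sheet layout described above, the following is a minimal Python sketch that checks rows and writes them out. The column names and allowed values (status 0 or 1, sex XX or XY) come from this page. The tab delimiter, the header row, the function name `write_sample_sheet`, and every ID in the example are assumptions made for illustration and may not match the file the pipeline actually expects.

```python
import csv
import io

# Column order as listed in the sample sheet description on this page.
SAREK_COLUMNS = ["sample", "subject", "status", "sex",
                 "file_1", "file_2", "lane", "parentId"]


def write_sample_sheet(rows, handle):
    """Validate rows (dicts keyed by SAREK_COLUMNS) and write them as text.

    Assumption: tab-separated output with a header row.
    """
    for row in rows:
        missing = [c for c in SAREK_COLUMNS if c not in row]
        if missing:
            raise ValueError(f"missing columns: {missing}")
        if row["status"] not in (0, 1):
            raise ValueError("status must be 0 (normal) or 1 (tumor)")
        if row["sex"] not in ("XX", "XY"):
            raise ValueError("sex must be XX or XY")
    writer = csv.DictWriter(handle, fieldnames=SAREK_COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical germ-line (normal) sample; all IDs are placeholders.
example_rows = [{
    "sample": "specimen_A", "subject": "individual_A",
    "status": 0, "sex": "XX",
    "file_1": "syn00000001", "file_2": "syn00000002",
    "lane": "L001", "parentId": "syn00000003",
}]
buffer = io.StringIO()
write_sample_sheet(example_rows, buffer)
sheet_text = buffer.getvalue()
print(sheet_text)
```

Validating before writing means a bad row never produces a partially written sheet.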

nf-core/sarek	v2.7.1
Nextflow	v21.10.5
BWA	N/A
GATK	v4.1.7.0
FreeBayes	v1.3.2
samtools	v1.9
Strelka	v2.9.10
Manta	v1.6.0
TIDDIT	v2.7.1
AlleleCount	v4.0.2
ASCAT	v2.5.2
Control-FREEC	v11.6
msisensor	v0.5
SnpEff	v4.3t
VEP	v99.2
MultiQC	v1.8
FastQC	v0.11.9
bcftools	v1.9
CNVkit	v0.9.6
htslib	v1.9
QualiMap	v2.2.2-dev
Trim Galore	v0.6.4_dev
vcftools	v0.1.16
R	v4.0.2

Estimated costs for germ-line variant calling (per 50 samples):

B. Somatic Variant Calling using Strelka+Mutect2+Freebayes on Nextflow:

TBD

C. Variant Annotation:

Currently, germ-line variant calls in VCF format are processed manually using VEP and vcf2maf.
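
Because this step is run by hand, it may help to see how a single vcf2maf invocation could be assembled. The sketch below only builds the argument list and prints it; it does not run anything. The options shown (`--input-vcf`, `--output-maf`, `--tumor-id`, `--ref-fasta`, `--vep-path`, `--ncbi-build`) are vcf2maf options, but whether they match the settings used for these data is not stated on this page. The helper name `build_vcf2maf_command` and all paths and IDs are hypothetical.

```python
import shlex


def build_vcf2maf_command(input_vcf, output_maf, sample_id,
                          ref_fasta, vep_path, ncbi_build="GRCh38"):
    """Return the argv list for one vcf2maf run (nothing is executed)."""
    return [
        "perl", "vcf2maf.pl",
        "--input-vcf", input_vcf,
        "--output-maf", output_maf,
        "--tumor-id", sample_id,      # sample label written to the MAF
        "--ref-fasta", ref_fasta,
        "--vep-path", vep_path,       # directory containing the VEP script
        "--ncbi-build", ncbi_build,   # GRCh38, per the reference noted above
    ]


# Hypothetical file names and locations, for illustration only.
cmd = build_vcf2maf_command(
    input_vcf="specimen_A.vcf",
    output_maf="specimen_A.maf",
    sample_id="specimen_A",
    ref_fasta="/refs/GRCh38.fa",
    vep_path="/opt/vep",
)
print(shlex.join(cmd))
```

Keeping the command as a list makes it straightforward to pass to `subprocess.run` later without shell quoting issues.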

Estimated costs for germ-line variant annotation (per 50 samples) using VEP:


RNA SEQUENCING DATA QUANTIFICATION:

Processing RNA-seq files involves transforming raw data (fastq files) into transcript counts (quant.sf files).

The quantification software of choice is Salmon.

As of Jan 2022, the reference genome used is GRCh38.
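
To make the output of this step concrete, here is a minimal Python sketch that reads a Salmon quantification file. Salmon writes a tab-separated file with the columns Name, Length, EffectiveLength, TPM, and NumReads. The function name `read_quant_sf`, the transcript names, and the numbers in the example are made up for illustration.

```python
import csv
import io


def read_quant_sf(handle):
    """Map each transcript name to its TPM and estimated read count."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["Name"]: {
            "tpm": float(row["TPM"]),
            "num_reads": float(row["NumReads"]),
        }
        for row in reader
    }


# Hypothetical two-transcript file held in memory instead of on disk.
example = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx_hypothetical_1\t1500\t1320.5\t12.5\t340.0\n"
    "tx_hypothetical_2\t900\t720.0\t0.0\t0.0\n"
)
quant = read_quant_sf(example)

# Transcripts with non-zero abundance.
expressed = [name for name, values in quant.items() if values["tpm"] > 0]
print(expressed)
```

For a real file, pass an open file handle in place of the in-memory example.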

Processing involves the following steps:

sample	single_end	fastq_1	fastq_2	strandedness
sample_id_1	0 (1 if paired-end)	synID	synID	reverse or forward
sample_id_2
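
A sample sheet with the columns above can be checked before it is submitted. The following is a rough sketch, assuming the sheet is comma-separated with a header row (the delimiter is not stated on this page). It checks only what the page describes: `single_end` is 0 or 1, `strandedness` is reverse or forward, and the fastq entries look like Synapse IDs (which begin with "syn"). The function name `validate_rnaseq_sheet` and the IDs in the example are hypothetical.

```python
import csv
import io

RNASEQ_COLUMNS = ["sample", "single_end", "fastq_1", "fastq_2", "strandedness"]


def validate_rnaseq_sheet(handle):
    """Return a list of human-readable problems; empty if none are found."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != RNASEQ_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]
    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if row["single_end"] not in ("0", "1"):
            problems.append(f"line {line_no}: single_end must be 0 or 1")
        if row["strandedness"] not in ("reverse", "forward"):
            problems.append(
                f"line {line_no}: strandedness must be 'reverse' or "
                f"'forward', got {row['strandedness']!r}"
            )
        for column in ("fastq_1", "fastq_2"):
            if not (row[column] or "").startswith("syn"):
                problems.append(f"line {line_no}: {column} is not a synID")
    return problems


# Hypothetical sheet: the second data row has an invalid strandedness.
sheet = io.StringIO(
    "sample,single_end,fastq_1,fastq_2,strandedness\n"
    "sample_id_1,0,syn00000001,syn00000002,reverse\n"
    "sample_id_2,0,syn00000003,syn00000004,sideways\n"
)
problems = validate_rnaseq_sheet(sheet)
for problem in problems:
    print(problem)
```

Returning a list of messages rather than raising on the first error lets all problems in a sheet be fixed in one pass.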

Estimated costs for processing: