Variant calling pipeline

A Snakemake workflow for calling and annotating short variants.
The workflow takes paired-end Illumina short-read data (FASTQ files) as input and outputs annotated variant calls in a VCF file. The input directory contains paired-end Illumina reads from a publicly available SARS-CoV-2 dataset (SRA accession SRR15660643), downsampled to 16,000 read pairs (sample.R1.paired.fq.gz and sample.R2.paired.fq.gz).
A FASTA file with the Wuhan-Hu-1 reference genome (GenBank accession MN908947.3) is included in the reference directory (MN908947.3.fasta), along with the VEP cache needed to annotate genomic features.
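
Before running the workflow, it can help to confirm that the bundled files are in place. The following is a minimal sketch, not part of the pipeline: the file names come from the description above, while the directory names `input` and `reference` and the helper names `count_fastq_reads` and `check_inputs` are assumptions made for illustration.

```python
import gzip
from pathlib import Path

# File names are taken from the README; the directory names "input" and
# "reference" are assumptions about the repository layout.
EXPECTED_FILES = [
    Path("input/sample.R1.paired.fq.gz"),
    Path("input/sample.R2.paired.fq.gz"),
    Path("reference/MN908947.3.fasta"),
]


def count_fastq_reads(path):
    """Count records in a gzipped FASTQ file, assuming 4 lines per record."""
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def check_inputs(root="."):
    """Return the expected files that are missing under `root`."""
    return [p for p in EXPECTED_FILES if not (Path(root) / p).is_file()]


if __name__ == "__main__":
    missing = check_inputs()
    for path in missing:
        print(f"missing: {path}")
    # Report read counts only for FASTQ files that are actually present.
    for path in EXPECTED_FILES[:2]:
        if path not in missing:
            print(f"{path}: {count_fastq_reads(path)} reads")
```

Run from the repository root, both FASTQ files should report the same number of reads, since they hold the two mates of each pair.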

Usage

git clone https://github.com/LorenaDerezanin/pipeline_test

Step 1: Install Miniconda

Miniconda is a minimal conda installer. The pipeline runs in an isolated conda environment to avoid dependency conflicts and ensure reproducibility.

Step 2 (Recommended): Install mamba - faster package manager

conda install mamba -n base -c conda-forge

Installing mamba is recommended because it speeds up environment setup. Mamba is a faster package manager than conda: it downloads packages in parallel and resolves dependencies more quickly. If you continue with conda instead, replace mamba with conda in the Step 3 command.

Step 3: Recreate conda environment

cd pipeline_test/

mamba env create -n snek -f envs/snek.yml

Step 4: Activate environment

conda activate snek

Step 5: Run pipeline

snakemake --use-conda --cores 4 --verbose

The value 4 is the suggested number of --cores for a local run; increase it when running on a cluster.
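
One way to adapt the core count to the machine is to build the command programmatically. This is a sketch under stated assumptions: the helper `build_snakemake_command` is hypothetical and not part of the repository, and the optional `--dry-run` flag (a standard Snakemake option that lists jobs without executing them) is an addition not used in the command above.

```python
import os


def build_snakemake_command(cores=None, dry_run=False):
    """Build the Step 5 command as an argument list.

    If `cores` is None, fall back to the number of CPUs reported by the
    operating system (or 1 if it cannot be determined).
    """
    if cores is None:
        cores = os.cpu_count() or 1
    cmd = ["snakemake", "--use-conda", "--cores", str(cores), "--verbose"]
    if dry_run:
        # --dry-run lists the jobs Snakemake would run without executing them.
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_snakemake_command(cores=4, dry_run=True)))
```

The returned list can be passed to `subprocess.run` from inside the activated snek environment.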

Troubleshooting

If conda fails to install Snakemake v6.15, install it with mamba: mamba install snakemake.
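
To see whether Snakemake ended up in the active environment, a small check like the one below may help. It is an illustrative sketch only: the helper `tool_version` is hypothetical, and it relies solely on the standard `snakemake --version` command.

```python
import shutil
import subprocess


def tool_version(executable="snakemake"):
    """Return the output of `<executable> --version`, or None if unavailable."""
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    version = tool_version()
    if version is None:
        print("snakemake not found; try: mamba install snakemake")
    else:
        print(f"snakemake {version}")
```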

Pipeline content

The workflow uses the following bioinformatics tools, run as Snakemake wrappers obtained from the Snakemake Wrappers Repository:

fastQC
multiQC
trim_galore
bwa
samtools
picard
freebayes
bcftools
vep

to do:
- Docker container + conda/mamba
- AWS/Google cloud deployment
- unit tests
