Authors: Qian Liu, Another Author.
Last modified: July 27, 2023.
The workshop format is a 45-minute session consisting of hands-on demos, exercises, and Q&A.
For somatic variant calling, we need to prepare the following:
.bam and .bam.bai files
Reference genome (b37 or hg38)
Mutect2, to call somatic SNVs and indels via local assembly of haplotypes (ref)
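To make the role of each item concrete, here is a minimal R sketch of how the pieces above might fit into a single `gatk Mutect2` call. It assumes a tumor sample with a matched normal, which the workshop text does not specify, and every file name and the sample name are hypothetical placeholders. The command is only assembled and printed, not run.

```r
# Hypothetical inputs; substitute real paths.
tumor_bam  <- "tumor.bam"       # index expected alongside as tumor.bam.bai
normal_bam <- "normal.bam"      # index expected alongside as normal.bam.bai
reference  <- "hg38.fa"         # or a b37 FASTA
normal_id  <- "normal_sample"   # assumed sample name of the normal BAM

# Assemble the arguments for `gatk Mutect2` (tumor with matched normal).
mutect2_args <- c("Mutect2",
                  "-R", reference,
                  "-I", tumor_bam,
                  "-I", normal_bam,
                  "-normal", normal_id,
                  "-O", "somatic.vcf.gz")

# Build the command line as a string for inspection instead of running it.
cmd <- paste("gatk", paste(mutect2_args, collapse = " "))
cat(cmd, "\n")

# To actually run it (requires GATK4 on the PATH):
# system2("gatk", mutect2_args)
```

Keeping the arguments in a character vector makes it straightforward to pass them to `system2()` later without shell quoting problems.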
We also want the data analysis workflow to be reproducible:
The first can be addressed with workflow languages (e.g., CWL, WDL, Snakemake). There are no comparable tools for the second task.
In this workshop, I will demonstrate two Bioconductor packages: Rcwl, an R interface for CWL, and RcwlPipelines, which provides more than 200 pre-built bioinformatics tools and best-practice pipelines in R that are easy to use and highly customizable. I will also introduce the R/Bioconductor package ReUseData for the management of reusable genomic data.
With these tools, we should be able to conduct reproducible data analysis using commonly used bioinformatics tools (including command-line based tools and R/Bioconductor packages) and validated, best practice workflows (based on workflow languages such as CWL) within a unified R programming environment.
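As a small illustration of what "an R interface for CWL" can look like, the sketch below wraps the `echo` command as a CWL tool with Rcwl. It assumes the Rcwl package is installed from Bioconductor; the input id `sth` is an arbitrary illustrative name, and actually running the tool additionally assumes a CWL runner such as cwltool is available on the system, so that step is left commented out.

```r
# Assumes Rcwl is installed, e.g. via BiocManager::install("Rcwl").
library(Rcwl)

# Describe one input of the command-line tool; "sth" is an arbitrary id.
input1 <- InputParam(id = "sth")

# Wrap `echo` as a CWL tool object with that single input.
echo <- cwlProcess(baseCommand = "echo",
                   inputs = InputParamList(input1))

# Assign a value to the input, as one would before running the tool.
echo$sth <- "Hello World!"

# Print the tool definition for inspection.
print(echo)

# Running the tool requires a CWL runner (e.g. cwltool) on the system:
# res <- runCWL(echo, outdir = tempdir())
```

The same pattern of declaring inputs, wrapping a base command, and then assigning values applies to larger tools; pre-built definitions from RcwlPipelines follow this structure so they can be inspected and customized in R.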