Chapter 7 DNAseq alignment
The pipeline tl_bwaMRecal can be used to preprocess fastq files
from DNA sequencing. It accepts paired fastq files and their read
groups from multiple batches as input.
## markdup loaded
## inputs:
## outBam (string):
## RG (string):
## threads (int):
## Ref (File):
## FQ1s (File):
## FQ2s (File):
## knowSites:
## type: array
## prefix:
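As a rough illustration of what "paired fastq files and read groups from multiple batches" could look like when the inputs are prepared, the sketch below checks that each batch has one FQ1 file, one FQ2 file and one read group. The helper names `make_read_group` and `pair_batches` and the file names are hypothetical, and how the pipeline pairs these inputs internally is not shown in this chapter. The read group uses literal `\t` separators, which is the form `bwa mem -R` accepts.

```python
def make_read_group(rg_id, sample, platform="ILLUMINA"):
    """Build a SAM @RG header line with literal backslash-t separators."""
    fields = ["@RG", f"ID:{rg_id}", f"SM:{sample}", f"PL:{platform}"]
    return "\\t".join(fields)


def pair_batches(fq1s, fq2s, read_groups):
    """Group per-batch inputs, failing early if the lists are misaligned."""
    if not (len(fq1s) == len(fq2s) == len(read_groups)):
        raise ValueError("FQ1s, FQ2s and RG must have one entry per batch")
    return [
        {"FQ1": fq1, "FQ2": fq2, "RG": rg}
        for fq1, fq2, rg in zip(fq1s, fq2s, read_groups)
    ]


# Hypothetical example: one sample sequenced in two batches.
batches = pair_batches(
    ["s1_b1_R1.fq.gz", "s1_b2_R1.fq.gz"],
    ["s1_b1_R2.fq.gz", "s1_b2_R2.fq.gz"],
    [make_read_group("b1", "s1"), make_read_group("b2", "s1")],
)
for batch in batches:
    print(batch)
```

Checking the lengths before launching anything is a cheap way to catch a mispaired batch, which would otherwise only surface after a long alignment run.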
The pipeline includes three steps: BWA alignment, duplicate marking, and base recalibration. Each step is either a single tool or a sub-pipeline that chains several tools.
## List of length 3
## names(3): bwaAlign markdup BaseRecal
bwaAlign: the BWA alignment step is a sub-pipeline that includes the following tools:
## List of length 4
## names(4): bwa sam2bam sortBam idxBam
bwa: to align fastqs and read groups to the reference genome with bwa.
sam2bam: to convert the alignments from “sam” to “bam” format with samtools.
sortBam: to sort the “bam” file by coordinates with samtools.
idxBam: to index the “bam” file with samtools.
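The four tools above correspond roughly to the command lines sketched below. This is only an approximation under stated assumptions: the function name `bwa_align_commands`, the output file names and the thread count are hypothetical, and the exact arguments the sub-pipeline passes are not shown in this chapter. The code only builds and prints the commands; it does not run them.

```python
import shlex


def bwa_align_commands(ref, fq1, fq2, read_group, prefix, threads=4):
    """Return the four alignment commands as argv lists (nothing is executed)."""
    sam = f"{prefix}.sam"
    bam = f"{prefix}.bam"
    sorted_bam = f"{prefix}.sorted.bam"
    return [
        # bwa: align paired reads; "-o" names the SAM file in recent bwa
        # releases (older releases write the SAM to stdout instead).
        ["bwa", "mem", "-t", str(threads), "-R", read_group, "-o", sam, ref, fq1, fq2],
        # sam2bam: convert SAM to BAM.
        ["samtools", "view", "-b", "-o", bam, sam],
        # sortBam: sort by coordinate (the samtools default sort order).
        ["samtools", "sort", "-o", sorted_bam, bam],
        # idxBam: index the sorted BAM.
        ["samtools", "index", sorted_bam],
    ]


cmds = bwa_align_commands(
    "ref.fa", "s1_R1.fq.gz", "s1_R2.fq.gz", "@RG\\tID:b1\\tSM:s1", "s1"
)
for argv in cmds:
    print(shlex.join(argv))
```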
markdup: the MarkDuplicates step runs a single command-line tool, Picard, which identifies duplicate reads.
## class: cwlProcess
## cwlClass: CommandLineTool
## cwlVersion: v1.0
## baseCommand: picard MarkDuplicates
## requirements:
## - class: DockerRequirement
## dockerPull: quay.io/biocontainers/picard:2.21.1--0
## inputs:
## ibam (File): I=
## obam (string): O=
## matrix (string): M=
## outputs:
## mBam:
## type: File
## outputBinding:
## glob: $(inputs.obam)
## Mat:
## type: File
## outputBinding:
## glob: $(inputs.matrix)
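Read together, the baseCommand and the `I=`, `O=` and `M=` prefixes in the tool definition suggest a command line of roughly the following shape. The sketch assumes each prefix is joined directly to its value, and the function name `markdup_command` and the file names are hypothetical. The command is built and printed, not executed.

```python
import shlex


def markdup_command(ibam, obam, matrix):
    """Assemble a Picard MarkDuplicates call from the three tool inputs."""
    return [
        "picard", "MarkDuplicates",
        f"I={ibam}",    # input BAM
        f"O={obam}",    # output BAM with duplicates marked
        f"M={matrix}",  # duplication metrics file
    ]


markdup_argv = markdup_command("s1.sorted.bam", "s1.markdup.bam", "s1.markdup.txt")
print(shlex.join(markdup_argv))
```

The two outputs of the tool, mBam and Mat, are then found by globbing for the names given as `obam` and `matrix`.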
BaseRecal: alignment recalibration is a sub-pipeline that runs tools from the GATK toolkit, followed by samtools for indexing and summary statistics.
## List of length 5
## names(5): BaseRecalibrator ApplyBQSR samtools_index samtools_flagstat samtools_stats
BaseRecalibrator and ApplyBQSR: alignment recalibration by the GATK toolkit.
samtools_index: to index the bam file with samtools.
samtools_flagstat and samtools_stats: to summarize alignments with samtools.
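The five tools of this sub-pipeline could be approximated by the GATK4 and samtools calls sketched below. The function name `base_recal_commands` and all file names are hypothetical, and the arguments actually used by the sub-pipeline are not shown in this chapter. Each entry pairs a command with the file its stdout would be redirected to, since `samtools flagstat` and `samtools stats` print their reports to stdout. Nothing is executed.

```python
import shlex


def base_recal_commands(bam, ref, known_sites, prefix):
    """Return (argv, stdout_file) pairs for the recalibration step."""
    table = f"{prefix}.recal.table"
    out_bam = f"{prefix}.recal.bam"
    recal = ["gatk", "BaseRecalibrator", "-I", bam, "-R", ref, "-O", table]
    for vcf in known_sites:
        # One --known-sites argument per known-variants file.
        recal += ["--known-sites", vcf]
    apply_bqsr = [
        "gatk", "ApplyBQSR", "-I", bam, "-R", ref,
        "--bqsr-recal-file", table, "-O", out_bam,
    ]
    return [
        (recal, None),
        (apply_bqsr, None),
        (["samtools", "index", out_bam], None),
        (["samtools", "flagstat", out_bam], f"{prefix}.flagstat.txt"),
        (["samtools", "stats", out_bam], f"{prefix}.stats.txt"),
    ]


recal_steps = base_recal_commands(
    "s1.markdup.bam", "ref.fa", ["dbsnp.vcf.gz", "mills.vcf.gz"], "s1"
)
for argv, stdout_file in recal_steps:
    line = shlex.join(argv)
    print(f"{line} > {stdout_file}" if stdout_file else line)
```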
The outputs of the bwaMRecal pipeline are the duplication metrics file
(the output named matrix) from the markdup step, and the final processed
bam file together with the flagstat and stats summary files from the
BaseRecal step.
## outputs:
## BAM:
## type: File
## outputSource: BaseRecal/rcBam
## matrix:
## type: File
## outputSource: markdup/Mat
## flagstat:
## type: File
## outputSource: BaseRecal/flagstat
## stats:
## type: File
## outputSource: BaseRecal/stats