Chapter 7 DNAseq alignment
The pipeline tl_bwaMRecal preprocesses FASTQ files from DNA sequencing. It accepts paired FASTQ files and read groups from multiple batches as input.
## markdup loaded
## inputs:
##   outBam (string):
##   RG (string):
##   threads (int):
##   Ref (File):
##   FQ1s (File):
##   FQ2s (File):
##   knowSites:
##     type: array
##     prefix:
The pipeline has three steps: BWA alignment, duplicate marking, and base recalibration. Each step is either a single tool or a sub-pipeline that includes several tools.
## List of length 3
## names(3): bwaAlign markdup BaseRecal
bwaAlign: The BWA alignment step is a sub-pipeline that includes the following tools:
## List of length 4
## names(4): bwa sam2bam sortBam idxBam
bwa: aligns the FASTQ files, with their read groups, to the reference genome with bwa.
sam2bam: converts the alignments from SAM to BAM format with samtools.
sortBam: sorts the BAM file by coordinate with samtools.
idxBam: indexes the BAM file with samtools.
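The four bwaAlign tools correspond roughly to the command lines sketched below. This is a minimal illustration in Python, not the pipeline's actual CWL definition: the helper name bwa_align_commands, the file names, and the read group string are hypothetical, and the exact flags the packaged tools pass may differ. The sketch only builds and prints the commands; it does not run them.

```python
import shlex


def bwa_align_commands(ref, fq1, fq2, read_group, threads=4, prefix="sample"):
    """Build the bwa -> sam2bam -> sortBam -> idxBam chain as argv lists.

    Intermediate file names are derived from the hypothetical `prefix`.
    Nothing is executed here.
    """
    sam = f"{prefix}.sam"
    bam = f"{prefix}.bam"
    sorted_bam = f"{prefix}.sorted.bam"
    return [
        # bwa: align paired reads; -R attaches the read group header line
        ["bwa", "mem", "-t", str(threads), "-R", read_group, "-o", sam, ref, fq1, fq2],
        # sam2bam: convert SAM to BAM
        ["samtools", "view", "-b", "-o", bam, sam],
        # sortBam: sort by coordinate (the samtools sort default)
        ["samtools", "sort", "-o", sorted_bam, bam],
        # idxBam: index the sorted BAM
        ["samtools", "index", sorted_bam],
    ]


# Hypothetical inputs; bwa expects the tab separators written as a literal "\t".
for argv in bwa_align_commands(
    "ref.fa", "s1_R1.fq.gz", "s1_R2.fq.gz", r"@RG\tID:rg1\tSM:s1", prefix="s1"
):
    print(shlex.join(argv))
```

Returning argv lists rather than shell strings keeps the read group intact as a single argument, which matters because it contains backslashes.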
markdup: The MarkDuplicates step runs a single command-line tool, Picard, that identifies duplicate reads.
## class: cwlProcess
## cwlClass: CommandLineTool
## cwlVersion: v1.0
## baseCommand: picard MarkDuplicates
## requirements:
## - class: DockerRequirement
##   dockerPull: quay.io/biocontainers/picard:2.21.1--0
## inputs:
##   ibam (File): I=
##   obam (string): O=
##   matrix (string): M=
## outputs:
## mBam:
##   type: File
##   outputBinding:
##     glob: $(inputs.obam)
## Mat:
##   type: File
##   outputBinding:
##     glob: $(inputs.matrix)
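To make the tool description above concrete, the sketch below assembles the command line it implies and the files its two outputs would match. It is an illustration only: the helper name markdup_command and the file names are hypothetical, and it assumes each prefix (I=, O=, M=) is joined directly to its value, as Picard's KEY=VALUE argument syntax requires.

```python
def markdup_command(ibam, obam, matrix):
    """Return the argv for the markdup step and the files its outputs glob for."""
    # baseCommand followed by the three prefixed inputs
    argv = ["picard", "MarkDuplicates", f"I={ibam}", f"O={obam}", f"M={matrix}"]
    # mBam globs $(inputs.obam) and Mat globs $(inputs.matrix), so each
    # output is located by the file name passed in as an input string
    expected_outputs = {"mBam": obam, "Mat": matrix}
    return argv, expected_outputs


# Hypothetical file names
argv, outputs = markdup_command("s1.sorted.bam", "s1.markdup.bam", "s1.markdup.txt")
print(" ".join(argv))
print(outputs)
```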
BaseRecal: The alignment recalibration step is a sub-pipeline that runs tools from the GATK toolkit, followed by samtools for indexing and summarizing the result.
## List of length 5
## names(5): BaseRecalibrator ApplyBQSR samtools_index samtools_flagstat samtools_stats
BaseRecalibrator and ApplyBQSR: recalibrate the alignments with the GATK toolkit.
samtools_index: indexes the BAM file with samtools.
samtools_flagstat and samtools_stats: summarize the alignments with samtools.
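The five BaseRecal tools can be pictured as the command sequence below. This is a hedged sketch rather than the sub-pipeline's real definition: the helper name base_recal_commands and all file names are hypothetical, and the flags shown are the standard GATK4 and samtools ones, which may not match the packaged tools exactly. Because knowSites is an array input, the sketch repeats --known-sites once per file.

```python
import shlex


def base_recal_commands(bam, ref, known_sites, prefix="sample"):
    """Build the BaseRecal chain as (argv, stdout_file) pairs.

    stdout_file is None when the tool writes its own output file, and a
    file name when the tool prints its report to standard output.
    """
    table = f"{prefix}.recal.table"
    recal_bam = f"{prefix}.recal.bam"
    known = []
    for vcf in known_sites:
        known += ["--known-sites", vcf]  # one flag per known-sites file
    return [
        # BaseRecalibrator: model base quality errors against known variant sites
        (["gatk", "BaseRecalibrator", "-I", bam, "-R", ref, *known, "-O", table], None),
        # ApplyBQSR: write a BAM with recalibrated base qualities
        (["gatk", "ApplyBQSR", "-I", bam, "-R", ref,
          "--bqsr-recal-file", table, "-O", recal_bam], None),
        # samtools_index: index the recalibrated BAM
        (["samtools", "index", recal_bam], None),
        # samtools_flagstat and samtools_stats: summaries printed to stdout
        (["samtools", "flagstat", recal_bam], f"{prefix}.flagstat.txt"),
        (["samtools", "stats", recal_bam], f"{prefix}.stats.txt"),
    ]


# Hypothetical inputs
for argv, stdout_file in base_recal_commands(
    "s1.markdup.bam", "ref.fa", ["dbsnp.vcf.gz", "indels.vcf.gz"], prefix="s1"
):
    line = shlex.join(argv)
    if stdout_file:
        line += f" > {stdout_file}"
    print(line)
```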
The output of the bwaMRecal pipeline includes the duplicate metrics file (the matrix output) from the markdup step, and the final processed BAM file together with the flagstat and stats summary files from the BaseRecal step.
## outputs:
## BAM:
##   type: File
##   outputSource: BaseRecal/rcBam
## matrix:
##   type: File
##   outputSource: markdup/Mat
## flagstat:
##   type: File
##   outputSource: BaseRecal/flagstat
## stats:
##   type: File
##   outputSource: BaseRecal/stats
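Each outputSource above has the form step/output, naming the step that produces the file and that step's own output. As a small illustration, the sketch below groups the four pipeline outputs by producing step. The dictionary is copied from the listing above; the helper name outputs_by_step is hypothetical.

```python
# Pipeline output name -> outputSource, copied from the listing above
OUTPUT_SOURCES = {
    "BAM": "BaseRecal/rcBam",
    "matrix": "markdup/Mat",
    "flagstat": "BaseRecal/flagstat",
    "stats": "BaseRecal/stats",
}


def outputs_by_step(sources):
    """Group pipeline outputs by the step that produces them."""
    grouped = {}
    for name, source in sources.items():
        step, step_output = source.split("/", 1)
        grouped.setdefault(step, []).append((name, step_output))
    return grouped


for step, produced in outputs_by_step(OUTPUT_SOURCES).items():
    print(step, produced)
```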