Constructor function for a data recipe
Arguments
- shscript
Character string. Either the file path to a user-provided shell script or the script content itself, to be converted into a data recipe.
- paramID
Character vector. The user specified parameter ID for the recipe.
- paramType
Character vector specifying the type for each paramID. A single parameter can take multiple types, given as a list. Valid values are "int" for integer, "boolean" for boolean, "float" for numeric, "File" for file path, "File[]" for an array of files, etc. Can also take "double", "long", "null", "Directory". See details.
- outputID
the ID for each output.
- outputType
the output type for each output.
- outputGlob
the glob pattern of output files. E.g., "hg19.*".
- requireTools
the command-line tools to be used for data processing/curation in the user-provided shell script. The value here must exactly match the tool name. E.g., "bwa", "samtools", etc. A particular version of that tool can be specified in the format of "tool=version", e.g., "samtools=1.3".
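The value formats described for outputGlob and requireTools can be explored in base R. The sketch below is illustrative only and is not how recipeMake handles these arguments internally: it assumes the glob follows ordinary shell-style wildcard semantics (approximated here with utils::glob2rx), and the file names and the split on "=" are hypothetical.

```r
# Hypothetical file names; not produced by any real recipe.
files <- c("hg19.fa", "hg19.fa.fai", "hg38.fa")

# Approximate the glob "hg19.*" with a regular expression
# (assumption: ordinary shell-style wildcard semantics).
matched <- files[grepl(utils::glob2rx("hg19.*"), files)]

# Split "tool=version" specifications into a name and an optional version.
tools <- c("bwa", "samtools=1.3")
parts <- strsplit(tools, "=", fixed = TRUE)
toolName <- vapply(parts, `[`, character(1), 1)
toolVersion <- vapply(
    parts,
    function(p) if (length(p) > 1) p[2] else NA_character_,
    character(1)
)
```

Here matched keeps only the two hg19 files, and toolVersion is NA for a tool given without a version.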
Value
A data recipe of the cwlProcess S4 class, holding all details
about the shell script for data processing/curation: its inputs,
outputs, required tools, and corresponding docker files. It can be
passed directly to getData(), which evaluates the included shell
script and generates the data locally. Find more details with
?Rcwl::cwlProcess.
Details
For parameter types, more details can be found here: "https://www.commonwl.org/v1.2/CommandLineTool.html#CWLType".
recipeMake is a convenience function for wrapping a shell script
into a data recipe (in cwlProcess S4 class). Please use
Rcwl::cwlProcess for more options and functionalities,
especially when the recipe gets complicated, e.g., needs a
docker image for a command-line tool, or one parameter takes
multiple types, etc. Refer to this recipe as an example:
https://github.com/rworkflow/ReUseDataRecipe/blob/master/reference_genome.R
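Example 1 in the Examples section reads its inputs as $1 and $2, listed in the same order as paramID. Assuming that positional convention holds in general (an inference from that example, not a documented guarantee), the pairing can be sketched in base R:

```r
paramID <- c("input", "outfile")

# Assumption: the i-th paramID is passed to the script as positional
# argument $i, as Example 1 suggests.
positional <- setNames(paste0("$", seq_along(paramID)), paramID)
```

Under that assumption, reordering paramID without reordering the assignments in the script would bind values to the wrong shell variables.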
Examples
if (FALSE) {
library(Rcwl)
##############
### example 1
##############
script <- "
input=$1
outfile=$2
echo \"Print the input: $input\" > $outfile.txt
"
rcp <- recipeMake(shscript = script,
                  paramID = c("input", "outfile"),
                  paramType = c("string", "string"),
                  outputID = "echoout",
                  outputGlob = "*.txt")
inputs(rcp)
outputs(rcp)
rcp$input <- "Hello World!"
rcp$outfile <- "outfile"
res <- getData(rcp, outdir = tempdir(),
               notes = c("echo", "hello", "world", "txt"),
               showLog = TRUE)
readLines(res$output)
##############
### example 2
##############
shfile <- system.file("extdata", "gencode_transcripts.sh", package = "ReUseData")
readLines(shfile)
rcp <- recipeMake(shscript = shfile,
                  paramID = c("species", "version"),
                  paramType = c("string", "string"),
                  outputID = "transcripts",
                  outputGlob = "*.transcripts.fa*",
                  requireTools = c("wget", "gzip", "samtools"))
Rcwl::inputs(rcp)
rcp$species <- "human"
rcp$version <- "42"
res <- getData(rcp,
               outdir = tempdir(),
               notes = c("gencode", "transcripts", "human", "42"),
               showLog = TRUE)
res$output
dir(tempdir())
}