Add a script-based stage to a parade flow
script_stage.RdWraps an external script as a first-class pipeline stage with declarative
output file templates and automatic parameter wiring. The script is expected
to write its output files; script_stage() verifies they exist and returns
file reference tibbles compatible with artifact() / file_ref().
Arguments
- fl
A
parade_flowobject- id
Unique stage identifier (character)
- script
Path to the script file (R, Python, bash, etc.)
- produces
Character vector of output declarations. Two forms:
Templates (contain glue braces): glue-style path templates resolved per grid row. Can be named or unnamed (unnamed single defaults to
"output").Names only (no braces): output names only. The script must call
script_returns()to declare actual paths.
- needs
Character vector of upstream stage IDs this stage depends on. For the
sourceengine, upstream outputs are injected into the script environment as{stage}.{field}variables.- engine
Execution engine:
"source"(default, usesbase::source()) or"system"(usesbase::system2()).- interpreter
For
engine = "system", the interpreter command. IfNULL, guessed from the script file extension.- prefix
Whether to prefix output columns with stage ID (default
TRUE).- ...
Additional constant arguments passed through to the stage.
Two modes for produces
Template mode — values contain glue placeholders (curly braces).
script_stage() resolves paths, injects them as variables
(output_path, <name>_path), and verifies the files after the
script finishes:
produces = c(model = "results/\{subject\}/model.rds")
Manifest mode — values are plain output names (no braces).
The script decides where to write and calls script_returns() to
declare the paths. script_stage() reads the manifest and verifies:
produces = c("model", "metrics")
Portable scripts with get_arg()
Scripts can use get_arg() to read parameters. It works transparently
with both the source and system engines:
Examples
if (FALSE) { # \dontrun{
# Template mode: caller declares paths
flow(grid) |>
script_stage("fit",
script = "scripts/fit_model.R",
produces = "results/{subject}/model.rds"
)
# Template mode: multiple named outputs
flow(grid) |>
script_stage("fit",
script = "scripts/fit_model.R",
produces = c(
model = "results/{subject}/model.rds",
metrics = "results/{subject}/metrics.csv"
)
)
# Manifest mode: script declares paths via script_returns()
flow(grid) |>
script_stage("fit",
script = "scripts/fit_model.R",
produces = c("model", "metrics")
)
# System engine
flow(grid) |>
script_stage("preproc",
script = "scripts/preprocess.py",
engine = "system",
produces = "output/{subject}.nii.gz"
)
} # }