Reduce shard results without gathering all per-shard returns on the master.
shard_reduce() executes map() over shards in parallel and combines results
using an associative combine() function. Unlike shard_map(), it does not
accumulate all per-shard results on the master; it streams partials as chunks
complete.
Usage
shard_reduce(
shards,
map,
combine,
init,
borrow = list(),
out = list(),
workers = NULL,
chunk_size = 1L,
profile = c("default", "memory", "speed"),
mem_cap = "2GB",
recycle = TRUE,
cow = c("deny", "audit", "allow"),
seed = NULL,
diagnostics = TRUE,
packages = NULL,
init_expr = NULL,
timeout = 3600,
max_retries = 3L,
health_check_interval = 10L
)Arguments
- shards
A
shard_descriptorfromshards(), or an integer N.- map
Function executed per shard. Receives shard descriptor as first argument, followed by borrowed inputs and outputs.
- combine
Function
(acc, value) -> accused to combine results. Should be associative for deterministic behavior under chunking.- init
Initial accumulator value.
- borrow
Named list of shared inputs (same semantics as
shard_map()).- out
Named list of output buffers/sinks (same semantics as
shard_map()).- workers
Number of worker processes.
- chunk_size
Shards to batch per worker dispatch (default 1).
- profile
Execution profile (same semantics as
shard_map()).- mem_cap
Memory cap per worker (same semantics as
shard_map()).- recycle
Worker recycling policy (same semantics as
shard_map()).- cow
Copy-on-write policy for borrowed inputs (same semantics as
shard_map()).- seed
RNG seed for reproducibility.
- diagnostics
Logical; collect diagnostics (default TRUE).
- packages
Additional packages to load in workers.
- init_expr
Expression to evaluate in each worker on startup.
- timeout
Seconds to wait for each chunk.
- max_retries
Maximum retries per chunk.
- health_check_interval
Check worker health every N completions.
Value
A shard_reduce_result with fields:
value: final accumulatorfailures: any permanently failed chunksdiagnostics: run telemetry including reduction statsqueue_status,pool_stats