Feature-sets design (mvpa_design extension for continuous regression)
Source:R/feature_sets_design.R
feature_sets_design.Rd`feature_sets_design()` defines a small `mvpa_design` extension that attaches grouped continuous predictors for stimulus->brain regression.
Usage
feature_sets_design(
X_train,
X_test = NULL,
block_var_test = NULL,
target_builder = NULL,
target_builder_data = NULL,
n_test = NULL,
split_by = NULL
)Arguments
- X_train
A `feature_sets` object (train/source predictors).
- X_test
Optional `feature_sets` object (test/target predictors).
- block_var_test
Optional integer/factor vector of length T_rec defining test/target blocks (typically runs). Used by models that do test-time CV.
- target_builder
Optional function that rebuilds target-domain predictors per outer target fold. It is called with named arguments including `X_train`, `X_test`, `train_idx`, `test_idx`, `fold_id`, `block_var_test`, `n_test`, `builder_data`, and `design`; callbacks may accept only the subset they need. The return value must resolve to target predictors with the same feature/set layout as `X_train`.
- target_builder_data
Optional object passed through to `target_builder` as `builder_data`.
- n_test
Optional target-domain row count used when `X_test` is not supplied. If omitted, it is inferred from `X_test` or `block_var_test`.
- split_by
Optional split definition passed to `mvpa_design`.
Details
This mirrors the pattern used by the archived `hrfdecoder` design API: rMVPA's core infrastructure expects `mvpa_design` fields like `train_design`, `y_train`, etc. For continuous regression with external test domains, those fields are largely bookkeeping, while the actual predictors are carried explicitly as `feature_sets` objects.
Where the data live.
Predictors live on the design: `design$X_train` and `design$X_test` (both `feature_sets`).
Responses live on the dataset: `dataset$train_data` and `dataset$test_data` (TRxvoxel matrices per ROI).
cv_labels semantics. The returned object passes `cv_labels = 1:T_enc` to `mvpa_design()`. These integer indices are used only for length validation and fold-construction bookkeeping in the generic rMVPA iterators; they are not meaningful training targets. The actual predictors are carried as `design$X_train` / `design$X_test`, and models built for this design (e.g. `grouped_ridge_da_model()` / `banded_ridge_da_model()`) retrieve them from those fields rather than from `cv_labels`.
Recall blocks. `block_var_test` is stored for convenience as `design$block_var_test`. Models can use it to define blocked cross-validation on test/target time (e.g. leave-one-run-out), and fall back to contiguous folds when only a single run is present.
Fold-aware target builders. For unbiased domain-adaptation workflows, `feature_sets_design()` can store a `target_builder` callback instead of a single fixed `X_test`. The callback is invoked separately for each outer target fold and receives the source predictors plus the target train/test row indices. It must return target-side predictors in the original target row order, either as a `feature_sets` object, a numeric matrix, or a list containing `gamma`, `X`, or `X_test`. This allows upstream matching/alignment to be re-fit on target-train rows only, while held-out target rows remain untouched until evaluation.
Examples
# Train predictors (TR x features), split into named sets:
X_enc <- matrix(rnorm(20 * 8), 20, 8)
fs_enc <- feature_sets(X_enc, blocks(low = 3, sem = 5))
# Test predictors (TR x features), for example from a soft alignment:
gamma <- matrix(runif(10 * 20), 10, 20)
gamma <- gamma / rowSums(gamma)
fs_rec <- expected_features(fs_enc, gamma, drop_null = FALSE, renormalize = TRUE)
des <- feature_sets_design(fs_enc, fs_rec, block_var_test = rep(1:2, each = 5))