MVPA Searchlight Tutorial
Your Name
2025-09-28
Introduction
This tutorial explains how to run a searchlight-based Multivariate Pattern Analysis (MVPA) using MVPA_Searchlight.R. The script performs a local classification or regression analysis on fMRI data by iterating over each voxel (or node for surface data) and extracting information from a surrounding neighborhood.
Key features
The script handles both volumetric (NIfTI) and surface data and can
parallelize across cores. You can select classifiers and regressors such
as rf, sda_notune, and corsim,
enable feature selection, and choose cross‑validation schemes that
respect run structure. Outputs include performance and probability maps
together with a complete configuration file for reproducibility.
Optional normalization (centering/scaling) is available in both data
modes.
Running the Script
1. Basic Usage
If you have:
- A 4D fMRI file for training: train_data.nii
- A trial-by-trial design matrix: train_design.txt
- A brain mask file: mask.nii
You can run the script from the command line:
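The configuration keys shown later in this tutorial (train_data, train_design, mask, label_column, block_column, output) presumably have matching command-line flags; treating those flag names as an assumption, a minimal invocation might look like:

```shell
# Sketch of a basic run. The --train_data, --train_design, --mask,
# --label_column, --block_column, and --output flag names are assumed to
# mirror the config keys; --model, --radius, and --ncores appear elsewhere
# in this tutorial.
Rscript MVPA_Searchlight.R \
  --train_data=train_data.nii \
  --train_design=train_design.txt \
  --mask=mask.nii \
  --label_column=condition \
  --block_column=session \
  --model=sda_notune \
  --radius=6 \
  --ncores=4 \
  --output=searchlight_results
```

Check the script's help output for the exact flag names before running.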
2. Data modes
The script supports two primary data modes:
- image: volumetric NIfTI data, where the searchlight iterates over voxels within the brain mask
- surface: surface-based data, where the searchlight iterates over nodes instead of voxels
3. Models
The script supports various classification and regression models:
Built‑in MVPA models
- corclass: Correlation-based classifier with template matching
- sda_notune: Simple Shrinkage Discriminant Analysis without tuning
- sda_boot: SDA with bootstrap resampling
- glmnet_opt: Elastic net with EPSGO parameter optimization
- sparse_sda: SDA with sparsity constraints
- sda_ranking: SDA with automatic feature ranking
- mgsda: Multi-Group Sparse Discriminant Analysis
- lda_thomaz: Modified LDA for high-dimensional data
- hdrda: High-Dimensional Regularized Discriminant Analysis
You can also register custom models via
register_mvpa_model().
4. Cross‑validation options
The script supports multiple cross-validation strategies:
Advanced cross‑validation methods
Beyond standard blocked and k‑fold splits, you can use bootstrap
blocked CV (resampling within runs), sequential blocked CV (ordered
folds), or provide custom train/test indices. Specify the method in the
config file under cross_validation.name. For example:
cross_validation:
  name: "bootstrap"   # Options: "twofold", "bootstrap", "sequential", "custom", "kfold"
  nreps: 10

Choose the method that best matches your data structure and experimental design.
6. Understanding label_column
The label column is critical as it specifies the target variable for classification or regression.
- If performing classification, this column should contain categorical labels (e.g., "Face" vs. "House").
- If performing regression, this column should contain continuous values (e.g., reaction times, confidence ratings).
Example Design File
(train_design.txt):
trial condition subject session
1 Face S01 1
2 House S01 1
3 Face S01 1
4 House S01 1
5 Face S01 2
7. Using a Configuration File
Instead of specifying all options on the command line, you can use a YAML or R script configuration file.
Example YAML Config File
(config.yaml):
# Data Sources
train_design: "train_design.txt"
test_design: "test_design.txt"
train_data: "train_data.nii"
test_data: "test_data.nii"
mask: "mask.nii"
# Analysis Parameters
model: "rf" # Random Forest classifier
data_mode: "image" # or "surface"
ncores: 4
radius: 6
label_column: "condition"
block_column: "session"
# Output Options
output: "searchlight_results"
normalize_samples: TRUE
class_metrics: TRUE
# Advanced Options
feature_selector:
  method: "anova"
  cutoff_type: "percentile"
  cutoff_value: 0.1
cross_validation:
  name: "twofold"
  nreps: 10
# Optional Subsetting
train_subset: "subject == 'S01'"
test_subset: "subject == 'S02'"
Running with a Config File:
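Assuming the script accepts a --config flag (the flag name is an assumption; check the script's help output), the run then reduces to:

```shell
# All analysis parameters come from the YAML file above
Rscript MVPA_Searchlight.R --config config.yaml
```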
8. Expected Outputs
After running the script, the output directory (searchlight_results/) contains:
- Performance Maps: NIfTI files for each performance metric
  - accuracy.nii: Overall classification accuracy map
  - auc.nii: Area Under Curve (AUC) performance map
  - For multiclass problems with class_metrics: TRUE: auc_class1.nii, auc_class2.nii, etc.: Per-class AUC maps
- Probability Maps: When available
  - prob_observed.nii: Probabilities for observed classes
  - prob_predicted.nii: Probabilities for predicted classes
- Configuration
  - config.yaml: Complete record of analysis parameters for reproducibility
Example directory structure:
searchlight_results/
├── accuracy.nii # Overall classification accuracy
├── auc.nii # Mean AUC across classes
├── auc_class1.nii # AUC for class 1 (if class_metrics: TRUE)
├── auc_class2.nii # AUC for class 2 (if class_metrics: TRUE)
├── prob_observed.nii # Probabilities for observed classes
├── prob_predicted.nii # Probabilities for predicted classes
└── config.yaml # Analysis configuration
The exact files will depend on:
- Whether it's a binary or multiclass classification
- Whether class_metrics: TRUE is set
- The type of analysis (classification vs. regression)
- The model type used

For regression analyses, you'll see different metrics:
- r2.nii: R-squared values
- rmse.nii: Root Mean Square Error
- spearcor.nii: Spearman correlation
9. Performance Considerations
- Use --normalize_samples=TRUE for better model performance
- Increase --ncores for faster processing on multi-core systems
- Adjust --radius based on your spatial resolution and hypothesis
- Consider using --type=randomized for faster approximate searchlights
- Set appropriate memory limits with options(future.globals.maxSize)
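If parallel workers abort because exported globals exceed the default limit, you can raise the cap in R before launching the analysis (the 8 GB value here is purely illustrative; size it to your data and available RAM):

```r
# Allow up to ~8 GB of globals to be exported to each parallel worker
# (option from the 'future' package, which backs the script's parallelism)
options(future.globals.maxSize = 8 * 1024^3)
```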
Summary
MVPA_Searchlight.R provides a flexible searchlight-based MVPA tool that works with both volumetric and surface-based data. It includes cross-validation, feature selection, and extensive configuration through command line or config files. The tool generates comprehensive metrics and reproducible outputs to help you analyze your neuroimaging data.
Next Steps:
- Try different models (--model=rf, --model=sda_notune)
- Experiment with feature selection methods
- Explore surface-based MVPA with --data_mode=surface
- Use cross-validation strategies appropriate for your design
- Optimize performance with parallel processing
Happy searchlighting!