MVPA Searchlight Tutorial
Your Name
2025-09-28
Introduction
This tutorial explains how to run a searchlight-based Multivariate Pattern Analysis (MVPA) using MVPA_Searchlight.R. The script performs a local classification or regression analysis on fMRI data by iterating over each voxel (or node for surface data) and extracting information from a surrounding neighborhood.
Key features
The script handles both volumetric (NIfTI) and surface data and can parallelize across cores. You can select classifiers and regressors such as rf, sda_notune, and corsim, enable feature selection, and choose cross-validation schemes that respect run structure. Outputs include performance and probability maps together with a complete configuration file for reproducibility. Optional normalization (centering/scaling) is available in both data modes.
Running the Script
1. Basic Usage
If you have:
- A 4D fMRI file for training: train_data.nii
- A trial-by-trial design matrix: train_design.txt
- A brain mask file: mask.nii
You can run the script from the command line:
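A minimal invocation might look like the sketch below. The long-option names for the inputs (--train_data, --train_design, --mask, --label_column, --output) are assumed to mirror the configuration-file keys shown later in this tutorial; check Rscript MVPA_Searchlight.R --help for the authoritative flag names.

# Hypothetical basic searchlight run; input flag names mirror the config-file keys
Rscript MVPA_Searchlight.R \
  --train_data=train_data.nii \
  --train_design=train_design.txt \
  --mask=mask.nii \
  --label_column=condition \
  --model=sda_notune \
  --radius=6 \
  --output=searchlight_results

This evaluates the sda_notune classifier in a searchlight of radius 6 around every voxel inside mask.nii and writes the resulting maps to searchlight_results/.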
2. Data modes
The script supports two primary data modes: image for volumetric NIfTI data and surface for surface-based data. Select the mode with the data_mode option in a config file or --data_mode on the command line, as in the example below.
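For instance, a surface analysis can reuse an existing config file and simply override the mode on the command line (the --config flag name is an assumption; --data_mode is referenced later in this tutorial):

# Hypothetical override of the data mode for a config-driven run
Rscript MVPA_Searchlight.R --config=config.yaml --data_mode=surface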
3. Models
The script supports various classification and regression models:
Built‑in MVPA models
- corclass: Correlation-based classifier with template matching
- sda_notune: Simple Shrinkage Discriminant Analysis without tuning
- sda_boot: SDA with bootstrap resampling
- glmnet_opt: Elastic net with EPSGO parameter optimization
- sparse_sda: SDA with sparsity constraints
- sda_ranking: SDA with automatic feature ranking
- mgsda: Multi-Group Sparse Discriminant Analysis
- lda_thomaz: Modified LDA for high-dimensional data
- hdrda: High-Dimensional Regularized Discriminant Analysis

You can also register custom models via register_mvpa_model().
4. Cross‑validation options
The script supports multiple cross-validation strategies: twofold, kfold, bootstrap, sequential, and custom splits (see the configuration options below).
Advanced cross‑validation methods
Beyond standard blocked and k-fold splits, you can use bootstrap blocked CV (resampling within runs), sequential blocked CV (ordered folds), or provide custom train/test indices. Specify the method in the config file under cross_validation.name. For example:

cross_validation:
  name: "bootstrap"   # Options: "twofold", "bootstrap", "sequential", "custom", "kfold"
  nreps: 10
Choose the method that best matches your data structure and experimental design.
6. Understanding label_column
The label column is critical as it specifies the target variable for classification or regression.
- If performing classification, this column should contain categorical labels (e.g., "Face" vs. "House").
- If performing regression, this column should contain continuous values (e.g., reaction times, confidence ratings).

Example Design File (train_design.txt):
trial condition subject session
1 Face S01 1
2 House S01 1
3 Face S01 1
4 House S01 1
5 Face S01 2
7. Using a Configuration File
Instead of specifying all options on the command line, you can use a YAML or R script configuration file.
Example YAML Config File (config.yaml):
# Data Sources
train_design: "train_design.txt"
test_design: "test_design.txt"
train_data: "train_data.nii"
test_data: "test_data.nii"
mask: "mask.nii"
# Analysis Parameters
model: "rf" # Random Forest classifier
data_mode: "image" # or "surface"
ncores: 4
radius: 6
label_column: "condition"
block_column: "session"
# Output Options
output: "searchlight_results"
normalize_samples: TRUE
class_metrics: TRUE
# Advanced Options
feature_selector:
  method: "anova"
  cutoff_type: "percentile"
  cutoff_value: 0.1
cross_validation:
  name: "twofold"
  nreps: 10
# Optional Subsetting
train_subset: "subject == 'S01'"
test_subset: "subject == 'S02'"
Running with a Config File:
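A config-driven run then only needs to point the script at the YAML file. The --config flag name is an assumption here; consult the script's --help output for the exact spelling.

# Hypothetical config-driven run; assumes a --config flag
Rscript MVPA_Searchlight.R --config=config.yaml

If the script follows the usual convention, individual options can still be overridden on the command line alongside the config file, but verify this against --help.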
8. Expected Outputs
After running the script, the output directory (searchlight_results/) contains:
- Performance Maps: NIfTI files for each performance metric
  - accuracy.nii: Overall classification accuracy map
  - auc.nii: Area Under Curve (AUC) performance map
  - For multiclass problems with class_metrics: TRUE: auc_class1.nii, auc_class2.nii, etc. (per-class AUC maps)
- Probability Maps (when available)
  - prob_observed.nii: Probabilities for observed classes
  - prob_predicted.nii: Probabilities for predicted classes
- Configuration
  - config.yaml: Complete record of analysis parameters for reproducibility
Example directory structure:
searchlight_results/
├── accuracy.nii # Overall classification accuracy
├── auc.nii # Mean AUC across classes
├── auc_class1.nii # AUC for class 1 (if class_metrics: TRUE)
├── auc_class2.nii # AUC for class 2 (if class_metrics: TRUE)
├── prob_observed.nii # Probabilities for observed classes
├── prob_predicted.nii # Probabilities for predicted classes
└── config.yaml # Analysis configuration
The exact files will depend on:
- Whether it's a binary or multiclass classification
- Whether class_metrics: TRUE is set
- The type of analysis (classification vs. regression)
- The model type used

For regression analyses, you'll see different metrics:
- r2.nii: R-squared values
- rmse.nii: Root Mean Square Error
- spearcor.nii: Spearman correlation
9. Performance Considerations
- Use --normalize_samples=TRUE for better model performance
- Increase --ncores for faster processing on multi-core systems
- Adjust --radius based on your spatial resolution and hypothesis
- Consider using --type=randomized for faster approximate searchlights
- Set appropriate memory limits with options(future.globals.maxSize)
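Putting several of these together, a run tuned for speed on a multi-core machine might look like the following sketch (the --config flag name is an assumption; the other flags are the ones documented above):

# Hypothetical speed-oriented run: parallel workers plus randomized searchlight
Rscript MVPA_Searchlight.R \
  --config=config.yaml \
  --ncores=8 \
  --type=randomized \
  --normalize_samples=TRUE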
Summary
MVPA_Searchlight.R provides a flexible searchlight-based MVPA tool that works with both volumetric and surface-based data. It includes cross-validation, feature selection, and extensive configuration through command line or config files. The tool generates comprehensive metrics and reproducible outputs to help you analyze your neuroimaging data.
Next Steps:
- Try different models (--model=rf, --model=sda_notune)
- Experiment with feature selection methods
- Explore surface-based MVPA with --data_mode=surface
- Use cross-validation strategies appropriate for your design
- Optimize performance with parallel processing
Happy searchlighting!