MVPA Regional Analysis Tutorial
Your Name
2025-03-20
MVPA_RegionalCmdline.Rmd
Introduction
This tutorial explains how to run regional multivariate pattern analysis (MVPA) using MVPA_Regional.R. The script performs MVPA on specified brain regions, enabling both classification and regression analyses on fMRI data. Regional analysis can be conducted on volumetric (NIfTI) or surface-based neuroimaging data, and allows for separate training and testing subsets.
Key Features
The script handles both volumetric NIfTI and surface-based data formats. You can evaluate specific regions by using separate training and testing subsets. All parameters are configurable through YAML or R files.
The analysis produces comprehensive outputs including performance maps, prediction tables, and configuration records. Cross-validation options include blocked, k-fold, and two-fold approaches.
The script works with built-in MVPA models and integrates with the caret package for additional model options. Data preprocessing includes optional centering/scaling and flexible feature selection methods.
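As a rough illustration of the preprocessing ideas mentioned above, the following base R sketch centers and scales each feature and then ranks features by a one-way ANOVA F statistic. It is not the script's own implementation: the simulated data, the variable names, and the choice to keep the top 10% of features are all assumptions made for the example.

```r
set.seed(42)
n_trials <- 20
n_features <- 50

# Simulated trial-by-feature matrix (rows = trials, columns = voxels/features).
X <- matrix(rnorm(n_trials * n_features), nrow = n_trials)
y <- factor(rep(c("Face", "House"), each = 10))

# Make the first five features informative (illustrative effect size).
X[y == "Face", 1:5] <- X[y == "Face", 1:5] + 5

# Center and scale each feature to mean 0 and standard deviation 1.
X_scaled <- scale(X, center = TRUE, scale = TRUE)

# One-way ANOVA F statistic for each feature.
f_scores <- apply(X_scaled, 2, function(v) {
  summary(aov(v ~ y))[[1]][["F value"]][1]
})

# Keep the top 10% of features by F score (assumed cutoff for this sketch).
n_keep <- ceiling(0.1 * n_features)
selected <- order(f_scores, decreasing = TRUE)[seq_len(n_keep)]
print(sort(selected))
```

In a real analysis the selection step would have to be fit on training folds only, so that test data never influences which features are kept.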
Running the Script
1. Basic Usage
If you have:
- A 4D fMRI file for training (e.g., train_data.nii)
- A trial-by-trial design matrix (e.g., train_design.txt)
- A brain mask file (e.g., mask.nii)
You can run the regional analysis from the command line:
2. Understanding Data Modes
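The script supports two primary data modes, selected with the data_mode option: "image" for volumetric (NIfTI) data and "surface" for surface-based data.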
3. Available Models
The script supports various classification and regression models:
Built-in MVPA Models:
- corclass: Correlation-based classifier with template matching
- sda_notune: Shrinkage Discriminant Analysis without tuning
- sda_boot: SDA with bootstrap resampling
- glmnet_opt: Elastic net with EPSGO parameter optimization
- sparse_sda: SDA with sparsity constraints
- sda_ranking: SDA with automatic feature ranking
- mgsda: Multi-Group Sparse Discriminant Analysis
- lda_thomaz: Modified LDA for high-dimensional data
- hdrda: High-Dimensional Regularized Discriminant Analysis
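To make the idea of a correlation-based classifier with template matching more concrete, here is a minimal base R sketch. It is only a conceptual illustration, not the corclass implementation: the helper name template_classify and the toy data are hypothetical. Each class gets a template (its mean training pattern), and each test pattern is assigned to the class whose template it correlates with most strongly.

```r
# Hypothetical helper, not part of the MVPA package.
template_classify <- function(train_x, train_y, test_x) {
  train_y <- factor(train_y)
  # One template per class: the mean training pattern
  # (rows = trials, columns = features). Result is features x classes.
  templates <- sapply(levels(train_y), function(cl) {
    colMeans(train_x[train_y == cl, , drop = FALSE])
  })
  # Correlate every test pattern with every template (test trials x classes).
  sims <- cor(t(test_x), templates)
  # Assign each test pattern to the class with the highest correlation.
  levels(train_y)[max.col(sims, ties.method = "first")]
}

set.seed(1)
face  <- c(2, 0, 2, 0, 2, 0)   # toy "Face" pattern
house <- c(0, 2, 0, 2, 0, 2)   # toy "House" pattern
train_x <- rbind(face, face, house, house) + matrix(rnorm(24, sd = 0.1), nrow = 4)
train_y <- c("Face", "Face", "House", "House")
test_x  <- rbind(face, house) + matrix(rnorm(12, sd = 0.1), nrow = 2)

pred <- template_classify(train_x, train_y, test_x)
print(pred)
```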
4. Cross-Validation Options
Multiple cross-validation strategies are available:
Advanced Cross-Validation Methods
In addition to the standard options above, several advanced cross-validation strategies are available:
- Blocked Cross-Validation: Divides the dataset based on a blocking variable (e.g., session) so that samples from the same block remain together.
- K-Fold Cross-Validation: Randomly partitions the data into k folds, providing a robust estimate of model performance.
- Bootstrap Blocked Cross-Validation: Generates bootstrap resamples within blocks to assess model stability in heterogeneous datasets.
- Sequential Blocked Cross-Validation: Assigns sequential folds within each block, preserving temporal or ordered structures.
- Custom Cross-Validation: Allows you to define custom training and testing splits if standard methods do not fit your experimental design.
Specify the desired method in your configuration file by setting the name field under cross_validation. For example, to use bootstrap blocked cross-validation:
cross_validation:
  name: "bootstrap" # Options: "twofold", "bootstrap", "sequential", "custom", "kfold"
  nreps: 10
Choose the method that best aligns with your data structure and experimental design.
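The blocked strategy can be sketched in a few lines of base R. The helper below is hypothetical and only illustrates the principle: it builds leave-one-block-out folds, which is one common way to keep samples from the same block together. The script's own fold construction may differ.

```r
# Hypothetical helper: leave-one-block-out folds from a blocking variable.
blocked_folds <- function(block) {
  blocks <- sort(unique(block))
  folds <- lapply(blocks, function(b) {
    # All trials from block b form the test set; the rest form the training set.
    list(train = which(block != b), test = which(block == b))
  })
  names(folds) <- paste0("block_", blocks)
  folds
}

# Toy blocking variable: three sessions with four trials each.
session <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3)
folds <- blocked_folds(session)

for (nm in names(folds)) {
  cat(nm, ": train n =", length(folds[[nm]]$train),
      ", test n =", length(folds[[nm]]$test), "\n")
}
```

Because whole blocks are held out, trials from the same session never appear on both sides of a split, which matters when samples within a session are temporally correlated.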
6. Understanding label_column
The label column specifies the target variable:
- For classification, it should contain categorical labels (e.g., “Face”, “House”).
- For regression, it should contain continuous values (e.g., reaction times).
Example Design File (train_design.txt):
trial condition subject session
1 Face S01 1
2 House S01 1
3 Face S01 1
4 House S01 1
5 Face S01 2
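Before running an analysis it can help to load the design file and check the label column yourself. The sketch below uses base R on the example table above, supplied as inline text so that it runs without any files. The rule it uses to guess the task type (numeric labels suggest regression, anything else classification) is an assumption for illustration, not a statement about how the script decides.

```r
design_text <- "trial condition subject session
1 Face S01 1
2 House S01 1
3 Face S01 1
4 House S01 1
5 Face S01 2"

# With a real file this would be: read.table("train_design.txt", header = TRUE)
design <- read.table(text = design_text, header = TRUE, stringsAsFactors = FALSE)

label_column <- "condition"
labels <- design[[label_column]]

# Assumed rule of thumb: numeric labels -> regression, otherwise classification.
task <- if (is.numeric(labels)) "regression" else "classification"
cat("Task type:", task, "\n")

# Trials per label within each session, useful for spotting unbalanced blocks.
counts <- table(design[[label_column]], design$session)
print(counts)
```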
7. Using a Configuration File
Instead of specifying all options on the command line, you can use a configuration file.
Example YAML Config File (regional_config.yaml):
# Data Sources
train_design: "train_design.txt"
test_design: "test_design.txt"
train_data: "train_data.nii"
test_data: "test_data.nii"
mask: "mask.nii"
# Analysis Parameters
model: "rf" # Random Forest classifier
data_mode: "image" # or "surface"
ncores: 4
label_column: "condition"
block_column: "session"
# Output Options
output: "regional_results"
normalize_samples: TRUE
class_metrics: TRUE
# Advanced Options
feature_selector:
  method: "anova"
  cutoff_type: "percentile"
  cutoff_value: 0.1
cross_validation:
  name: "twofold"
  nreps: 10
# Optional Subsetting: Define different subsets for training and testing
train_subset: "subject == 'S01'"
test_subset: "subject == 'S02'"
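The train_subset and test_subset entries look like R expressions over design-table columns. Assuming they are evaluated that way (the script's actual mechanism is not shown here), the following base R sketch shows how such a string could select rows. The helper select_rows and the toy design table are hypothetical, and a single table is used for both subsets only to keep the example short.

```r
design <- data.frame(
  trial = 1:6,
  condition = c("Face", "House", "Face", "House", "Face", "House"),
  subject = c("S01", "S01", "S01", "S02", "S02", "S02"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: row indices selected by a filter expression string.
select_rows <- function(design, expr_string) {
  # Evaluate the expression with the design columns visible as variables.
  keep <- eval(parse(text = expr_string), envir = design, enclos = baseenv())
  which(keep)
}

train_idx <- select_rows(design, "subject == 'S01'")
test_idx  <- select_rows(design, "subject == 'S02'")
print(train_idx)
print(test_idx)
```

The resulting indices would then pick out the matching trials (and their corresponding volumes) for training and testing.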
Running with a Config File:
8. Expected Outputs
After running the script, the output directory (e.g., regional_results/) contains:
- Performance Maps: NIfTI files with region-level performance metrics (e.g., accuracy, AUC).
- Prediction Tables: Text files summarizing predictions for each region.
- Configuration File: config.yaml with complete analysis parameters for reproducibility.
Example directory structure:
regional_results/
├── performance_table.txt # Regional performance metrics
├── prediction_table.txt # Prediction details per region
├── regional_metric1.nii # Regional performance map (e.g., accuracy or AUC)
├── regional_metric2.nii # Additional metric maps (if applicable)
└── config.yaml # Analysis configuration
For regression analyses, different metrics (e.g., r2.nii, rmse.nii, spearcor.nii) will be output.
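For orientation, the three regression metrics named above can be computed from observed and predicted values in base R as sketched below. The numbers are made up, and the R-squared formula shown (one minus residual over total sum of squares) is one common definition; the script may compute its metrics differently.

```r
# Made-up observed and predicted values (e.g., reaction times in ms).
observed  <- c(420, 515, 610, 480, 700, 555)
predicted <- c(450, 500, 590, 470, 680, 600)

# Root mean squared error.
rmse <- sqrt(mean((observed - predicted)^2))

# R-squared as 1 - SS_residual / SS_total (assumed definition).
r2 <- 1 - sum((observed - predicted)^2) / sum((observed - mean(observed))^2)

# Spearman rank correlation between observed and predicted values.
spearcor <- cor(observed, predicted, method = "spearman")

cat("rmse =", rmse, " r2 =", r2, " spearcor =", spearcor, "\n")
```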
Summary
MVPA_Regional.R runs regional MVPA on both volumetric and surface-based data, configured either from the command line or through a config file. It produces performance maps and prediction tables, and its cross-validation and feature selection options help keep performance estimates from being inflated by overfitting.
Next Steps:
- Experiment with various models (--model=rf, --model=sda_notune).
- Test different feature selection methods.
- Evaluate both classification and regression scenarios.
- Optimize processing using parallel computation.
Happy regional analysis!