MVPA Regional Analysis Tutorial
Your Name
2025-02-13
MVPA_RegionalCmdline.Rmd
Introduction
This tutorial explains how to run regional multivariate pattern analysis (MVPA) using MVPA_Regional.R. The script performs MVPA on specified brain regions, enabling both classification and regression analyses on fMRI data. Regional analysis can be conducted on volumetric (NIfTI) or surface-based neuroimaging data, and allows for separate training and testing subsets.
Key Features:
- Flexible Input Handling: Works with both volumetric (NIfTI) and surface-based data
- Subset Analysis: Supports separate training and testing subsets for region-specific evaluation
- Configurable via Files: Parameters can be set using YAML or R configuration files
- Detailed Outputs: Generates performance maps, prediction tables, and configuration files
- Robust Cross-Validation: Includes options for blocked, k-fold, and two-fold cross-validation
- Model Versatility: Compatible with built-in MVPA models and models available in the caret package
- Data Normalization & Feature Selection: Optional centering/scaling and customizable feature selection methods
Running the Script
1. Basic Usage
If you have:
- A 4D fMRI file for training (e.g., train_data.nii)
- A trial-by-trial design matrix (e.g., train_design.txt)
- A brain mask file (e.g., mask.nii)
You can run the regional analysis from the command line:
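A minimal invocation might look like the sketch below. Apart from --model (which also appears in the Next Steps section), the flag names are assumptions that mirror the configuration keys introduced later in this tutorial, so check the script's own usage documentation for the authoritative interface.

# Hypothetical flag names mirroring the config keys in this tutorial
Rscript MVPA_Regional.R \
  --train_data=train_data.nii \
  --train_design=train_design.txt \
  --mask=mask.nii \
  --model=sda_notune \
  --label_column=condition \
  --block_column=session \
  --output=regional_results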
2. Understanding Data Modes
The script supports two primary data modes, selected with the data_mode option (see the configuration example later in this tutorial):
- image: volumetric data stored as NIfTI files
- surface: surface-based data
3. Available Models
The script supports various classification and regression models:
Built-in MVPA Models:
- corclass: Correlation-based classifier with template matching
- sda_notune: Shrinkage Discriminant Analysis (SDA) without tuning
- sda_boot: SDA with bootstrap resampling
- glmnet_opt: Elastic net with EPSGO parameter optimization
- sparse_sda: SDA with sparsity constraints
- sda_ranking: SDA with automatic feature ranking
- mgsda: Multi-Group Sparse Discriminant Analysis
- lda_thomaz: Modified LDA for high-dimensional data
- hdrda: High-Dimensional Regularized Discriminant Analysis
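Both the built-in names above and caret model names (e.g., rf, used in the configuration example later on) are selected through the same model option. A hedged command-line sketch, with flag names assumed to mirror the configuration keys:

# Hypothetical flag names; swap the value of --model to try different classifiers,
# e.g. a built-in model (corclass) or a caret model (rf)
Rscript MVPA_Regional.R --train_data=train_data.nii --train_design=train_design.txt \
  --mask=mask.nii --label_column=condition --block_column=session \
  --model=corclass --output=results_corclass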
4. Cross-Validation Options
Multiple cross-validation strategies are available; the standard options are blocked, k-fold, and two-fold cross-validation.
5. Advanced Cross-Validation Methods
In addition to the standard options above, several advanced cross-validation strategies are available:
- Blocked Cross-Validation: Divides the dataset based on a blocking variable (e.g., session) so that samples from the same block remain together.
- K-Fold Cross-Validation: Randomly partitions the data into k folds, providing a robust estimate of model performance.
- Bootstrap Blocked Cross-Validation: Generates bootstrap resamples within blocks to assess model stability in heterogeneous datasets.
- Sequential Blocked Cross-Validation: Assigns sequential folds within each block, preserving temporal or ordered structures.
- Custom Cross-Validation: Allows you to define custom training and testing splits if standard methods do not fit your experimental design.
Specify the desired method in your configuration file by setting the name field under cross_validation. For example, to use bootstrap blocked cross-validation:
cross_validation:
name: "bootstrap" # Options: "twofold", "bootstrap", "sequential", "custom", "kfold"
nreps: 10
Choose the method that best aligns with your data structure and experimental design.
6. Understanding label_column
The label column specifies the target variable:
- For classification, it should contain categorical labels (e.g., “Face”, “House”).
- For regression, it should contain continuous values (e.g., reaction times).
Example Design File (train_design.txt):
trial condition subject session
1 Face S01 1
2 House S01 1
3 Face S01 1
4 House S01 1
5 Face S01 2
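For a regression analysis, label_column would instead name a continuous column. The sketch below assumes a hypothetical rt (reaction time) column that is not part of the design file above, and uses caret's glmnet, which handles continuous outcomes and is assumed to be usable here given the script's caret compatibility; the flag names again mirror the configuration keys.

# Hypothetical regression run: "rt" is an illustrative continuous column and
# the flag names are assumptions mirroring the config keys in this tutorial
Rscript MVPA_Regional.R \
  --train_data=train_data.nii \
  --train_design=train_design.txt \
  --mask=mask.nii \
  --model=glmnet \
  --label_column=rt \
  --block_column=session \
  --output=regional_results_rt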
7. Using a Configuration File
Instead of specifying all options on the command line, you can use a configuration file.
Example YAML Config File (regional_config.yaml):
# Data Sources
train_design: "train_design.txt"
test_design: "test_design.txt"
train_data: "train_data.nii"
test_data: "test_data.nii"
mask: "mask.nii"
# Analysis Parameters
model: "rf" # Random Forest classifier
data_mode: "image" # or "surface"
ncores: 4
label_column: "condition"
block_column: "session"
# Output Options
output: "regional_results"
normalize_samples: TRUE
class_metrics: TRUE
# Advanced Options
feature_selector:
  method: "anova"
  cutoff_type: "percentile"
  cutoff_value: 0.1
cross_validation:
  name: "twofold"
  nreps: 10
# Optional Subsetting: Define different subsets for training and testing
train_subset: "subject == 'S01'"
test_subset: "subject == 'S02'"
Running with a Config File:
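Assuming the script accepts a --config flag (it may instead take the file as a positional argument), the full analysis can then be launched with a single command:

# Hypothetical: pass the YAML configuration file in one go
Rscript MVPA_Regional.R --config=regional_config.yaml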
8. Expected Outputs
After running the script, the output directory (e.g., regional_results/) contains:
- Performance Maps: NIfTI files with region-level performance metrics (e.g., accuracy, AUC).
- Prediction Tables: Text files summarizing predictions for each region.
- Configuration File: config.yaml with complete analysis parameters for reproducibility.
Example directory structure:
regional_results/
├── performance_table.txt # Regional performance metrics
├── prediction_table.txt # Prediction details per region
├── regional_metric1.nii # Regional performance map (e.g., accuracy or AUC)
├── regional_metric2.nii # Additional metric maps (if applicable)
└── config.yaml # Analysis configuration
For regression analyses, different metrics (e.g., r2.nii, rmse.nii, spearcor.nii) will be output.
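The text outputs can be inspected directly from the shell; the file names below follow the directory listing above:

# Quick look at the regional performance metrics and the saved configuration
head regional_results/performance_table.txt
cat regional_results/config.yaml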
Summary
- MVPA_Regional.R is a versatile tool for regional MVPA.
- Supports both volumetric and surface-based data.
- Flexible configuration via command line or config files.
- Outputs detailed performance maps and prediction tables.
- Robust cross-validation and optional feature selection enhance analysis.
Next Steps:
- Experiment with various models (--model=rf, --model=sda_notune).
- Test different feature selection methods.
- Evaluate both classification and regression scenarios.
- Optimize processing using parallel computation.
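For the parallel-computation step, an assumed --ncores flag mirroring the ncores configuration key might look like this:

# Hypothetical: request more worker cores (mirrors the ncores config key)
Rscript MVPA_Regional.R --config=regional_config.yaml --ncores=8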
Happy regional analysis!