Create a Feature-Based RSA Model
feature_rsa_model.Rd
Creates a model for feature-based Representational Similarity Analysis (RSA) that relates neural patterns (X) to a predefined feature space (F).
Arguments
- dataset
An mvpa_dataset object containing the neural data (X).
- design
A feature_rsa_design object specifying the feature space (F) and including the component limit (`max_comps`).
- method
Character string specifying the analysis method. One of:
- pls
Partial Least Squares regression predicting X from F.
- pca
Principal Component Analysis on F, followed by regression predicting X from the PCs.
- glmnet
Elastic net regression predicting X from F using glmnet with multivariate Gaussian response.
- crossval
Optional cross-validation specification.
- cache_pca
Logical, if TRUE and method is "pca", cache the PCA decomposition of the feature matrix F across cross-validation folds involving the same training rows. Defaults to FALSE.
- alpha
Numeric value between 0 and 1, only used when method="glmnet". Controls the elastic net mixing parameter: 1 gives the lasso penalty, 0 gives ridge, and intermediate values give a mixture of the two. Defaults to 0.5 (an equal mix of ridge and lasso).
- cv_glmnet
Logical, if TRUE and method="glmnet", use cv.glmnet to automatically select the optimal lambda value via cross-validation. Defaults to FALSE.
- lambda
Optional numeric value or sequence of values, only used when method="glmnet" and cv_glmnet=FALSE. Specifies the regularization parameter. If NULL (default), a sequence will be automatically determined by glmnet.
- nperm
Integer, number of permutations to run for statistical testing of model performance metrics after merging cross-validation folds. Default 0 (no permutation testing).
- permute_by
DEPRECATED. Permutation is always done by shuffling rows of the predicted matrix.
- save_distributions
Logical, if TRUE and nperm > 0, save the full null distributions from the permutation test. Defaults to FALSE.
- ...
Additional arguments (currently unused).
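A minimal construction sketch, assuming an `mvpa_dataset` (`dset`) and a `feature_rsa_design` (`fdes`) have already been created and that the package exporting `feature_rsa_model()` is attached. Only arguments documented above are used; the object names are placeholders.

```r
## dset: an mvpa_dataset holding the neural data X (assumed to exist)
## fdes: a feature_rsa_design holding the feature matrix F and `max_comps` (assumed to exist)

# PLS variant: predict X from F, with permutation testing of the fold-merged metrics
mod_pls <- feature_rsa_model(
  dataset = dset,
  design  = fdes,
  method  = "pls",
  nperm   = 100
)

# Elastic-net variant: cv.glmnet selects lambda; alpha = 0.5 mixes ridge and lasso
mod_enet <- feature_rsa_model(
  dataset   = dset,
  design    = fdes,
  method    = "glmnet",
  alpha     = 0.5,
  cv_glmnet = TRUE
)
```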
Details
Feature RSA models analyze how well a feature matrix F (defined in the `design`) relates to neural data X. The `max_comps` parameter, inherited from the `design` object, sets an upper limit on the number of components used:
- pca: Performs PCA on F. `max_comps` limits the number of principal components (selected by variance explained) used to predict X. Actual components used: `min(max_comps, available_PCs)`.
- pls: Performs PLS regression predicting X from F. `max_comps` sets the maximum number of PLS components to compute; the actual number used may be fewer, depending on the PLS algorithm.
- glmnet: Performs elastic net regression predicting X from F using the glmnet package with a multivariate Gaussian response family. The regularization parameter (lambda) can be selected automatically via cross-validation when cv_glmnet=TRUE, and the alpha parameter controls the balance between L1 (lasso) and L2 (ridge) regularization (see the sketch after this list).
**Performance Metrics** (computed by `evaluate_model` after cross-validation):
- `mean_correlation`: Average correlation between predicted and observed patterns for corresponding trials/conditions (the diagonal of the prediction-observation correlation matrix).
- `cor_difference`: `mean_correlation` minus the average off-diagonal correlation (`mean_correlation` - `off_diag_correlation`). Measures how much better the model predicts the correct trial/condition compared to incorrect ones.
- `mean_rank_percentile`: Average percentile rank of the diagonal correlations. For each condition, ranks how well the model's prediction correlates with the correct observed pattern compared to incorrect patterns. Values range from 0 to 1, with 0.5 expected by chance and 1 indicating perfect discrimination.
- `voxel_correlation`: Correlation between the vectorized predicted and observed data matrices across all trials and voxels.
- `mse`: Mean squared error between predicted and observed values.
- `r_squared`: Proportion of variance in the observed data explained by the predicted data.
- `p_*`, `z_*`: If `nperm > 0`, permutation-based p-values and z-scores for the above metrics, assessing significance against a null distribution generated by shuffling predicted trial labels.
The number of components actually used (`ncomp`) for the region/searchlight is also included in the performance output.
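As a rough illustration of the definitions above (not the package's own `evaluate_model` code), the metrics can be computed from predicted and observed trial-by-voxel matrices as follows; `pred`, `obs`, and the permutation count are invented for the example.

```r
set.seed(2)
obs  <- matrix(rnorm(20 * 50), nrow = 20)              # observed patterns (trials x voxels)
pred <- obs + matrix(rnorm(20 * 50, sd = 2), nrow = 20) # noisy predictions

cmat <- cor(t(pred), t(obs))                  # prediction-observation correlation matrix
mean_correlation <- mean(diag(cmat))          # diagonal: correct pairings
off_diag         <- mean(cmat[row(cmat) != col(cmat)])
cor_difference   <- mean_correlation - off_diag

# Percentile rank of each diagonal correlation among all candidate observed patterns
rank_pct <- sapply(seq_len(nrow(cmat)), function(i)
  (rank(cmat[i, ])[i] - 1) / (ncol(cmat) - 1))
mean_rank_percentile <- mean(rank_pct)        # 0.5 expected by chance, 1 = perfect

voxel_correlation <- cor(as.vector(pred), as.vector(obs))
mse       <- mean((pred - obs)^2)
r_squared <- 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)

# Permutation test (nperm > 0): shuffle rows of the predicted matrix and
# rebuild the null distribution of, e.g., mean_correlation
nperm <- 200
null_mc <- replicate(nperm, {
  mean(diag(cor(t(pred[sample(nrow(pred)), ]), t(obs))))
})
p_mean_correlation <- (sum(null_mc >= mean_correlation) + 1) / (nperm + 1)
z_mean_correlation <- (mean_correlation - mean(null_mc)) / sd(null_mc)
```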