ClusteredNeuroVec: Parcel-based 4D Analysis
Source: vignettes/clustered-neurovec.Rmd
Overview
ClusteredNeuroVec provides an efficient representation for parcellated 4D neuroimaging data where voxels are grouped into clusters or parcels. Instead of storing a time-series for every voxel, it stores one time-series per cluster, making it ideal for analyses using brain atlases like Schaefer-Yeo parcellations.
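The reduction from voxel time-series to one time-series per cluster can be sketched in base R. This is a minimal illustration, not the package's implementation: the names `voxel_ts`, `cluster_id`, and `cluster_ts` are hypothetical, and averaging is assumed as the aggregation function.

```r
# Sketch only: collapse a T x N voxel matrix to a T x K cluster matrix by averaging.
# All object names are illustrative and do not come from neuroim2.
set.seed(1)
n_vox  <- 12
n_time <- 6
k      <- 3

voxel_ts   <- matrix(rnorm(n_time * n_vox), nrow = n_time, ncol = n_vox)  # T x N
cluster_id <- rep(seq_len(k), length.out = n_vox)                         # one label per voxel

# rowsum() sums rows by group, so operate on the N x T transpose
cluster_sums <- rowsum(t(voxel_ts), group = cluster_id)  # K x T
cluster_n    <- as.vector(table(cluster_id))             # voxels per cluster
cluster_ts   <- t(cluster_sums / cluster_n)              # T x K matrix of cluster means

dim(cluster_ts)
```

Storage then scales with the number of clusters K rather than the number of voxels N.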
Why ClusteredNeuroVec?
Traditional neuroimaging analyses often involve:
- Reducing voxel-level data to parcel/ROI averages
- Working with brain atlases that group voxels into regions
- Performing searchlight analyses at the parcel level rather than voxel level
ClusteredNeuroVec makes these workflows more efficient while maintaining compatibility with standard NeuroVec operations.
Creating a ClusteredNeuroVec
From scratch with synthetic data
# Create a simple 3D space with mask
space <- NeuroSpace(c(10, 10, 10), spacing = c(2, 2, 2))
mask_data <- array(TRUE, c(10, 10, 10))
mask_data[1:3, 1:3, 1:3] <- FALSE # exclude corner
mask <- LogicalNeuroVol(mask_data, space)
# Create cluster assignments (e.g., 5 random clusters)
n_masked <- sum(mask_data)
cluster_ids <- sample(1:5, n_masked, replace = TRUE)
cvol <- ClusteredNeuroVol(mask, cluster_ids)
# Create synthetic 4D data
vec_space <- NeuroSpace(c(10, 10, 10, 20), spacing = c(2, 2, 2))
vec_data <- array(rnorm(10 * 10 * 10 * 20), dim = c(10, 10, 10, 20))
vec <- NeuroVec(vec_data, vec_space)
# Create ClusteredNeuroVec
cv <- ClusteredNeuroVec(vec, cvol)
print(cv)
#>
#> ClusteredNeuroVec
#>
#> += Spatial Info ---------------------------
#> | Dimensions : 10 x 10 x 10
#> | Spacing : 2 x 2 x 2
#> | Origin : 0 x 0 x 0
#> | Orientation : Left-to-Right Posterior-to-Anterior Inferior-to-Superior
#>
#> += Clustering Info ------------------------
#> | Clusters : 5
#> | Cluster Sizes : min=180, med=199, max=202
#>
#> += Temporal Info --------------------------
#> | Time Points : 20
#>
#> += Memory Info ----------------------------
#> | Matrix Size : 0 bytes MB (20 x 5)
#> | Active Voxels : 973
Key properties
# Dimensions: still 4D (x, y, z, time)
dim(cv)
#> [1] 10 10 10 20
# Number of clusters
num_clusters(cv)
#> [1] 5
# Access cluster time-series matrix (T x K)
ts_matrix <- as.matrix(cv, by = "cluster")
dim(ts_matrix) # 20 time points x 5 clusters
#> [1] 20 5
Cluster searchlight analysis
Perform searchlight analysis at the cluster level using centroid distances:
# K-nearest neighbor searchlight (10 nearest clusters)
windows_knn <- cluster_searchlight_series(cv, k = 10)
length(windows_knn) # One window per cluster
#> [1] 5
# Look at first window
win1 <- windows_knn[[1]]
dim(values(win1)) # 20 time points x 5 clusters (only 5 clusters exist, so k = 10 returns all of them)
#> [1] 20 5
# Radius-based searchlight (e.g., 15mm radius)
windows_radius <- cluster_searchlight_series(cv, radius = 15)
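To make the idea of a centroid-distance searchlight concrete, here is a base-R sketch of both neighborhood rules. It is not how `cluster_searchlight_series` is implemented. The helpers `knn_window` and `radius_window` are hypothetical, and two behaviors are assumptions: that each window includes its own center cluster, and that `k` is capped at the number of clusters (which would be consistent with the 20 x 5 window shown above).

```r
# Sketch only: build searchlight windows from cluster centroid distances.
# Helper names and behavior are illustrative assumptions, not the neuroim2 API.
set.seed(2)
k_clusters <- 5
n_time     <- 20

centroids_mm <- matrix(runif(k_clusters * 3, 0, 20), ncol = 3)      # K x 3 coordinates
cluster_ts   <- matrix(rnorm(n_time * k_clusters), nrow = n_time)   # T x K time-series

d <- as.matrix(dist(centroids_mm))  # K x K Euclidean distances between centroids

# k nearest clusters to cluster i (the center itself is at distance 0)
knn_window <- function(i, k) {
  k    <- min(k, nrow(d))                 # cannot return more clusters than exist
  nbrs <- order(d[i, ])[seq_len(k)]
  cluster_ts[, nbrs, drop = FALSE]        # T x k
}

# all clusters whose centroid lies within `radius` of cluster i
radius_window <- function(i, radius) {
  cluster_ts[, which(d[i, ] <= radius), drop = FALSE]
}

windows <- lapply(seq_len(k_clusters), knn_window, k = 3)
length(windows)       # one window per cluster
dim(windows[[1]])     # T x 3
dim(knn_window(1, 10))  # k larger than K: all clusters are returned
```

Because distances are computed between K centroids rather than N voxels, the distance matrix stays small even for fine parcellations.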
Real-world example: Schaefer parcellation
# Load fMRI data
fmri_data <- read_vec("subject01_task.nii.gz")
# Load Schaefer atlas (example with 400 parcels)
atlas <- read_vol("Schaefer2018_400Parcels_7Networks.nii.gz")
mask <- atlas > 0
# Create ClusteredNeuroVol from atlas
cvol <- ClusteredNeuroVol(mask, as.integer(atlas[mask]))
# Create parcellated representation
cv <- ClusteredNeuroVec(fmri_data, cvol)
# Now you have 400 time-series (one per parcel) instead of ~200,000 voxels
parcels <- as.matrix(cv, by = "cluster")
dim(parcels) # T x 400
# Perform connectivity analysis at parcel level
cor_matrix <- cor(parcels)
dim(cor_matrix) # 400 x 400
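A common follow-up to a parcel-level correlation matrix is to keep only the unique parcel pairs and apply the Fisher z-transform before group statistics. The sketch below uses a small random stand-in for the T x K matrix; `parcels_demo` and the other names are hypothetical and this step is not part of the original example.

```r
# Sketch only: extract unique edges from a correlation matrix and Fisher z-transform them.
# parcels_demo is a synthetic stand-in for a T x K parcel matrix.
set.seed(3)
parcels_demo <- matrix(rnorm(100 * 6), nrow = 100, ncol = 6)  # 100 time points x 6 parcels

cor_demo <- cor(parcels_demo)               # K x K, symmetric, unit diagonal
edges    <- cor_demo[upper.tri(cor_demo)]   # K * (K - 1) / 2 unique parcel pairs
z_edges  <- atanh(edges)                    # Fisher z-transform

length(edges)
```

With 400 parcels the same extraction would yield 400 * 399 / 2 = 79,800 edges.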
Integration with existing workflows
ClusteredNeuroVec integrates seamlessly with existing neuroim2 functions:
# Use with split_reduce for custom aggregation
# (ClusteredNeuroVec already uses this internally)
# Scale time-series within each cluster
# (if scale_series is implemented for ClusteredNeuroVec)
# cv_scaled <- scale_series(cv, center = TRUE, scale = TRUE)
# Get cluster centroids for visualization
centers <- centroids(cv)
head(centers) # x, y, z coordinates
#> [,1] [,2] [,3]
#> [1,] 5.783920 5.356784 5.522613
#> [2,] 5.495050 5.544554 5.579208
#> [3,] 5.406250 5.588542 5.604167
#> [4,] 5.683333 5.600000 5.638889
#> [5,] 5.620000 5.895000 5.645000
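For intuition, a cluster centroid can be computed as the mean coordinate of the voxels assigned to that cluster. The base-R sketch below assumes voxel-index (grid) coordinates and uses hypothetical names; whether `centroids()` reports grid or millimeter coordinates is not stated here, so treat the units as an assumption.

```r
# Sketch only: centroids as the mean grid coordinate of each cluster's voxels.
# Names are illustrative; this is not the neuroim2 implementation.
dims   <- c(4, 4, 4)
coords <- as.matrix(expand.grid(x = seq_len(dims[1]),
                                y = seq_len(dims[2]),
                                z = seq_len(dims[3])))   # N x 3 voxel indices

# Two clusters: the left half (x <= 2) and the right half (x > 2) of the grid
cluster_id <- ifelse(coords[, "x"] <= 2, 1L, 2L)

coord_sums   <- rowsum(coords, group = cluster_id)            # K x 3 coordinate sums
centers_demo <- coord_sums / as.vector(table(cluster_id))     # K x 3 mean coordinates

centers_demo
```

Here the left-half cluster has its centroid at (1.5, 2.5, 2.5) and the right-half cluster at (3.5, 2.5, 2.5).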
Performance benefits
By storing only K time-series instead of N voxels:
- Memory usage: O(K × T) instead of O(N × T)
- Searchlight operations: O(K²) instead of O(N²)
- Typical reduction: 100-1000x fewer time-series
For a typical fMRI dataset:
- Voxel-level: ~200,000 voxels × 500 timepoints = 100M values
- Parcel-level: 400 parcels × 500 timepoints = 200K values
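The figures above can be checked with a few lines of arithmetic. The sketch assumes 8-byte double-precision storage and decimal megabytes, which are assumptions about the in-memory representation rather than measurements of the package.

```r
# Sketch only: back-of-the-envelope storage comparison, assuming 8-byte doubles.
n_vox     <- 200000
n_parcels <- 400
n_time    <- 500
bytes_per_value <- 8

voxel_values  <- n_vox * n_time        # 1e8 values (100M)
parcel_values <- n_parcels * n_time    # 2e5 values (200K)

voxel_mb  <- voxel_values  * bytes_per_value / 1e6   # 800 MB
parcel_mb <- parcel_values * bytes_per_value / 1e6   # 1.6 MB

reduction <- n_vox / n_parcels         # 500x fewer time-series
c(voxel_mb = voxel_mb, parcel_mb = parcel_mb, reduction = reduction)
```

A 500x reduction falls within the 100-1000x range quoted above.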
Summary
ClusteredNeuroVec provides:
- Efficient storage for parcellated 4D data
- Full array-like access semantics
- Cluster-aware searchlight operations
- Seamless integration with existing neuroim2 workflows
It’s ideal for:
- Atlas-based analyses
- Connectivity studies
- Parcellated machine learning
- Any workflow that aggregates voxels to regions