CPCA filtering in DKGE provides a principled framework for decomposing task-related variance into interpretable subcomponents before fitting latent bases. When your experimental design includes multiple types of effects—such as main effects versus interactions, experimental conditions versus control baselines, or different cognitive domains—CPCA allows you to analyze these effect types separately while maintaining their mathematical relationships.
The method operates directly in the effect space defined by the design kernel K, ensuring that the resulting bases remain K-orthogonal and preserve the interpretability of your experimental design structure. This vignette demonstrates what CPCA filtering accomplishes, provides guidance on when it proves most valuable, and walks through practical implementation strategies.
When and Why to Use CPCA Filtering
CPCA filtering becomes valuable when your experimental design naturally contains multiple types of task effects that warrant separate analysis. Rather than analyzing all effects together in a single latent space, CPCA allows you to focus your analysis on specific effect types while cleanly separating others.
Common scenarios where CPCA proves beneficial:
Factorial designs: Separate main effects from interaction terms to understand how basic experimental manipulations differ from their combined effects. For example, in a 2×2 design studying attention and working memory, you might isolate the main effects of each factor from their interaction.
Experimental versus control conditions: Focus analysis on your experimental manipulations while factoring out baseline or control conditions. This approach can reveal cleaner patterns in your conditions of interest.
Multi-domain studies: When studying different cognitive processes within the same experiment, separate effects related to different domains (e.g., working memory versus attention) to understand domain-specific versus shared neural mechanisms.
Planned versus exploratory contrasts: Isolate your primary hypotheses from exploratory or secondary analyses, ensuring that your main effects of interest receive focused statistical attention.
How CPCA Filtering Works
The compressed covariance that DKGE analyzes contains variance from all experimental effects mixed together. CPCA filtering mathematically separates this total variance into distinct subcomponents before eigendecomposition. The process involves several key steps that preserve the mathematical relationships defined by your design kernel K.
First, DKGE constructs a projector onto your chosen effect subspace in the K metric through dkge_projector_K(). This projector respects the similarity relationships between experimental effects that are encoded in your design kernel. Next, dkge_cpca_split_chat() applies this projector to split the compressed covariance into design and residual components. Finally, DKGE fits separate bases for the components you request while preserving K-orthogonality between the returned bases, ensuring they can be analyzed jointly.
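In matrix terms, these steps can be sketched with base R alone. The snippet below assumes the standard K-metric projector form T (TᵀKT)⁻¹ TᵀK and a symmetric sandwich split of the compressed covariance; the actual internals of dkge_projector_K() and dkge_cpca_split_chat() may differ in detail.

```r
# Base-R sketch of a K-metric projector and the resulting covariance split.
# Assumes the projector has the standard form T (T'KT)^{-1} T'K; the real
# dkge_projector_K() / dkge_cpca_split_chat() internals may differ.
set.seed(1)
q <- 4
K <- diag(q)                                 # design kernel (identity here)
T_basis <- diag(q)[, 1:2]                    # design-aligned subspace: effects 1-2
P <- T_basis %*% solve(t(T_basis) %*% K %*% T_basis) %*% t(T_basis) %*% K
Chat <- crossprod(matrix(rnorm(20 * q), 20, q))   # toy compressed covariance
Chat_design <- P %*% Chat %*% t(P)
Chat_resid  <- (diag(q) - P) %*% Chat %*% t(diag(q) - P)
max(abs(P %*% P - P))    # projector is idempotent, so this is ~0
```

With an identity kernel and axis-aligned blocks, P simply zeroes out the unselected effects; non-identity kernels make the projector oblique in the ordinary Euclidean sense while keeping it self-adjoint in the K metric.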
You can specify the design-aligned subspace either by naming specific
effects through cpca_blocks or by providing an explicit
basis matrix via cpca_T. The cpca_part
argument determines which filtered components are returned
("design", "resid", or "both"),
while cpca_ridge optionally adds regularization to
stabilize small eigenvalues during decomposition.
Simulated Experiment: Attention-Working Memory Study
To demonstrate CPCA filtering with realistic neuroimaging scenarios, we simulate data from a factorial attention-working memory experiment. Our design includes two main effects (attention cue validity and working memory load), their interaction, plus additional control conditions. This structure naturally lends itself to CPCA analysis, where we might want to isolate the main experimental effects from their interaction and control conditions.
The simulated dataset contains strong signals in the primary experimental manipulations (attention and working memory main effects) and weaker but meaningful variance in the interaction term and control conditions. This mirrors real neuroimaging studies where primary effects of interest typically show stronger and more consistent patterns than secondary effects.
S <- 8
q <- 6
P <- 16
Tlen <- 80
effects <- c("attn_valid", "attn_invalid", "wmem_high", "wmem_low", "interact", "control")
betas <- replicate(S, {
  # Strong signals for main experimental effects
  main_effects <- matrix(rnorm(2 * P, sd = 1.5), 2, P)
  # Weaker signals for interaction and control conditions
  secondary_effects <- matrix(rnorm((q - 2) * P, sd = 0.4), q - 2, P)
  mat <- rbind(main_effects, secondary_effects)
  rownames(mat) <- effects
  mat
}, simplify = FALSE)
designs <- replicate(S, {
  X <- matrix(rnorm(Tlen * q), Tlen, q)
  X <- qr.Q(qr(X))
  colnames(X) <- effects
  X
}, simplify = FALSE)
subjects <- Map(function(b, X, id) dkge_subject(b, X, id = id),
                betas, designs, paste0("sub", seq_len(S)))
bundle <- dkge_data(subjects)
When we fit the standard DKGE model without CPCA filtering, all task-related variance is compressed into a single latent basis. The leading eigenvalues reflect the combined influence of both our primary experimental effects and secondary conditions, making it difficult to isolate the specific patterns we want to study.
fit_plain <- dkge(bundle, K = diag(q), rank = 3)
round(fit_plain$evals[1:4], 3)
#> [1] 2507 2107 209 180
These eigenvalues represent the mixed signal from all experimental conditions. While this standard approach captures the dominant patterns in the data, it doesn’t allow us to focus specifically on our primary experimental manipulations versus their interactions and control conditions.
Isolating Primary Experimental Effects
Now we apply CPCA filtering to separate our primary experimental
effects (attention and working memory main effects) from the secondary
effects (interaction and control conditions). We achieve this by
identifying the first two effects in our design as the “design-aligned”
subspace through cpca_blocks = 1:2.
Setting cpca_part = "both" instructs DKGE to return both
the design-aligned basis (focused on our primary effects) and the
residual basis (containing the secondary effects). This dual analysis
allows us to examine both effect types while maintaining their
mathematical independence.
fit_cpca <- dkge(bundle,
                 K = diag(q),
                 cpca_blocks = 1:2,
                 cpca_part = "both",
                 rank = 3)
#> Warning: Requested rank 3 exceeds effective rank 2. Reducing to 2 components.
fit_cpca$cpca$part
#> [1] "both"
round(fit_cpca$cpca$evals_design[1:3], 3)
#> [1] 2503 2095 0
round(fit_cpca$cpca$evals_resid[1:3], 3)
#> [1] 213 182 157
Notice how CPCA filtering has cleanly separated the variance components. The design eigenvalues now reflect only the primary experimental effects we specified, while the residual eigenvalues capture the secondary effects including interactions and control conditions. This separation allows for focused analysis of each effect type.
The mathematical beauty of this approach lies in the preservation of K-orthogonality between the design and residual bases. This orthogonality ensures that the two sets of components are mathematically independent in the design kernel metric, preventing any contamination between primary and secondary effect patterns:
Ud <- fit_cpca$cpca$U_design
Ur <- fit_cpca$cpca$U_resid
round(max(abs(t(Ud) %*% fit_cpca$K %*% Ur)), 6)
#> [1] 0
Influence of the Design Kernel
The split is metric-aware: changing the design kernel K alters which directions count as “design-aligned.” A smooth kernel diffuses the projector across neighbouring rows, so design energy leaks into adjacent effects.
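This leakage can be seen in the projector itself. As a quick base-R illustration (again assuming the standard K-metric projector form T (TᵀKT)⁻¹ TᵀK, which is not guaranteed to match the package internals exactly):

```r
# Under a smooth AR(1)-style kernel, the K-metric projector onto effects 1-2
# carries weight into columns 3:6 -- design energy leaks into neighbours.
q <- 6
K_smooth <- outer(seq_len(q), seq_len(q), function(i, j) 0.7^abs(i - j))
T_basis <- diag(q)[, 1:2]
P_smooth <- T_basis %*% solve(t(T_basis) %*% K_smooth %*% T_basis) %*%
  t(T_basis) %*% K_smooth
round(P_smooth[1:2, 3:6], 3)   # second row carries geometrically decaying weight
```

Here the AR(1) structure makes the second projector row pick up weight 0.7^(j-2) from each later effect j, whereas under the identity kernel every entry outside the first two columns would be exactly zero.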
K_smooth <- outer(seq_len(q), seq_len(q), function(i, j) 0.7^abs(i - j))
fit_kernel <- dkge(bundle,
                   K = K_smooth,
                   cpca_blocks = 1:2,
                   cpca_part = "both",
                   rank = 3)
round(fit_kernel$cpca$evals_design[1:3], 3)
#> [1] 3273 819 0
round(fit_kernel$cpca$evals_resid[1:3], 3)
#> [1] 309.0 97.9 47.6
Compared with the identity kernel, the smoother kernel draws more
variance into the design-aligned slice and slightly spreads the
corresponding loadings across adjacent effects. The projector honours
the correlation structure encoded by K_smooth, so your
choice of kernel directly shapes which latent directions are considered
design-driven. When effects do not align with coordinate axes, pass a
custom cpca_T that expresses the intended K-weighted span
explicitly.
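One way to build such a basis is to K-orthonormalise your raw contrasts before passing them in. The sketch below uses base R and assumes cpca_T is interpreted in the K metric; the contrast vectors are purely illustrative.

```r
# K-orthonormalise two raw contrasts so that t(T_K) %*% K %*% T_K = I.
q <- 6
K_smooth <- outer(seq_len(q), seq_len(q), function(i, j) 0.7^abs(i - j))
V <- cbind(c(1, 1, 0, 0, 0, 0),    # pooled attention effects (illustrative)
           c(0, 0, 2, 1, 0, 0))    # weighted working-memory effects (illustrative)
G <- t(V) %*% K_smooth %*% V       # Gram matrix of the contrasts in the K metric
T_K <- V %*% solve(chol(G))        # whiten: columns become K-orthonormal
round(t(T_K) %*% K_smooth %*% T_K, 10)   # identity matrix
```

With the identity kernel this reduces to ordinary orthonormalisation, matching the qr.Q() approach used elsewhere in this vignette.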
Behind the scenes, DKGE accomplishes this separation through
mathematical projectors that operate in the design kernel metric. The
function dkge_cpca_split_chat() applies these projectors to
split the compressed covariance matrix before eigendecomposition. You
can examine this split directly to understand how the total variance is
partitioned:
T_design <- diag(1, q)[, 1:2]
split_plain <- dkge_cpca_split_chat(fit_plain$Chat, T_design, fit_plain$K)
round(diag(split_plain$Chat_design), 3)
#> [1] 2301 2297 0 0 0 0
round(diag(split_plain$Chat_resid), 3)
#> [1] 0 0 145 209 158 175
The diagonal elements show how variance is distributed between design and residual components. Notice that when cpca_part = "design" or "both", the fitted model’s compressed covariance matrix (fit_cpca$Chat) equals the design-filtered component, confirming that the analysis focuses specifically on your chosen effects.
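The zero pattern in those diagonals follows directly from the projector algebra. A base-R check (assuming the standard K-metric projector form T (TᵀKT)⁻¹ TᵀK; the package internals may differ): with an identity kernel and axis-aligned blocks, the design-filtered covariance vanishes outside the selected block, and filtering it a second time changes nothing.

```r
# With an identity kernel and axis-aligned blocks, the design-filtered
# covariance is zero outside the selected 2x2 block, and filtering it
# again is a no-op (the projector is idempotent).
set.seed(2)
q <- 6
K <- diag(q)
T_design <- diag(q)[, 1:2]
P <- T_design %*% solve(t(T_design) %*% K %*% T_design) %*% t(T_design) %*% K
Chat <- crossprod(matrix(rnorm(30 * q), 30, q))
Chat_design <- P %*% Chat %*% t(P)
max(abs(Chat_design[-(1:2), ]))                      # ~0: nothing outside the block
max(abs(P %*% Chat_design %*% t(P) - Chat_design))   # ~0: re-filtering is a no-op
```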
Using Custom Effect Combinations
Sometimes your effects of interest don’t correspond to simple subsets
of experimental conditions. In such cases, you can provide a custom
basis matrix through cpca_T to specify exactly which
combinations of effects should be treated as “design-aligned.”
The columns of your custom basis matrix span the linear combinations of effects you want to analyze together. For example, you might want to combine specific experimental conditions or weight certain effects more heavily than others. Here we demonstrate by creating a custom basis that combines multiple conditions with differential weighting:
T_custom <- qr.Q(qr(cbind(c(1, 1, 0, 0, 0, 0),
                          c(0, 0, 2, 1, 0, 0))))
fit_custom <- dkge(bundle,
                   K = diag(q),
                   cpca_T = T_custom,
                   cpca_part = "design",
                   rank = 2)
round(fit_custom$cpca$evals_design[1:2], 3)
#> [1] 2095 150
This custom basis creates two design components: the first combines the attention conditions equally, while the second emphasizes the working memory conditions with a higher weight on the high-load condition. The fitted loadings reflect these chosen combinations, while the residual component is omitted since we requested only the design-aligned analysis.
Numerical Stabilization with Ridge Regularization
When working with real neuroimaging data, you may encounter
situations where the filtered covariance matrix becomes nearly
rank-deficient, leading to numerical instability during
eigendecomposition. The optional cpca_ridge parameter
addresses this issue by adding a small diagonal ridge term before
eigendecomposition, improving numerical stability without substantially
altering the results.
fit_ridge <- dkge(bundle,
                  K = diag(q),
                  cpca_blocks = 1:2,
                  cpca_part = "design",
                  cpca_ridge = 1e-3,
                  rank = 3)
diag_shift <- diag(fit_ridge$cpca$Chat_design - fit_ridge$cpca$Chat_design_raw)
round(head(diag_shift), 6)
#> [1] 0.001 0.001 0.001 0.001 0.001 0.001
The ridge regularization adds the specified value to each diagonal
element of the covariance matrix, as shown by the consistent shift
across diagonal entries. The original unregularized matrix remains
available in Chat_design_raw for comparison and diagnostic
purposes.
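The same uniform shift shows up in the spectrum: adding ridge * I to a symmetric matrix raises every eigenvalue by exactly the ridge value, which is what lifts near-zero eigenvalues away from numerical trouble without reordering the components. A base-R illustration, independent of the package:

```r
# Adding ridge * I to a symmetric matrix shifts every eigenvalue by ridge,
# stabilising near-zero eigenvalues of a rank-deficient covariance.
set.seed(3)
q <- 4
C <- crossprod(matrix(rnorm(3 * q), 3, q))   # rank-deficient: smallest eigenvalue 0
ridge <- 1e-3
ev  <- eigen(C, symmetric = TRUE)$values
evr <- eigen(C + ridge * diag(q), symmetric = TRUE)$values
round(evr - ev, 6)   # each eigenvalue shifted by exactly 0.001
```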
Simplified Interface for CPCA Analysis
For workflows that focus exclusively on CPCA filtering, the
dkge_cpca_fit() function provides a streamlined interface
that reduces code verbosity. This convenience wrapper handles the
CPCA-specific arguments while forwarding all other parameters to the
main dkge() function, making your analysis code more
concise and readable.
fit_wrapper <- dkge_cpca_fit(bundle,
                             K = diag(q),
                             cpca_blocks = 1:2,
                             cpca_part = "design",
                             rank = 3)
#> Warning: Requested rank 3 exceeds effective rank 2. Reducing to 2 components.
identical(round(fit_wrapper$U, 6), round(fit_cpca$U, 6))
#> [1] TRUE
Summary: Strategic Applications of CPCA Filtering
CPCA filtering proves most valuable when your research questions naturally call for decomposing task-related variance into distinct components. The method excels in several key scenarios:
Factorial experimental designs benefit from CPCA when you need to separate main effects from their interactions, allowing for cleaner interpretation of basic experimental manipulations versus their combined effects.
Multi-domain cognitive studies can use CPCA to isolate domain-specific effects (such as working memory versus attention) while maintaining mathematical independence between cognitive systems.
Hypothesis-driven analyses gain power when CPCA focuses the statistical analysis on planned contrasts while factoring out exploratory or control conditions that might dilute the signal of interest.
Comparative studies across datasets become more interpretable when CPCA ensures that the same types of effects are analyzed consistently, improving the reliability of cross-study comparisons.
The mathematical foundation of CPCA filtering ensures that this decomposition preserves interpretability while maintaining compatibility with all other DKGE tools. Whether you proceed with contrast testing, bootstrap inference, or visualization, the K-orthogonal components can be analyzed using the full DKGE toolkit without modification.
By directing the eigendecomposition toward your specific research questions, CPCA filtering transforms a general-purpose dimension reduction into a targeted analytical tool that respects both your experimental design and your theoretical hypotheses.