Anchored MFA generalizes Multiple Factor Analysis (MFA) to the case where a single reference block `Y` (with `N` rows) is linked to multiple blocks `X_k` that may have different numbers of rows. Each row of `X_k` is mapped to a row of `Y` via an index vector `row_index[[k]]`. The model estimates a shared score matrix `S` for the rows of `Y` and block-specific loading matrices for `Y` and each `X_k`.

Optionally, a feature prior can be provided to encourage corresponding or similar features across different `X_k` blocks to have similar loading vectors.

Usage

anchored_mfa(
  Y,
  X,
  row_index,
  preproc = multivarious::center(),
  ncomp = 2,
  normalization = c("MFA", "None", "custom"),
  alpha = NULL,
  score_constraint = c("none", "orthonormal"),
  feature_groups = NULL,
  feature_lambda = 0,
  max_iter = 50,
  tol = 1e-06,
  ridge = 1e-08,
  verbose = FALSE,
  use_future = FALSE,
  ...
)

linked_mfa(
  Y,
  X,
  row_index,
  preproc = multivarious::center(),
  ncomp = 2,
  normalization = c("MFA", "None", "custom"),
  alpha = NULL,
  score_constraint = c("none", "orthonormal"),
  feature_groups = NULL,
  feature_lambda = 0,
  max_iter = 50,
  tol = 1e-06,
  ridge = 1e-08,
  verbose = FALSE,
  use_future = FALSE,
  ...
)

Arguments

Y

Numeric matrix/data.frame (`N × q`) serving as the reference block.

X

A list of numeric matrices/data.frames. Each element `X[[k]]` is `n_k × p_k`.

row_index

A list of integer vectors. `row_index[[k]]` has length `n_k` and maps rows of `X[[k]]` to rows of `Y` (values in `1..N`).
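As a minimal sketch of what a valid `row_index` looks like, the base-R snippet below builds toy blocks and checks the documented constraints (one index vector per block, length `n_k`, values in `1..N`). The helper `check_row_index()` and the toy data are hypothetical and are not part of the package.

```r
# Toy setup (hypothetical): N = 6 reference rows, two linked blocks.
set.seed(1)
N <- 6
X <- list(
  X1 = matrix(rnorm(4 * 3), 4, 3),  # n_1 = 4 rows
  X2 = matrix(rnorm(5 * 2), 5, 2)   # n_2 = 5 rows
)
row_index <- list(
  X1 = c(1L, 2L, 4L, 6L),
  X2 = c(2L, 3L, 4L, 5L, 6L)
)

# Hypothetical helper: checks the constraints stated for `row_index`.
check_row_index <- function(X, row_index, N) {
  stopifnot(length(X) == length(row_index))
  for (k in seq_along(X)) {
    idx <- row_index[[k]]
    stopifnot(
      length(idx) == nrow(X[[k]]),   # one index per row of X[[k]]
      !anyNA(idx),
      all(idx == as.integer(idx)),   # whole numbers only
      all(idx >= 1L & idx <= N)      # must point at an existing row of Y
    )
  }
  invisible(TRUE)
}

ok <- check_row_index(X, row_index, N)
```

Running a check like this before fitting can surface off-by-one or out-of-range indices early.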

preproc

A `multivarious` preprocessing pipeline (a `pre_processor`/`prepper`) or a list of them. If a list, it must have length `1 + length(X)` and will be applied to `c(list(Y), X)` in that order.

ncomp

Integer number of components to extract.

normalization

Block weighting scheme. `"MFA"` weights each block by the inverse of its squared first singular value; `"None"` gives every block the same weight; `"custom"` uses the weights supplied in `alpha`.
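The `"MFA"` weighting can be illustrated in base R. The sketch below assumes the weights are computed on column-centered blocks (mirroring the default `preproc`) and that a weight multiplies the block's squared residuals; the package's internal computation may differ in detail.

```r
set.seed(1)
blocks <- list(
  Y  = matrix(rnorm(30 * 5), 30, 5),
  X1 = matrix(rnorm(20 * 10), 20, 10)
)

# Column-center each block (assumption: same effect as the default centering).
centered <- lapply(blocks, scale, center = TRUE, scale = FALSE)

# MFA-style weight: inverse of the squared first singular value of each block.
alpha <- vapply(
  centered,
  function(M) 1 / svd(M, nu = 0, nv = 0)$d[1]^2,
  numeric(1)
)

# Scaling a block by sqrt(alpha) gives it a leading singular value of 1,
# so no single block dominates the first component.
lead <- vapply(
  names(centered),
  function(k) svd(sqrt(alpha[[k]]) * centered[[k]], nu = 0, nv = 0)$d[1],
  numeric(1)
)
```

A vector computed this way could also be passed as `alpha` with `normalization = "custom"` as a starting point for hand-tuned weights.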

alpha

Optional numeric vector of per-block weights (length `1 + length(X)`), used when `normalization = "custom"`. The first weight corresponds to `Y`.

score_constraint

Identification strategy for the shared score matrix. `"none"` uses the historical unconstrained update followed by QR normalization inside each ALS iteration. `"orthonormal"` treats `t(S) %*% S = I` as part of the model and updates `S` with a constrained majorization/polar step.
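The two orthonormalization devices named above can be compared in base R. This is only an illustration of QR normalization versus a polar factor on a stand-in matrix; which matrix the package actually feeds into its polar step is not stated here, so `S_raw` is purely hypothetical.

```r
set.seed(1)
S_raw <- matrix(rnorm(30 * 2), 30, 2)  # stand-in for an unnormalized score update

# (a) QR normalization: an orthonormal basis for the column space of S_raw.
S_qr <- qr.Q(qr(S_raw))

# (b) Polar factor: the matrix with orthonormal columns closest to S_raw
#     in Frobenius norm, obtained from the thin SVD as U %*% t(V).
sv <- svd(S_raw)
S_polar <- sv$u %*% t(sv$v)

# Both satisfy t(S) %*% S = I up to floating-point error.
err_qr    <- max(abs(crossprod(S_qr) - diag(2)))
err_polar <- max(abs(crossprod(S_polar) - diag(2)))
```

Both results span the same column space; they differ by a rotation within that space, which is why the choice acts as an identification convention rather than a change to the fitted subspace.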

feature_groups

Feature prior specification. One of:

* `NULL` (no feature prior),
* `"colnames"` to group X-features with identical column names across blocks,
* a `data.frame` with columns `block`, `feature`, `group` and optional `weight`. `block` refers to a name or index in `X` (not including `Y`), and `feature` is a column name or index within that block.
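A `data.frame` specification might look like the following sketch. The block names, feature names, and group labels are hypothetical, and using character group labels is an assumption; only the column names `block`, `feature`, `group`, and `weight` come from the description above.

```r
# Hypothetical grouping: two features per group, one from each X block.
feature_groups <- data.frame(
  block   = c("X1", "X2", "X1", "X2"),                  # names of elements of X
  feature = c("geneA", "geneA", "geneB", "geneB_alt"),  # column names within each block
  group   = c("A", "A", "B", "B"),                      # features sharing a group are pulled together
  weight  = c(1, 1, 0.5, 0.5),                          # optional per-feature weight
  stringsAsFactors = FALSE
)
```

Group `"B"` here links two differently named columns, which is the case that `"colnames"` alone cannot express.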

feature_lambda

Non-negative scalar controlling strength of the feature prior.

max_iter

Maximum number of alternating least-squares iterations.

tol

Relative tolerance on the objective for convergence.
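A relative-tolerance stopping rule of this kind is commonly written as below. The exact rule used by the package is not documented here, so the function `converged()` is a hypothetical sketch of one conventional form.

```r
# Hypothetical stopping rule: stop when the objective changes by less than
# `tol` relative to its previous value.
converged <- function(obj_prev, obj_new, tol = 1e-6) {
  abs(obj_prev - obj_new) <= tol * max(abs(obj_prev), .Machine$double.eps)
}

small_change <- converged(1, 1 - 1e-8)  # change well below tol
large_change <- converged(1, 0.9)       # 10% change
```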

ridge

Non-negative ridge stabilization added to normal equations.
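To show where such a ridge term typically enters, the sketch below solves ridge-stabilized normal equations for a loading matrix given fixed scores. It assumes a plain least-squares loading update; the package's actual update may include block weights and other terms.

```r
set.seed(1)
S <- matrix(rnorm(30 * 2), 30, 2)  # fixed scores, N x ncomp
Y <- matrix(rnorm(30 * 5), 30, 5)  # reference block, N x q
ridge <- 1e-8

# Normal equations with a small ridge on the diagonal:
# (t(S) %*% S + ridge * I) %*% t(B) = t(S) %*% Y
G <- crossprod(S) + ridge * diag(ncol(S))
B <- t(solve(G, crossprod(S, Y)))  # q x ncomp loadings

# With a tiny ridge the result is close to ordinary least squares.
B_ols <- t(qr.solve(S, Y))
```

The ridge mainly matters when `crossprod(S)` is near-singular, for example when two score columns become almost collinear during the iterations.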

verbose

Logical; if `TRUE`, prints iteration progress.

use_future

Logical; if `TRUE`, block-wise computations that do not depend on one another (initial block-local preprocessing helpers and the final partial-scores assembly) are performed via `furrr::future_map()` when available. The main alternating-least-squares loop is intrinsically sequential and is unaffected. Accepted here primarily for interface parity with [anchored_mcca()].

...

Unused (reserved for future extensions).

Value

An object inheriting from `multivarious::multiblock_biprojector` with additional classes `"anchored_mfa"` and `"linked_mfa"`. The object contains global anchor scores in `s` (and alias `S`), concatenated loadings in `v`, and block mappings in `block_indices`. Additional fields include `V_list`, `B`, `row_index`, exact per-block mapped scores in `Z_list`, `score_index`, `alpha_blocks`, and `objective_trace`.

Details

## Model

The fitted model has the form:

$$Y \approx S B^\top$$

$$X_k \approx S[\mathrm{idx}_k,] V_k^\top$$

where `S` is `N × ncomp`, `B` is `q × ncomp`, each `V_k` is `p_k × ncomp`, and `idx_k` is `row_index[[k]]`. The score matrix can be identified either with the historical unconstrained/QR update (`score_constraint = "none"`) or with an explicit orthonormal constraint (`score_constraint = "orthonormal"`).
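The model equations can be traced in base R. The sketch below builds the two reconstructions from hypothetical factors and evaluates a block-weighted sum of squared residuals; treating that weighted sum as the objective is an assumption, and `wls_objective()` is not a package function.

```r
set.seed(1)
N <- 30; ncomp <- 2
S    <- qr.Q(qr(matrix(rnorm(N * ncomp), N, ncomp)))  # N x ncomp scores
B    <- matrix(rnorm(5 * ncomp), 5, ncomp)            # q x ncomp loadings for Y
V1   <- matrix(rnorm(10 * ncomp), 10, ncomp)          # p_1 x ncomp loadings for X_1
idx1 <- sample.int(N, 20)                             # maps rows of X_1 to rows of Y

Y_hat  <- S %*% t(B)                         # N x q reconstruction of Y
X1_hat <- S[idx1, , drop = FALSE] %*% t(V1)  # n_1 x p_1 reconstruction of X_1

# Hypothetical objective: block-weighted sum of squared residuals.
wls_objective <- function(Y, X1, S, B, V1, idx1, alpha = c(1, 1)) {
  r_y <- Y - S %*% t(B)
  r_x <- X1 - S[idx1, , drop = FALSE] %*% t(V1)
  alpha[1] * sum(r_y^2) + alpha[2] * sum(r_x^2)
}

obj_exact <- wls_objective(Y_hat, X1_hat, S, B, V1, idx1)  # noiseless data
obj_noisy <- wls_objective(Y_hat + 0.1, X1_hat, S, B, V1, idx1)
```

Note that only the rows of `S` selected by `idx1` contribute to the `X_1` term, which is how blocks with fewer rows than `Y` still inform the shared scores.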

## Feature similarity prior (v1)

When `feature_lambda > 0` and `feature_groups` is supplied, Anchored MFA applies a group-shrinkage penalty that pulls the loading vectors of features in the same group toward a shared group center.
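One plausible form of such a group-shrinkage penalty is the weighted squared distance of each loading vector to its group's weighted mean. The sketch below assumes that form; the exact penalty used by the package is not spelled out here, and `group_penalty()` and the toy loadings are hypothetical.

```r
# Hypothetical loadings (rows) of three features in one group, ncomp = 2.
V_group <- rbind(c(1.0, 0.2), c(0.8, 0.0), c(1.2, 0.4))
w <- c(1, 1, 1)  # per-feature weights

# Assumed group center: weighted mean of the loading vectors.
center <- colSums(w * V_group) / sum(w)

# Assumed penalty: lambda * sum_i w_i * ||v_i - center||^2.
group_penalty <- function(V, center, w, lambda) {
  lambda * sum(w * rowSums(sweep(V, 2, center)^2))
}

pen_before <- group_penalty(V_group, center, w, lambda = 2)

# Moving every loading halfway toward the center reduces the penalty.
V_shrunk  <- sweep(0.5 * sweep(V_group, 2, center), 2, center, "+")
pen_after <- group_penalty(V_shrunk, center, w, lambda = 2)
```

Under this form, a larger `feature_lambda` trades reconstruction accuracy for more similar loadings within each group, and `feature_lambda = 0` switches the prior off.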

`linked_mfa()` is a legacy alias for [anchored_mfa()] retained for backward compatibility.

Examples

# \donttest{
set.seed(1)
N <- 30
Y <- matrix(rnorm(N * 5), N, 5)
X1 <- matrix(rnorm(20 * 10), 20, 10)
X2 <- matrix(rnorm(15 * 8), 15, 8)
idx1 <- sample.int(N, nrow(X1), replace = FALSE)
idx2 <- sample.int(N, nrow(X2), replace = FALSE)

fit <- anchored_mfa(Y, list(X1 = X1, X2 = X2), list(X1 = idx1, X2 = idx2), ncomp = 2)
#> Applying the same preprocessor definition independently to each block.
stopifnot(nrow(multivarious::scores(fit)) == N)
# }