
This helper function generates several blocks of multivariate data that share a common set of latent factor scores. Optionally, the variables can be placed on the unit sphere, which yields spatial coordinates and a sparse k-nearest-neighbour graph. The returned object also contains the ground-truth loadings and scores used in the simulation.

Usage

synthetic_multiblock(
  S = 5,
  n = 100,
  p = 200,
  r = 3,
  sigma = 0.1,
  sphere = FALSE,
  k_nn = 6,
  seed = 1
)

Arguments

S

Number of subjects/blocks to generate (default 5).

n

Number of observations (rows) per block (default 100).

p

Number of variables (columns) per block, or a vector of length S specifying different dimensions per block (default 200).

r

The rank of the shared component structure (default 3).

sigma

The standard deviation of the noise added to the data (default 0.1).

sphere

Logical; if TRUE, variables are placed on the unit sphere and a k-nearest-neighbour graph is computed (default FALSE).

k_nn

Number of nearest neighbours used to build the k-nearest-neighbour graph when `sphere = TRUE` (default 6).

seed

Random seed for reproducibility (default 1).
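
To illustrate what `sphere` and `k_nn` control, the following is a minimal base-R sketch of one way to place `p` variables on the unit sphere and build a symmetric k-nearest-neighbour graph. The function name `sketch_sphere_knn`, the uniform sampling scheme, the symmetrisation rule, and the dense unnormalised Laplacian are all assumptions made for illustration; the actual function may construct its graph differently and returns a sparse matrix.

```r
# Hypothetical sketch, not the package's implementation.
# Points are drawn uniformly on the unit sphere by normalising Gaussian vectors.
sketch_sphere_knn <- function(p = 200, k_nn = 6, seed = 1) {
  set.seed(seed)
  xyz <- matrix(rnorm(p * 3), p, 3)
  coords <- xyz / sqrt(rowSums(xyz^2))   # scale each row to unit length

  d <- as.matrix(dist(coords))           # pairwise Euclidean (chordal) distances
  diag(d) <- Inf                         # a point is not its own neighbour

  A <- matrix(0, p, p)
  for (i in seq_len(p)) {
    A[i, order(d[i, ])[seq_len(k_nn)]] <- 1   # mark the k_nn closest points
  }
  A <- pmax(A, t(A))                     # assumed rule: edge if either side lists the other

  L <- diag(rowSums(A)) - A              # unnormalised graph Laplacian (degree - adjacency)
  list(coords = coords, A = A, L = L)
}

g <- sketch_sphere_knn(p = 50, k_nn = 4)
range(rowSums(g$coords^2))               # squared norms of the coordinates
```

Because the graph is symmetrised, a variable can end up with more than `k_nn` neighbours. For large `p`, a sparse representation (for example from the Matrix package) would be preferable to the dense matrices used here.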

Value

A list containing:

data_list

A list of data matrices, one per block.

coords_list

Coordinates on the unit sphere (if `sphere = TRUE`), otherwise NULL.

V_true

List of ground-truth loading matrices.

F_true

Ground-truth factor score matrix.

Sadj

Spatial adjacency Laplacian matrix of the k-nearest-neighbour graph (if `sphere = TRUE`), otherwise NULL.
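
The relationship between `data_list`, `V_true`, and `F_true` can be sketched in base R as follows. This assumes each block is generated as shared scores times block-specific loadings plus Gaussian noise, that is `X_s = F %*% t(V_s) + noise`, with orthonormal loadings; the function name `sketch_multiblock` and these modelling details are assumptions for illustration and may differ from what `synthetic_multiblock()` does internally.

```r
# Hypothetical sketch, not the package's implementation.
# Assumed model: X_s = F %*% t(V_s) + E_s, with F shared by all blocks
# and E_s Gaussian noise with standard deviation sigma.
sketch_multiblock <- function(S = 5, n = 100, p = 200, r = 3,
                              sigma = 0.1, seed = 1) {
  set.seed(seed)
  p <- rep_len(p, S)                     # a scalar p is recycled to every block

  F_true <- matrix(rnorm(n * r), n, r)   # shared factor scores (n x r)

  V_true <- lapply(p, function(pk) {
    # orthonormal loadings via QR (an assumption; needs pk >= r)
    qr.Q(qr(matrix(rnorm(pk * r), pk, r)))
  })

  data_list <- lapply(seq_len(S), function(s) {
    noise <- matrix(rnorm(n * p[s], sd = sigma), n, p[s])
    F_true %*% t(V_true[[s]]) + noise
  })

  list(data_list = data_list, V_true = V_true, F_true = F_true)
}

sim <- sketch_multiblock(S = 3, n = 50, p = c(20, 30, 40), r = 2)
sapply(sim$data_list, dim)               # one column per block: rows, then columns
```

Keeping `F_true` and `V_true` alongside the data makes it possible to compare estimated scores and loadings against the values that generated each block.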