Implements the algorithm of Allen *et al.* (2013) for supervised dimension-reduction with optional sparsity (\(\ell_1\)) or ridge (\(\ell_2\)) penalties **and** the generalised extension that operates in a user-supplied quadratic form \(Q\).

Usage

fit_rpls(
  X,
  Y,
  K = 2,
  lambda = 0.1,
  penalty = c("l1", "ridge"),
  Q = NULL,
  nonneg = FALSE,
  tol = 1e-06,
  maxiter = 200,
  verbose = FALSE
)

rpls(
  X,
  Y,
  K = 2,
  lambda = 0.1,
  penalty = c("l1", "ridge"),
  Q = NULL,
  nonneg = FALSE,
  preproc_x = multivarious::pass(),
  preproc_y = multivarious::pass(),
  tol = 1e-06,
  maxiter = 200,
  verbose = FALSE,
  ...
)

Arguments

X

Numeric matrix \((n \times p)\) — predictors.

Y

Numeric matrix \((n \times q)\) — responses.

K

Integer, number of latent factors to extract. Default `2`.

lambda

Scalar or length-K numeric vector of penalties.

penalty

Either `"l1"` (lasso) or `"ridge"`.

Q

Optional positive-(semi)definite \(p \times p\) matrix inducing *generalised* PLS. `NULL` ⇒ identity.

nonneg

Logical; if `TRUE`, force non-negative loadings when `penalty = "l1"`. This option is currently ignored when `penalty = "ridge"`.

tol

Relative tolerance for the convergence check of the inner iterations. Default `1e-6`.

maxiter

Maximum number of inner iterations per component. Default `200`.

verbose

Logical; print progress messages during component extraction. Default `FALSE`.

preproc_x, preproc_y

Optional multivarious preprocessing objects (see fit_transform). By default they pass the data through unchanged using pass().

...

Further arguments (e.g., custom stopping criteria, if implemented). They are stored in the returned object but are not used by fit_rpls().

Value

An object of class c("rpls","cross_projector","projector") with at least the elements

vx

\(p \times K\) matrix of X-loadings.

vy

\(q \times K\) matrix of Y-loadings.

ncomp

Number of components actually extracted (may be < K).

penalty

Penalty type used (`"l1"` or `"ridge"`).

preproc_x, preproc_y

Pre-processing transforms used.

...

Other parameters, such as `lambda`, `tol`, `maxiter`, `nonneg`, a `Q` indicator, and `verbose`, are also stored.

The object supports predict(), project(), transfer(), coef() and other multivarious generics.

Details

Unlike `genpls()` from genplsr.R, which handles separate row and column metrics (`Mx`, `Ax`, `My`, `Ay`) with a Gram–Schmidt orthogonalisation step, `rpls()` uses a single metric `Q` and the simpler penalised updates of Allen et al.

Method

The routine follows Algorithm 1 of Allen *et al.* (2013, *Stat. Anal. Data Min.*, 6: 302–314); see the paper for details. Briefly, each component solves $$\max_{u,v}\; v^\top Q M u - \lambda \, P(v),$$ where \(M = X^\top Y\) is the \(p \times q\) cross-product matrix, \(P\) is the chosen penalty, and \(Q = I_p\) for standard RPLS. The alternating updates are: \(u \leftarrow M^\top Q v / \|M^\top Q v\|_2\), then a penalised (possibly non-negative) regression for \(v\), normalised in the \(Q\)-norm.
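The alternating updates can be illustrated with a minimal base-R sketch for a single component. It makes several assumptions: \(Q = I_p\), an \(\ell_1\) penalty whose \(v\)-update reduces to soft-thresholding followed by rescaling to unit length, and \(M\) taken to be the \(p \times q\) cross-product matrix \(X^\top Y\). The helpers `soft_threshold()` and `rpls_component_sketch()` are hypothetical names introduced here for illustration; this is not the package's implementation and it omits deflation, non-negativity, and the general \(Q\) case.

```r
# Soft-thresholding operator: shrinks each entry toward zero by lambda.
soft_threshold <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

# One sparse component with Q = I (hypothetical helper, not part of the package).
rpls_component_sketch <- function(M, lambda, tol = 1e-6, maxiter = 200) {
  # Initialise v with the leading left singular vector of M (an assumption).
  v <- svd(M, nu = 1, nv = 0)$u[, 1]
  u <- rep(0, ncol(M))
  for (iter in seq_len(maxiter)) {
    v_old <- v
    # u-update: u <- M'v / ||M'v||_2
    u <- drop(crossprod(M, v))
    u <- u / sqrt(sum(u^2))
    # v-update: soft-threshold M u, then rescale to unit Euclidean length
    v <- soft_threshold(drop(M %*% u), lambda)
    v_norm <- sqrt(sum(v^2))
    if (v_norm == 0) break  # penalty too large: every loading shrunk to zero
    v <- v / v_norm
    # Stop when v (a unit vector) has essentially stopped moving.
    if (sqrt(sum((v - v_old)^2)) < tol) break
  }
  list(u = u, v = v, iterations = iter)
}

# Usage on simulated data
set.seed(1)
X <- matrix(rnorm(50 * 20), 50, 20)
Y <- X[, 1:5] %*% matrix(rnorm(5 * 3), 5, 3) + matrix(rnorm(50 * 3), 50, 3)
M <- crossprod(X, Y)  # p x q cross-product matrix
comp <- rpls_component_sketch(M, lambda = 0.1)
```

Larger values of `lambda` zero out more entries of `comp$v`; if every entry is thresholded away the sketch stops early and returns a zero vector, which is one way a fit can end with fewer than `K` components.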

References

Allen, G. I., Peterson, C., Vannucci, M., & Maletić-Savatić, M. (2013). *Regularized Partial Least Squares with an Application to NMR Spectroscopy.* **Statistical Analysis and Data Mining, 6(4)**, 302-314. DOI:10.1002/sam.11169.

Examples

# Generate sample data
set.seed(123)
n <- 50
p <- 20
q <- 10
X <- matrix(rnorm(n * p), n, p)
Y <- X[, 1:5] %*% matrix(rnorm(5 * q), 5, q) + matrix(rnorm(n * q), n, q)

# Fit regularized PLS with L1 penalty
fit_l1 <- rpls(X, Y, K = 3, lambda = 0.1, penalty = "l1")
print(fit_l1)
#> cross projector:  rpls cross_projector projector 
#> input dim (X):  20 
#> output dim (X):  3 
#> input dim (Y):  10 
#> output dim (Y):  3 

# Fit regularized PLS with ridge penalty
fit_ridge <- rpls(X, Y, K = 3, lambda = 0.1, penalty = "ridge")
print(fit_ridge)
#> cross projector:  rpls cross_projector projector 
#> input dim (X):  20 
#> output dim (X):  3 
#> input dim (Y):  10 
#> output dim (Y):  3
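
The `Q` argument is not exercised above. The following sketch shows one way to build a valid \(p \times p\) positive-definite `Q` in base R and pass it to `rpls()`. The choice of an AR(1)-style kernel that couples neighbouring predictors, and the value of `rho`, are assumptions made for illustration only. The call to `rpls()` is guarded so that the block also runs when the package is not loaded.

```r
# Simulated data with the same dimensions as the examples above
set.seed(123)
n <- 50
p <- 20
q <- 10
X <- matrix(rnorm(n * p), n, p)
Y <- X[, 1:5] %*% matrix(rnorm(5 * q), 5, q) + matrix(rnorm(n * q), n, q)

# Illustrative metric: Q[i, j] = rho^|i - j|, so nearby predictors are coupled.
rho <- 0.5
idx <- seq_len(p)
Q <- rho^abs(outer(idx, idx, "-"))

# Check that Q is symmetric with strictly positive eigenvalues.
min_eig <- min(eigen(Q, symmetric = TRUE, only.values = TRUE)$values)
stopifnot(isSymmetric(Q), min_eig > 0)

# Fit generalised RPLS only if rpls() is available in the session.
if (exists("rpls", mode = "function")) {
  fit_q <- rpls(X, Y, K = 3, lambda = 0.1, penalty = "l1", Q = Q)
  print(fit_q)
}
```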