Implements the algorithm of Allen *et al.* (2013) for supervised dimension-reduction with optional sparsity (\(\ell_1\)) or ridge (\(\ell_2\)) penalties **and** the generalised extension that operates in a user-supplied quadratic form \(Q\).
Usage
fit_rpls(
X,
Y,
K = 2,
lambda = 0.1,
penalty = c("l1", "ridge"),
Q = NULL,
nonneg = FALSE,
tol = 1e-06,
maxiter = 200,
verbose = FALSE
)
rpls(
X,
Y,
K = 2,
lambda = 0.1,
penalty = c("l1", "ridge"),
Q = NULL,
nonneg = FALSE,
preproc_x = multivarious::pass(),
preproc_y = multivarious::pass(),
tol = 1e-06,
maxiter = 200,
verbose = FALSE,
...
)Arguments
- X
Numeric matrix \((n \times p)\) — predictors.
- Y
Numeric matrix \((n \times q)\) — responses.
- K
Integer, number of latent factors to extract. Default `2`.
- lambda
Scalar or length-
Knumeric vector of penalties.- penalty
Either `"l1"` (lasso) or `"ridge"`.
- Q
Optional positive-(semi)definite \(p \times p\) matrix inducing *generalised* PLS. `NULL` ⇒ identity.
- nonneg
Logical, force non-negative loadings when
penalty = "l1". Note: This option is currently ignored whenpenalty = "ridge".- tol
Relative tolerance for the inner iterations convergence check. Default `1e-6`.
- maxiter
Maximum number of inner iterations per component. Default `200`.
- verbose
Logical; print progress messages during component extraction. Default `FALSE`.
- preproc_x, preproc_y
Optional multivarious preprocessing objects (see
fit_transform). By default they pass the data through unchanged usingpass().- ...
Further arguments (e.g., custom stopping criteria if implemented) are stored in the returned object (they are not used by
fit_rpls).
Value
An object of class c("rpls","cross_projector","projector")
with at least the elements
- vx
\(p \times K\) matrix of X-loadings.
- vy
\(q \times K\) matrix of Y-loadings.
- ncomp
Number of components actually extracted (may be < K).
- penalty
Penalty type used (`"l1"` or `"ridge"`).
- preproc_x, preproc_y
Pre-processing transforms used.
- ...
Other parameters like `lambda`, `tol`, `maxiter`, `nonneg`, `Q` indicator, `verbose` are also stored.
The object supports predict(), project(),
transfer(), coef() and other multivarious generics.
Details
Unlike `genpls()` from genplsr.R, which handles separate row and
column metrics (`Mx`, `Ax`, `My`, `Ay`) with a Gram–Schmidt orthogonalisation
step, `rpls()` uses a single metric `Q` and the simpler penalised updates of
Allen et al.
Method
The routine follows Algorithm 1 of Allen *et al.* (2013, *Stat. Anal. Data Min.*, 6 : 302–314) — see the paper for details. Briefly, each component maximises $$\max_{u,v}\; v^\top Q M u - \lambda \, P(v)$$ with \(Q = I_p\) for standard RPLS. The alternating updates are: \(u \leftarrow M^\top Q v / \|M^\top Q v\|_2\), then a penalised (possibly non-negative) regression for \(v\), normalised in the \(Q\)-norm.
References
Allen, G. I., Peterson, C., Vannucci, M., & Maletić-Savatić, M. (2013). *Regularized Partial Least Squares with an Application to NMR Spectroscopy.* **Statistical Analysis and Data Mining, 6(4)**, 302-314. DOI:10.1002/sam.11169.
Examples
# Generate sample data
set.seed(123)
n <- 50
p <- 20
q <- 10
X <- matrix(rnorm(n * p), n, p)
Y <- X[, 1:5] %*% matrix(rnorm(5 * q), 5, q) + matrix(rnorm(n * q), n, q)
# Fit regularized PLS with L1 penalty
fit_l1 <- rpls(X, Y, K = 3, lambda = 0.1, penalty = "l1")
print(fit_l1)
#> cross projector: rpls cross_projector projector
#> input dim (X): 20
#> output dim (X): 3
#> input dim (Y): 10
#> output dim (Y): 3
# Fit regularized PLS with ridge penalty
fit_ridge <- rpls(X, Y, K = 3, lambda = 0.1, penalty = "ridge")
print(fit_ridge)
#> cross projector: rpls cross_projector projector
#> input dim (X): 20
#> output dim (X): 3
#> input dim (Y): 10
#> output dim (Y): 3