Nyström approximation for kernel-based decomposition (Unified Version)
nystrom_approx.RdApproximate the eigen-decomposition of a large kernel matrix K using either the standard Nyström method (Williams & Seeger, 2001) or the Double Nyström method (Lim et al., 2015, Algorithm 3).
Arguments
- X
A numeric matrix or data frame of size (N x D), where N is number of samples.
- kernel_func
A kernel function with signature
kernel_func(X, Y, ...). If NULL, defaults to a linear kernel:X %*% t(Y).- ncomp
Number of components (eigenvectors/eigenvalues) to return. Cannot exceed the number of landmarks. Default capped at
length(landmarks).- landmarks
A vector of row indices (1-based, from X) specifying the landmark points. If NULL,
nlandmarkspoints are sampled uniformly at random.- nlandmarks
The number of landmark points to sample if
landmarksis NULL. Default is 10.- preproc
A pre-processing pipeline object (e.g., from
prep()) or a pre-processing function (defaultpass()) to apply before computing the kernel.- method
Either "standard" (the classic single-stage Nyström) or "double" (the two-stage Double Nyström method).
- center
Logical. If TRUE, attempts kernel centering. Default FALSE. Note: True kernel centering (required for equivalence to Kernel PCA) is computationally expensive and not fully implemented. Setting
center=TRUEcurrently only issues a warning. For results equivalent to standard PCA, use a linear kernel and center the input dataX(e.g., viapreproc). See Details.- l
Intermediate rank for the double Nyström method. Ignored if
method="standard". Typically,l < length(landmarks)to reduce complexity.- use_RSpectra
Logical. If TRUE, use
RSpectra::svdsfor partial SVD. Recommended for large problems.- ...
Additional arguments passed to
kernel_func.
Value
A bi_projector object with class "nystrom_approx" and additional fields:
vThe eigenvectors (N x ncomp) approximating the kernel eigenbasis.
sThe scores (N x ncomp) = v * diag(sdev), analogous to principal component scores.
sdevThe square roots of the eigenvalues.
preprocThe pre-processing pipeline used.
metaA list containing parameters and intermediate results used (method, landmarks, kernel_func, etc.).
Details
The Double Nyström method introduces an intermediate step that reduces the size of the decomposition problem, potentially improving efficiency and scalability.
Kernel Centering: Standard Kernel PCA requires the kernel matrix K to be centered
in the feature space (Schölkopf et al., 1998). This implementation currently
does not perform kernel centering by default (center=FALSE) due to computational complexity.
Consequently, with non-linear kernels, the results approximate the eigen-decomposition
of the uncentered kernel matrix, and are not strictly equivalent to Kernel PCA.
If using a linear kernel, centering the input data X (e.g., using preproc=prep(center()))
yields results equivalent to standard PCA, which is often sufficient.
Standard Nyström: Uses the method from Williams & Seeger (2001), including the
sqrt(m/N) scaling for eigenvectors and N/m for eigenvalues (m landmarks, N samples).
Double Nyström: Implements Algorithm 3 from Lim et al. (2015).
References
Schölkopf, B., Smola, A., & Müller, K. R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural computation, 10(5), 1299-1319.
Williams, C. K. I., & Seeger, M. (2001). Using the Nyström Method to Speed Up Kernel Machines. In Advances in Neural Information Processing Systems 13 (pp. 682-688).
Lim, D., Jin, R., & Zhang, L. (2015). An Efficient and Accurate Nystrom Scheme for Large-Scale Data Sets. Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence (pp. 2765-2771).
Examples
set.seed(123)
# Smaller example matrix
X <- matrix(rnorm(1000*300), 1000, 300)
# Standard Nyström
res_std <- nystrom_approx(X, ncomp=5, nlandmarks=50, method="standard")
print(res_std)
#> A bi_projector object with the following properties:
#>
#> Dimensions of the weights (v) matrix:
#> Rows: 1000 Columns: 5
#>
#> Dimensions of the scores (s) matrix:
#> Rows: 1000 Columns: 5
#>
#> Length of the standard deviations (sdev) vector:
#> Length: 5
#>
#> Preprocessing information:
#> A finalized pre-processing pipeline:
#> Step 1 : pass
# Double Nyström
res_db <- nystrom_approx(X, ncomp=5, nlandmarks=50, method="double", l=20)
print(res_db)
#> A bi_projector object with the following properties:
#>
#> Dimensions of the weights (v) matrix:
#> Rows: 1000 Columns: 5
#>
#> Dimensions of the scores (s) matrix:
#> Rows: 1000 Columns: 5
#>
#> Length of the standard deviations (sdev) vector:
#> Length: 5
#>
#> Preprocessing information:
#> A finalized pre-processing pipeline:
#> Step 1 : pass
# Projection (using standard result as example)
scores_new <- project(res_std, X[1:10,])
head(scores_new)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] -0.3464966 -2.4798337 0.6040714 -0.62223962 0.5059013
#> [2,] -0.1623986 -0.4074079 -0.8435420 -1.38722424 -0.3488808
#> [3,] 0.6435716 -0.2531178 -0.7252837 0.35537372 -0.7208270
#> [4,] 0.1593062 -0.2865374 0.5394600 -0.02853512 -0.5659083
#> [5,] -1.5810568 -6.8534098 3.2928589 -0.59725008 -0.8724898
#> [6,] -0.6895022 1.3136927 1.0744290 -0.93430486 -1.5582248