Skip to contents

Construct a sparse k-nearest-neighbor (kNN) graph quickly. For L2 distance, this uses `RcppHNSW` when available (approximate, very fast) and falls back to `Rnanoflann` (exact) otherwise.

Usage

graph_weights_fast(
  X,
  k = 15,
  weight_mode = c("self_tuned", "heat", "normalized", "binary", "euclidean", "cosine",
    "correlation"),
  type = c("normal", "mutual", "asym"),
  backend = c("auto", "hnsw", "nanoflann"),
  sigma = NULL,
  local_k = min(7L, k),
  M = 16,
  ef = 200
)

Arguments

X

A numeric matrix with rows as observations and columns as features.

k

Number of nearest neighbors (per row).

weight_mode

Weighting to convert distances into edge weights. One of `"self_tuned"`, `"heat"`, `"normalized"`, `"binary"`, `"euclidean"`, `"cosine"`, or `"correlation"`.

type

Symmetrization policy. `"normal"` returns a union graph with `max(w_ij, w_ji)`; `"mutual"` returns an intersection graph with `min(w_ij, w_ji)`; `"asym"` returns the directed kNN graph.

backend

Neighbor search backend: `"auto"` (default), `"hnsw"`, or `"nanoflann"`.

sigma

Bandwidth for `"heat"`/`"normalized"` modes. If `NULL`, a robust value is estimated from kNN distances.

local_k

Local neighborhood size used by `"self_tuned"`; defaults to `min(7, k)`.

M, ef

HNSW parameters (only used when `backend="hnsw"`).

Value

A sparse `dgCMatrix` adjacency matrix.

Details

The default weighting mode (`"self_tuned"`) uses a self-tuning heat kernel based on per-point local scale, which tends to work well across varying densities.

Provides additional neighbor search backends (HNSW, nanoflann) and self-tuning sigma estimation not available in graph_weights.

See also

graph_weights for the standard interface

Examples

X <- matrix(rnorm(200), 20, 10)
W <- graph_weights_fast(X, k = 5)