Build expected-domain features from a soft alignment matrix
Source: R/feature_sets.R
expected_features.Rd

Given encoding-domain predictors and a recall->encoding alignment posterior, compute expected recall-domain predictors: $$X_{rec} = \Gamma X_{enc}.$$
Arguments
- train
feature_sets object for encoding-domain predictors.
- gamma
Numeric matrix of shape (T_rec x T_enc), or (T_rec x (T_enc+1)) if a NULL column is present.
- drop_null
Logical; if TRUE and gamma has T_enc+1 columns, drop the first (NULL) column.
- renormalize
Logical; if TRUE, renormalize rows to sum to 1 after dropping the NULL column.
- eps
Small constant to avoid division by zero in renormalization.
Details
This is the core "soft label" trick for bringing recall data into the regression when recall TRs do not have a known one-to-one correspondence with encoding TRs.
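As a toy shape check (variable names here are illustrative, not the package's internals): with a uniform alignment posterior, the product maps T_enc encoding rows onto T_rec recall rows while keeping the predictor dimension p.

```r
T_rec <- 4; T_enc <- 6; p <- 3
Gamma <- matrix(1 / T_enc, T_rec, T_enc)   # uniform alignment posterior, rows sum to 1
X_enc <- matrix(rnorm(T_enc * p), T_enc, p)
X_rec <- Gamma %*% X_enc                   # expected recall-domain predictors, (T_rec x p)
dim(X_rec)
```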
Gamma shapes. `gamma` should be a numeric matrix where rows index recall TRs and columns index encoding TRs:
- without a NULL state: (T_rec x T_enc)
- with a NULL state in the first column: (T_rec x (T_enc+1))
When a NULL column is present and `drop_null = TRUE`, the NULL column is dropped.
If `renormalize = FALSE` (default), the remaining row mass is stored as
row_weights (so uncertain TRs with high NULL probability can be
down-weighted by downstream models). If `renormalize = TRUE`, rows are
renormalized to sum to 1 and `row_weights` is set to 1.
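The NULL-column and renormalization logic above can be sketched in base R. This is an illustrative standalone function, not the package's implementation; `expected_features_sketch` and its internals are assumed names. Note that dividing a matrix by a length-`nrow` vector in R scales each row, which is exactly the row renormalization described.

```r
# Illustrative sketch of the expected-features computation X_rec = Gamma %*% X_enc,
# where `gamma` may carry a NULL state in its first column.
expected_features_sketch <- function(X_enc, gamma,
                                     drop_null = TRUE,
                                     renormalize = FALSE,
                                     eps = 1e-8) {
  if (drop_null && ncol(gamma) == nrow(X_enc) + 1) {
    gamma <- gamma[, -1, drop = FALSE]       # drop the NULL column
  }
  row_mass <- rowSums(gamma)                 # remaining alignment mass per recall TR
  if (renormalize) {
    gamma <- gamma / pmax(row_mass, eps)     # rows sum to 1 again; eps guards empty rows
    row_weights <- rep(1, nrow(gamma))
  } else {
    row_weights <- row_mass                  # keep mass as downstream observation weights
  }
  list(X_rec = gamma %*% X_enc, row_weights = row_weights)
}
```

With `renormalize = FALSE`, a recall TR that placed half its posterior on the NULL state ends up with `row_weights = 0.5`, so a downstream weighted regression can down-weight it rather than trusting a rescaled row.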