| fitGrid {scPCA} | R Documentation |
Identify the Optimal Contrastive and Penalty Parameters
Description
Automatically selects the optimal contrastive parameter and L1 penalty
term for scPCA, using a clustering algorithm and the average silhouette
width as the selection criterion.
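The selection criterion can be illustrated with a minimal sketch. It is not the package's implementation: the helper avg_sil_width and the two simulated embeddings are hypothetical. It assumes that each candidate embedding is clustered with k-means and scored with silhouette() from the cluster package, and that the candidate with the largest average silhouette width is preferred.

```r
library(cluster)  # provides silhouette()

set.seed(1)
# Hypothetical 2-D embeddings: one with two well-separated groups, one without
separated <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 5, sd = 0.3), ncol = 2)
)
noise <- matrix(rnorm(200), ncol = 2)

# Hypothetical helper: average silhouette width of a k-means partition
avg_sil_width <- function(embedding, n_centers, max_iter = 10) {
  fit <- kmeans(embedding, centers = n_centers, iter.max = max_iter)
  sil <- silhouette(fit$cluster, dist(embedding))
  mean(sil[, "sil_width"])
}

scores <- c(
  separated = avg_sil_width(separated, n_centers = 2),
  noise = avg_sil_width(noise, n_centers = 2)
)

# The candidate with the largest average silhouette width is kept
best <- names(which.max(scores))
```

In an actual grid search the candidates would be the embeddings obtained for each combination of contrastive parameter and penalty term rather than simulated point clouds.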
Usage
fitGrid(
target,
target_valid = NULL,
center,
scale,
c_contrasts,
contrasts,
alg,
penalties,
n_eigen,
clust_method = c("kmeans", "pam", "hclust"),
n_centers,
max_iter = 10,
linkage_method = "complete",
clusters = NULL,
eigdecomp_tol = 1e-10,
eigdecomp_iter = 1000
)
Arguments
target |
The target (experimental) data set, in a standard format such
as a data.frame or matrix.
|
target_valid |
A holdout set of the target (experimental) data set, in
a standard format such as a data.frame or matrix. NULL
by default but used by cvSelectParams for cross-validated
selection of the contrastive and penalization parameters.
|
center |
A logical indicating whether the target and background
data sets should be centered to mean zero.
|
scale |
A logical indicating whether the target and background
data sets should be scaled to unit variance.
|
c_contrasts |
A list of contrastive covariances.
|
contrasts |
A numeric vector of the contrastive parameters used
to compute the contrastive covariances.
|
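As a rough illustration of how c_contrasts and contrasts relate, the sketch below builds one contrastive covariance matrix per contrastive parameter. It assumes the usual cPCA definition (target covariance minus the contrastive parameter times the background covariance); the simulated target and background matrices are hypothetical, and the package's own routine may differ in detail.

```r
set.seed(2)
target <- matrix(rnorm(200 * 5), ncol = 5)      # hypothetical target data
background <- matrix(rnorm(150 * 5), ncol = 5)  # hypothetical background data

contrasts <- c(0, 1, 10)

# Center both data sets to mean zero (the role of center = TRUE)
target_c <- scale(target, center = TRUE, scale = FALSE)
background_c <- scale(background, center = TRUE, scale = FALSE)

# One contrastive covariance matrix per contrastive parameter
c_contrasts <- lapply(contrasts, function(gamma) {
  cov(target_c) - gamma * cov(background_c)
})
```

With a contrastive parameter of 0 the result reduces to the ordinary target covariance, so ordinary PCA of the target data is recovered as a special case.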
alg |
A character indicating the SPCA algorithm used to sparsify
the contrastive loadings. Currently supports iterative for the
Zou et al. (2006) implementation,
var_proj for the non-randomized
Erichson et al. (2018) solution, and
rand_var_proj for the randomized
Erichson et al. (2018) result.
|
penalties |
A numeric vector of the penalty terms.
|
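To give some intuition for how a larger penalty term yields sparser loadings, the sketch below applies soft-thresholding, a standard operator associated with L1 penalties. It is an illustration only and is not the Zou et al. (2006) or Erichson et al. (2018) algorithm; the function soft_threshold and the example loadings are hypothetical.

```r
# Hypothetical helper: shrink each entry toward zero by `penalty`,
# setting entries with absolute value below `penalty` exactly to zero
soft_threshold <- function(v, penalty) {
  sign(v) * pmax(abs(v) - penalty, 0)
}

loadings <- c(-2, 0.5, 3)  # hypothetical loading vector
penalties <- c(0, 1, 5)

sparsified <- lapply(penalties, function(p) soft_threshold(loadings, p))

# Number of nonzero loadings remaining under each penalty
n_nonzero <- vapply(sparsified, function(v) sum(v != 0), integer(1))
```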
n_eigen |
A numeric indicating the number of eigenvectors to be
computed.
|
clust_method |
A character specifying the clustering method to
use for choosing the optimal contrastive parameter. Currently, this is
limited to k-means, partitioning around medoids (PAM), and
hierarchical clustering. The default is k-means clustering.
|
n_centers |
A numeric giving the number of centers to use in the
clustering algorithm.
|
max_iter |
A numeric giving the maximum number of iterations to
be used in k-means clustering, defaulting to 10.
|
linkage_method |
A character specifying the agglomerative
linkage method to be used if clust_method = "hclust". The options
are ward.D2, single, complete, average,
mcquitty, median, and centroid. The default is
complete.
|
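The way clust_method, n_centers, max_iter, and linkage_method might map onto standard R clustering routines can be sketched as follows. This is an assumption about the general approach, not the package's code: the helper cluster_labels and the simulated embedding are hypothetical, while kmeans(), hclust(), cutree(), and cluster::pam() are the standard functions.

```r
library(cluster)  # provides pam()

set.seed(3)
# Hypothetical embedding with two well-separated groups of 30 points each
embedding <- rbind(
  matrix(rnorm(60, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(60, mean = 4, sd = 0.3), ncol = 2)
)

# Hypothetical helper returning cluster labels for the chosen method
cluster_labels <- function(x,
                           clust_method = c("kmeans", "pam", "hclust"),
                           n_centers,
                           max_iter = 10,
                           linkage_method = "complete") {
  clust_method <- match.arg(clust_method)
  switch(clust_method,
    kmeans = kmeans(x, centers = n_centers, iter.max = max_iter)$cluster,
    pam = pam(x, k = n_centers)$clustering,
    hclust = cutree(hclust(dist(x), method = linkage_method), k = n_centers)
  )
}

methods <- c("kmeans", "pam", "hclust")
labels <- lapply(methods, function(m) {
  cluster_labels(embedding, clust_method = m, n_centers = 2)
})
names(labels) <- methods
```

Note that max_iter only affects k-means and linkage_method only affects hierarchical clustering, mirroring the argument descriptions above.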
clusters |
A numeric vector of cluster labels for observations in
the target data. Defaults to NULL; when supplied, the labels are used to
identify the optimal set of hyperparameters when fitting scPCA and the
automated version of cPCA.
|
eigdecomp_tol |
A numeric providing the level of precision used by
eigendecomposition calculations. Defaults to 1e-10.
|
eigdecomp_iter |
A numeric indicating the maximum number of
iterations performed by eigendecomposition calculations. Defaults to
1000.
|
Value
A list similar to that output by prcomp:
rotation - the matrix of variable loadings
x - the rotated data: the target data (centered and scaled, if
requested) multiplied by the rotation matrix
contrast - the optimal contrastive parameter
penalty - the optimal L1 penalty term
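The shape of the returned list can be illustrated with a small sketch. It assumes a single, already selected contrastive parameter and no sparsification (a penalty of 0), and uses a full eigendecomposition from base R; the simulated data and the object fit are hypothetical and only mimic the four elements listed above.

```r
set.seed(4)
target <- matrix(rnorm(100 * 4), ncol = 4)     # hypothetical target data
background <- matrix(rnorm(80 * 4), ncol = 4)  # hypothetical background data

contrast <- 1  # hypothetical selected contrastive parameter
n_eigen <- 2

# Center the target data and form the contrastive covariance matrix
target_c <- scale(target, center = TRUE, scale = FALSE)
c_contrast <- cov(target_c) - contrast * cov(background)

# Leading eigenvectors serve as (non-sparse) loadings
rotation <- eigen(c_contrast, symmetric = TRUE)$vectors[, seq_len(n_eigen)]

fit <- list(
  rotation = rotation,        # variable loadings
  x = target_c %*% rotation,  # rotated (centered) target data
  contrast = contrast,        # contrastive parameter used
  penalty = 0                 # no L1 penalty in this sketch
)
```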
References
Erichson NB, Zeng P, Manohar K, Brunton SL, Kutz JN, Aravkin AY (2018).
“Sparse Principal Component Analysis via Variable Projection.”
ArXiv, abs/1804.00341.
Zou H, Hastie T, Tibshirani R (2006).
“Sparse principal component analysis.”
Journal of computational and graphical statistics, 15(2), 265–286.
[Package scPCA version 1.6.2]