| MbkmeansParam-class {bluster} | R Documentation |
Run the mbkmeans function for mini-batch k-means clustering with the specified number of centers from within clusterRows.
This sacrifices some accuracy for speed compared to the standard k-means algorithm.
Note that this requires installation of the mbkmeans package.
MbkmeansParam(
    centers,
    batch_size = NULL,
    max_iters = 100,
    num_init = 1,
    init_fraction = NULL,
    initializer = "kmeans++",
    calc_wcss = FALSE,
    early_stop_iter = 10,
    tol = 1e-04,
    BPPARAM = SerialParam()
)

## S4 method for signature 'ANY,MbkmeansParam'
clusterRows(x, BLUSPARAM, full = FALSE)
centers |
An integer scalar specifying the number of centers. Alternatively, a function that takes the number of observations and returns the number of centers. |
batch_size, max_iters, num_init, init_fraction, initializer, calc_wcss, early_stop_iter, tol, BPPARAM |
Further arguments to pass to mbkmeans. |
x |
A numeric matrix-like object where rows represent observations and columns represent variables. |
BLUSPARAM |
A MbkmeansParam object. |
full |
Logical scalar indicating whether the full mini-batch k-means statistics should be returned. |
This class usually requires the user to specify the number of clusters beforehand. However, we can also allow the number of clusters to vary as a function of the number of observations. The latter is occasionally useful, e.g., to allow the clustering to automatically become more granular for large datasets.
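As a minimal sketch of the second option, the call below passes a function as centers so that the number of clusters depends on the number of observations. The square-root rule and the name centers_fun are illustrative assumptions, not a recommendation from the package; the sketch also assumes that the bluster and mbkmeans packages are installed.

```r
library(bluster)

# Hypothetical rule: number of centers grows with the square root of the
# number of observations. ceiling() keeps the result a whole number.
centers_fun <- function(n) ceiling(sqrt(n))

set.seed(100)  # mini-batch k-means involves random sampling
clusters <- clusterRows(iris[, 1:4], MbkmeansParam(centers = centers_fun))

# One assignment per row of the input.
length(clusters)
table(clusters)
```

With the 150 rows of iris, this rule requests 13 centers; the number of non-empty clusters actually returned may be smaller.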
To modify an existing MbkmeansParam object x, users can simply call x[[i]] to retrieve a parameter or x[[i]] <- value to replace it, where i is the name of any argument used in the constructor.
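A short sketch of this getter and setter usage follows. The variable name param and the replacement values are arbitrary choices for illustration; integer literals are used on the assumption that these parameters are stored as integers.

```r
library(bluster)

param <- MbkmeansParam(centers = 3)

# Retrieve a parameter by the name of the constructor argument.
param[["centers"]]

# Replace parameters in place; the values here are arbitrary.
param[["centers"]] <- 5L
param[["max_iters"]] <- 50L

param[["centers"]]
param[["max_iters"]]
```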
For batch_size and init_fraction, a value of NULL means that the default arguments in the mbkmeans function signature are used.
These defaults are data-dependent and so cannot be specified during construction of the MbkmeansParam object, but instead are defined within the clusterRows method.
The MbkmeansParam constructor will return a MbkmeansParam object with the specified parameters.
The clusterRows method will return a factor of length equal to nrow(x) containing the cluster assignments.
If full=TRUE, a list is returned with clusters (the factor, as above) and objects
(a list containing mbkmeans, the direct output of mbkmeans).
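The structure described above can be inspected with a sketch like the following, which assumes the bluster and mbkmeans packages are installed. The contents of the mbkmeans element depend on the mbkmeans package and are not examined here.

```r
library(bluster)

set.seed(100)  # mini-batch k-means involves random sampling
out <- clusterRows(iris[, 1:4], MbkmeansParam(centers = 3), full = TRUE)

# The factor of cluster assignments, one per row of the input.
table(out$clusters)

# The list holding the direct output of mbkmeans.
names(out$objects)
```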
Stephanie Hicks
mbkmeans from the mbkmeans package, which actually does all the heavy lifting.
KmeansParam, for dispatch to the standard k-means algorithm.
clusterRows(iris[,1:4], MbkmeansParam(centers=3))
clusterRows(iris[,1:4], MbkmeansParam(centers=3, batch_size=10))
clusterRows(iris[,1:4], MbkmeansParam(centers=3, init_fraction=0.5))