| skKMeans {BiocSklearn} | R Documentation |
interface to sklearn.cluster.KMeans with attention to direct work with HDF5
skKMeans(mat, ...)
mat |
a matrix-like datum or reference to such |
... |
arguments to sklearn.cluster.KMeans |
You can use 'py_help(SklearnEls()$skcl$KMeans)' to get python documentation on parameters and return structure.
# start with numpy array reference as data
irloc = system.file("csv/iris.csv", package="BiocSklearn")
skels = SklearnEls()
irismat = skels$np$genfromtxt(irloc, delimiter=',')
ans = skKMeans(irismat, n_clusters=2L)
names(ans) # names of available result components
table(iris$Species, ans$labels_)
# now use an HDF5 reference
irh5 = system.file("hdf5/irmat.h5", package="BiocSklearn")
fref = skels$h5py$File(irh5)
ds = fref$`__getitem__`("quants") # thanks Samuela Pollack!
ans2 = skKMeans(skels$np$array(ds)$T, n_clusters=2L) # HDF5 matrix is transposed relative to python array layout! Is the np$array conversion unduly costly?
table(ans$labels_, ans2$labels_)
ans3 = skKMeans(skels$np$array(ds)$T,
n_clusters=8L, max_iter=200L,
algorithm="full", random_state=20L)