GaussianMixture
class pyspark.mllib.clustering.GaussianMixture
- Learning algorithm for Gaussian Mixtures using the expectation-maximization algorithm.
- New in version 1.3.0.
- Methods
- train(rdd, k[, convergenceTol, …]): Train a Gaussian Mixture clustering model.
- Methods Documentation
classmethod train(rdd: pyspark.rdd.RDD[VectorLike], k: int, convergenceTol: float = 0.001, maxIterations: int = 100, seed: Optional[int] = None, initialModel: Optional[pyspark.mllib.clustering.GaussianMixtureModel] = None) → pyspark.mllib.clustering.GaussianMixtureModel
- Train a Gaussian Mixture clustering model.
- New in version 1.3.0.
- Parameters
- rdd : pyspark.RDD
- Training points as an RDD of pyspark.mllib.linalg.Vector or convertible sequence types.
- k : int
- Number of independent Gaussians in the mixture model.
- convergenceTol : float, optional
- Maximum change in log-likelihood at which convergence is considered to have occurred. (default: 1e-3)
- maxIterations : int, optional
- Maximum number of iterations allowed. (default: 100)
- seed : int, optional
- Random seed for the initial Gaussian distribution. Set to None to generate a seed based on the system time. (default: None)
- initialModel : GaussianMixtureModel, optional
- Initial GMM starting point, bypassing the random initialization. (default: None)
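To make the roles of the parameters above more concrete, here is a minimal pure-Python sketch of expectation-maximization on one-dimensional data. It is not Spark's implementation and does not use Spark at all. The function name `train_1d`, its `initial_means` argument (a simplified stand-in for `initialModel`), the shared starting variance, and the numeric floors are all assumptions made for illustration. The sketch only aims to show how `k`, the convergence tolerance, the iteration cap, and the seed interact.

```python
import math
import random


def train_1d(points, k, convergence_tol=1e-3, max_iterations=100,
             seed=None, initial_means=None):
    """Fit a 1-D Gaussian mixture with EM (illustrative; not Spark's code)."""
    rng = random.Random(seed)  # seed=None lets Python pick a system-derived seed
    n = len(points)
    # Start from the given means, or from k randomly chosen data points.
    means = list(initial_means) if initial_means is not None else rng.sample(points, k)
    overall_mean = sum(points) / n
    overall_var = sum((x - overall_mean) ** 2 for x in points) / n
    variances = [max(overall_var, 1e-6)] * k
    weights = [1.0 / k] * k

    prev_ll = None
    iterations = 0
    for iterations in range(1, max_iterations + 1):
        # E-step: per-point responsibilities and the total log-likelihood.
        resp, ll = [], 0.0
        for x in points:
            dens = [w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = max(sum(dens), 1e-300)  # guard against log(0) on underflow
            ll += math.log(total)
            resp.append([d / total for d in dens])

        # Stop once the log-likelihood changes by less than the tolerance.
        if prev_ll is not None and abs(ll - prev_ll) < convergence_tol:
            break
        prev_ll = ll

        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, points)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, points)) / nj
            variances[j] = max(var, 1e-6)  # keep variances strictly positive

    return weights, means, variances, iterations


# Two well-separated groups of points, fitted with k=2 and a fixed seed.
points = [-0.5, -0.2, 0.0, 0.1, 0.4, 9.6, 9.9, 10.0, 10.2, 10.5]
weights, means, variances, iterations = train_1d(points, k=2, seed=42)
```

In Spark itself, the corresponding call would follow the signature documented above, for example `GaussianMixture.train(rdd, k=2, convergenceTol=1e-3, maxIterations=100, seed=42)`, where `rdd` holds the training vectors.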
 