Mclust                package:mclust                R Documentation

_M_o_d_e_l-_B_a_s_e_d _C_l_u_s_t_e_r_i_n_g

_D_e_s_c_r_i_p_t_i_o_n:

     Clustering via EM initialized by hierarchical clustering for
     parameterized Gaussian mixture models. The number of clusters and
     the clustering model are chosen to maximize the BIC.

_U_s_a_g_e:

     Mclust(data, minG, maxG)

_A_r_g_u_m_e_n_t_s:

    data: A numeric vector, matrix, or data frame of observations.
          Categorical variables are not allowed. If a matrix or data
          frame, rows correspond to observations and columns correspond
          to variables. 

    minG: An integer vector specifying the minimum number of mixture
          components (clusters) to be considered. The default is '1'
          component. 

    maxG: An integer vector specifying the maximum number of mixture
          components (clusters) to be considered. The default is '9'
          components. 

_V_a_l_u_e:

     A list representing the best model (according to BIC) for the
     given range of numbers of clusters. The following components are
     included:  

     BIC: A matrix giving the BIC value for each model (rows) and
          number of clusters (columns). 

     bic: A scalar giving the optimal BIC value. 

modelName: The MCLUST name for the best model according to BIC.  

classification: The classification corresponding to the optimal BIC
          value.  

uncertainty: The uncertainty in the classification corresponding to
          the optimal BIC value. 

      mu: For multidimensional models, a matrix whose columns are the
          means of each group in the best model. For one-dimensional
          models, a vector whose entries are the means for each group
          in the best model.  

   sigma: For multidimensional models, a three dimensional array in
          which 'sigma[,,k]' gives the covariance for the kth group
          in the best model. For one-dimensional models, either a
          scalar giving a common variance for the groups or a vector
          whose entries are the variances for each group in the best
          model. 

     pro: The mixing probabilities for each component in the best
          model. 

       z: A matrix whose [i,k]th entry is the probability that
          observation i belongs to the kth component in the model.
          The optimal classification is derived from this, choosing the
          class to be the one giving the maximum probability.

  loglik: The log likelihood for the data under the best model. 
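
     The relationship between 'BIC', 'bic' and 'modelName', and
     between 'z' and 'classification', can be sketched in base R. The
     matrices below are made-up illustrative values, not output from
     'Mclust', and the 'uncertainty' line assumes the usual mclust
     convention of one minus the largest membership probability, which
     this page does not state explicitly.

```r
# Hypothetical BIC matrix laid out as this page describes:
# rows are models, columns are numbers of clusters.
BIC <- matrix(c(-830, -790, -805,
                -830, -760, -770),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("EII", "VII"), c("1", "2", "3")))

# Locate the largest BIC value and read off the model and cluster count.
best <- which(BIC == max(BIC, na.rm = TRUE), arr.ind = TRUE)[1, ]
bic <- unname(BIC[best["row"], best["col"]])
modelName <- rownames(BIC)[best["row"]]
G <- as.integer(colnames(BIC)[best["col"]])

# Hypothetical membership matrix: 3 observations (rows), 2 components (columns).
z <- matrix(c(0.90, 0.10,
              0.20, 0.80,
              0.55, 0.45),
            ncol = 2, byrow = TRUE)

# Each observation is assigned to the component with the largest probability.
classification <- apply(z, 1, which.max)

# Assumed definition: uncertainty is one minus that largest probability.
uncertainty <- 1 - apply(z, 1, max)
```

     The third observation is assigned to component 1 but carries the
     largest uncertainty, because its two membership probabilities are
     nearly equal.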

_D_e_t_a_i_l_s:

     The following models are compared in 'Mclust': 

      "E": spherical, equal variance (one-dimensional) 
      "V": spherical, variable variance (one-dimensional) 

      "EII": spherical, equal volume 
      "VII": spherical, unequal volume 
      "EEI": diagonal, equal volume, equal shape 
      "VVI": diagonal, varying volume, varying shape 
      "EEE": ellipsoidal, equal volume, shape, and orientation 
      "VVV": ellipsoidal, varying volume, shape, and orientation 
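
     The volume, shape and orientation vocabulary in these names
     follows the eigenvalue decomposition of a group covariance,
     Sigma_k = lambda_k * D_k %*% A_k %*% t(D_k), with lambda_k the
     volume, A_k a diagonal shape matrix of determinant 1, and D_k an
     orthogonal orientation matrix. The base R sketch below builds one
     two-dimensional covariance of each geometric type. 'make_sigma'
     is a hypothetical helper written for illustration; it is not part
     of mclust, and the numeric values are arbitrary.

```r
# Hypothetical helper (not an mclust function): assemble a covariance matrix
# from a volume (lambda), a shape vector, and an orientation matrix D.
make_sigma <- function(lambda, shape, D = diag(length(shape))) {
  d <- length(shape)
  A <- diag(shape / prod(shape)^(1 / d))  # rescale the shape so det(A) = 1
  lambda * D %*% A %*% t(D)
}

# Rotation by 30 degrees, used as the orientation of the ellipsoidal case.
theta <- pi / 6
rot <- matrix(c(cos(theta), sin(theta),
                -sin(theta), cos(theta)), nrow = 2)

spherical   <- make_sigma(2, c(1, 1))       # as in "EII"/"VII": multiple of the identity
diagonal    <- make_sigma(2, c(4, 1))       # as in "EEI"/"VVI": axes aligned with coordinates
ellipsoidal <- make_sigma(2, c(4, 1), rot)  # as in "EEE"/"VVV": rotated axes
```

     In the "E.." models the corresponding quantity is shared by all
     groups, while in the "V.." models it may differ from group to
     group.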

     'Mclust' is intended to combine 'EMclust' and its 'summary' in a
     simplified one-step model-based clustering function. 'EMclust'
     and its 'summary' provide more flexibility, including a choice
     of models.

_R_e_f_e_r_e_n_c_e_s:

     C. Fraley and A. E. Raftery (2002a). Model-based clustering,
     discriminant analysis, and density estimation. _Journal of the
     American Statistical Association 97:611-631_.  See <URL:
     http://www.stat.washington.edu/mclust>.

     C. Fraley and A. E. Raftery (2002b). MCLUST: Software for
     model-based clustering, density estimation and discriminant
     analysis.  Technical Report, Department of Statistics, University
     of Washington.  See <URL: http://www.stat.washington.edu/mclust>.

_S_e_e _A_l_s_o:

     'plot.Mclust', 'EMclust'

_E_x_a_m_p_l_e_s:

     data(iris)
     irisMatrix <- as.matrix(iris[,1:4])
     irisClass <- iris[,5]
     irisMclust <- Mclust(irisMatrix)

     ## Not run: plot(irisMclust,irisMatrix)
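
     A possible follow-up to the example above inspects the fitted
     object through the components listed under Value. This is a
     sketch that assumes the mclust package is installed and that the
     installed version returns 'modelName', 'bic' and 'classification'
     as documented here; it is skipped when the package is missing.

```r
# Assumes the mclust package is installed; the block is skipped otherwise.
if (requireNamespace("mclust", quietly = TRUE)) {
  library(mclust)

  irisMatrix <- as.matrix(iris[, 1:4])
  irisClass <- iris[, 5]
  irisMclust <- Mclust(irisMatrix)

  print(irisMclust$modelName)  # name of the model selected by BIC
  print(irisMclust$bic)        # optimal BIC value

  # Cross-tabulate the fitted clusters against the known species labels.
  print(table(irisMclust$classification, irisClass))
}
```

     The cross-tabulation shows how the clusters found without labels
     line up with the species recorded in 'iris'.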

