EMclust                package:mclust                R Documentation

_B_I_C _f_o_r _M_o_d_e_l-_B_a_s_e_d _C_l_u_s_t_e_r_i_n_g

_D_e_s_c_r_i_p_t_i_o_n:

     BIC for EM initialized by hierarchical clustering for
     parameterized Gaussian mixture models.

_U_s_a_g_e:

     EMclust(data, G, emModelNames, hcPairs, subset, eps, tol, itmax, equalPro,
             warnSingular, ...)

_A_r_g_u_m_e_n_t_s:

    data: A numeric vector, matrix, or data frame of observations.
          Categorical variables are not allowed. If a matrix or data
          frame, rows correspond to observations and columns correspond
          to variables.  

       G: An integer vector specifying the numbers of mixture
          components (clusters) for which the BIC is to be calculated.
          The default is '1:9'.  

emModelNames: A vector of character strings indicating the models to be
          fitted  in the EM phase of clustering. Possible models: 

           "E": spherical, equal variance (one-dimensional)
           "V": spherical, variable variance (one-dimensional)
           "EII": spherical, equal volume 
           "VII": spherical, unequal volume 
           "EEI": diagonal, equal volume, equal shape 
           "VEI": diagonal, varying volume, equal shape 
           "EVI": diagonal, equal volume, varying shape 
           "VVI": diagonal, varying volume, varying shape 
           "EEE": ellipsoidal, equal volume, shape, and orientation 
           "EEV": ellipsoidal, equal volume and shape, varying orientation
           "VEV": ellipsoidal, equal shape, varying volume and orientation
           "VVV": ellipsoidal, varying volume, shape, and orientation 

           The default is '.Mclust$emModelNames'.
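           The volume, shape, and orientation vocabulary used in the
           model names can be made concrete with a small base R sketch.
           It assumes the decomposition Sigma = lambda * D A t(D) from the
           references, where lambda is the volume, A is a diagonal shape
           matrix with determinant 1, and D is an orthogonal orientation
           matrix. The helper 'makeSigma' is hypothetical and is not part
           of mclust.

```r
# Hypothetical helper: build a 2-D covariance matrix from
# volume (lambda), shape (relative axis lengths), and orientation (angle).
makeSigma <- function(lambda, shape, angle) {
  # Normalise the shape so that det(A) = 1.
  A <- diag(shape / prod(shape)^(1 / length(shape)))
  # Rotation matrix giving the orientation of the principal axes.
  D <- matrix(c(cos(angle), sin(angle), -sin(angle), cos(angle)), 2, 2)
  lambda * D %*% A %*% t(D)
}

sigmaSpherical <- makeSigma(2, c(1, 1), 0)       # spherical: multiple of the identity
sigmaDiagonal  <- makeSigma(2, c(4, 1), 0)       # diagonal: axis-aligned ellipse
sigmaEllipsoid <- makeSigma(2, c(4, 1), pi / 6)  # ellipsoidal: rotated ellipse
```

           In this sketch an "E" in a given position of a model name
           would correspond to sharing that quantity across components,
           and a "V" to letting it differ from component to component.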

 hcPairs: A matrix of merge pairs for hierarchical clustering such as
          produced by function 'hc'. The default is to compute a
          hierarchical clustering tree by applying function 'hc' with
          'modelName = .Mclust$hcModelName[1]' to univariate data and
          'modelName = .Mclust$hcModelName[2]' to multivariate data or
          a subset as indicated by the 'subset' argument. The
          hierarchical clustering results are used as starting values
          for EM.   

  subset: A logical or numeric vector specifying the indices of a
          subset of the data to be used in the initial hierarchical
          clustering phase. 

     eps: A scalar tolerance for deciding when to terminate
          computations due to computational singularity in covariances.
          Smaller values of 'eps' allow computations to proceed nearer
          to singularity. The default is '.Mclust$eps'.

     tol: A scalar tolerance for relative convergence of the
          loglikelihood.  The default is '.Mclust$tol'.

   itmax: An integer limit on the number of EM iterations.  The default
          is '.Mclust$itmax'.
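          How 'tol' and 'itmax' interact can be illustrated with a toy EM
          loop in base R for a two-component univariate mixture. This is
          only a sketch: the stopping rule (change in loglikelihood
          relative to 1 + |loglik|), the starting values, and the values
          chosen for 'tol' and 'itmax' are assumptions made for
          illustration, not the ones used inside 'EMclust'.

```r
# Toy EM for a two-component univariate Gaussian mixture (unequal variances).
set.seed(1)
x <- c(rnorm(100, mean = 0, sd = 1), rnorm(100, mean = 5, sd = 1))

tol   <- 1e-5   # relative loglikelihood tolerance (illustrative value)
itmax <- 200    # cap on the number of EM iterations (illustrative value)

# Crude starting values; EMclust would start from hierarchical clustering.
pro  <- c(0.5, 0.5)
mu   <- as.numeric(quantile(x, c(0.25, 0.75)))
sdev <- rep(sd(x), 2)

loglik <- -Inf
for (iter in seq_len(itmax)) {
  # E step: component densities weighted by the mixing proportions.
  dens <- cbind(pro[1] * dnorm(x, mu[1], sdev[1]),
                pro[2] * dnorm(x, mu[2], sdev[2]))
  newLoglik <- sum(log(rowSums(dens)))
  z <- dens / rowSums(dens)

  # M step: weighted proportions, means, and standard deviations.
  nk   <- colSums(z)
  pro  <- nk / length(x)
  mu   <- colSums(z * x) / nk
  sdev <- sqrt(colSums(z * outer(x, mu, "-")^2) / nk)

  # Stop once the relative change in loglikelihood drops below tol.
  converged <- is.finite(loglik) &&
    abs(newLoglik - loglik) <= tol * (1 + abs(newLoglik))
  loglik <- newLoglik
  if (converged) break
}
```

          If the loop ends with 'iter' equal to 'itmax', the iteration
          limit was reached before the tolerance was met.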

equalPro: Logical variable indicating whether or not the mixing
          proportions are equal in the model. The default is
          '.Mclust$equalPro'.

warnSingular: A logical value indicating whether or not a warning
          should be issued whenever a singularity is encountered. The
          default is 'warnSingular=FALSE'. 

    ... : Provided so that lists containing elements other than the
          arguments listed above can be passed in indirect or list
          calls with 'do.call'.
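          The role of '...' in list calls can be seen with a small base R
          sketch. 'fitStub' and 'noDots' are hypothetical stand-ins that
          only mimic the shape of the 'EMclust' signature; they do no
          clustering.

```r
# Hypothetical stand-in that accepts extra list elements through '...'.
fitStub <- function(data, G = 1:9, ...) {
  list(n = NROW(data), G = G)
}

# Same signature without '...'; extra list elements cause an error.
noDots <- function(data, G = 1:9) {
  list(n = NROW(data), G = G)
}

# A settings list carrying an element that is not a formal argument.
settings <- list(data = iris[, 1:4], G = 1:3, note = "not a formal argument")

res    <- do.call("fitStub", settings)   # succeeds: 'note' is absorbed by '...'
failed <- inherits(try(do.call("noDots", settings), silent = TRUE), "try-error")
```

          With '...' in the signature, the same settings list can be
          reused across several functions that each pick out the
          elements they need.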

_V_a_l_u_e:

     Bayesian Information Criterion for the specified mixture models
     and numbers of clusters. Auxiliary information is returned as
     attributes.
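
     As a rough illustration of what a single BIC entry measures, the
     sketch below computes the BIC of a one-component univariate
     Gaussian in base R. It assumes the sign convention
     2 * loglik - npar * log(n) used in the references, under which
     larger values indicate stronger evidence for a model (the opposite
     sign to 'stats::BIC'). The simulated data and variable names are
     made up for the example.

```r
set.seed(42)
x <- rnorm(50, mean = 10, sd = 2)   # simulated data, for illustration only
n <- length(x)

# Maximum likelihood estimates for a single Gaussian component.
muHat    <- mean(x)
sigmaHat <- sqrt(mean((x - muHat)^2))

# Maximized loglikelihood and number of free parameters (mean, variance).
loglik <- sum(dnorm(x, mean = muHat, sd = sigmaHat, log = TRUE))
npar   <- 2

# BIC under the assumed "larger is better" convention.
bic <- 2 * loglik - npar * log(n)
```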

_R_e_f_e_r_e_n_c_e_s:

     C. Fraley and A. E. Raftery (2002a). Model-based clustering,
     discriminant analysis, and density estimation. _Journal of the
     American Statistical Association 97:611-631_.  See <URL:
     http://www.stat.washington.edu/mclust>.

     C. Fraley and A. E. Raftery (2002b). MCLUST: Software for
     model-based clustering, density estimation and discriminant
     analysis.  Technical Report, Department of Statistics, University
     of Washington.  See <URL: http://www.stat.washington.edu/mclust>.

_S_e_e _A_l_s_o:

     'summary.EMclust', 'EMclustN', 'hc', 'me', 'mclustOptions'

_E_x_a_m_p_l_e_s:

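     data(iris)
     irisMatrix <- as.matrix(iris[,1:4])

     # BIC for the default numbers of components and models
     irisBic <- EMclust(irisMatrix)
     irisBic
     plot(irisBic)

     # Initialize EM from hierarchical clustering of a random subset of 100 rows
     irisBic <- EMclust(irisMatrix, subset = sample(1:nrow(irisMatrix), 100))
     irisBic
     plot(irisBic)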

