hcE                  package:mclust                  R Documentation

_M_o_d_e_l-_b_a_s_e_d _H_i_e_r_a_r_c_h_i_c_a_l _C_l_u_s_t_e_r_i_n_g

_D_e_s_c_r_i_p_t_i_o_n:

     Agglomerative hierarchical clustering based on maximum likelihood
     for a multivariate normal (MVN) mixture model parameterized by
     eigenvalue decomposition.

_U_s_a_g_e:

     hcE(data, partition, minclus = 1, ...)
     hcV(data, partition, minclus = 1, alpha = 1, ...)
     hcEII(data, partition, minclus = 1, ...)
     hcVII(data, partition, minclus = 1, alpha = 1, ...)
     hcEEE(data, partition, minclus = 1, ...)
     hcVVV(data, partition, minclus = 1, alpha = 1, beta = 1, ...)

_A_r_g_u_m_e_n_t_s:

    data: A numeric vector, matrix, or data frame of observations.
          Categorical variables are not allowed. If a matrix or data
          frame, rows correspond to observations and columns correspond
          to variables.  

partition: A numeric or character vector representing a partition of
          observations (rows) of 'data'. If provided, group merges will
          start with this partition. Otherwise, each observation is
          assumed to be in a cluster by itself at the start of
          agglomeration.  

 minclus: A number indicating the number of clusters at which to stop
          the agglomeration. The default is to stop when all
          observations have been merged into a single cluster. 

alpha, beta: Additional tuning parameters needed for initialization in
          some models.  For details, see Fraley (1998). The defaults
          provided are usually adequate. 

     ...: Catch unused arguments from a 'do.call' call. 
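
     The 'partition' and 'minclus' arguments can be combined as in the
     sketch below, which starts agglomeration from a coarse grouping
     and stops at three clusters. The k-means starting partition, the
     seed, and the choice of 10 groups are assumptions made for this
     illustration only; the call is skipped if mclust is not installed.

```r
# A sketch, assuming the mclust package is installed; skipped otherwise.
data(iris)
irisMatrix <- as.matrix(iris[, 1:4])

# Illustrative starting partition: 10 k-means groups (an assumption of
# this example; any valid partition of the rows could be used instead).
set.seed(1)
init <- kmeans(irisMatrix, centers = 10, nstart = 5)$cluster

if (requireNamespace("mclust", quietly = TRUE)) {
  # Merge the starting groups under the VVV model, stopping at 3 clusters.
  hcTree <- mclust::hcVVV(data = irisMatrix, partition = init, minclus = 3)
  print(dim(hcTree))
}
```

     Starting from a coarse partition also reduces the number of
     initial groups, which is the quantity that drives memory usage.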

_D_e_t_a_i_l_s:

     For fast execution, most models use memory on the order of the
     square of the number of groups in the initial partition. Some
     models, such as equal variance or '"EEE"', do not admit a fast
     algorithm under the usual agglomerative hierarchical clustering
     paradigm.  These use less memory but are much slower to execute.
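
     As a rough illustration of quadratic memory growth, the base R
     sketch below estimates the storage needed for one value per pair
     of initial groups. The helper name 'approxPairwiseBytes' and the
     figure of 8 bytes per pair are assumptions for illustration, not
     a description of the package's internal storage.

```r
# Hypothetical helper (not part of mclust): order-of-magnitude estimate only.
# Assumes one 8-byte double per ordered pair of initial groups.
approxPairwiseBytes <- function(nGroups) {
  nGroups^2 * 8
}

# Estimated memory in MiB for a few initial partition sizes.
sizes <- c(150, 1000, 10000)
data.frame(groups = sizes, MiB = approxPairwiseBytes(sizes) / 1024^2)
```

     Under these assumptions, halving the number of initial groups
     cuts the estimate to a quarter.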

_V_a_l_u_e:

     A numeric two-column matrix in which the _i_th row gives the
     minimum index for observations in each of the two clusters merged
     at the _i_th stage of agglomerative hierarchical clustering.
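
     To make the merge encoding concrete, the base R sketch below
     replays a small hand-made merge matrix in the format described
     above and recovers cluster labels. The function 'replayMerges'
     is a hypothetical helper, not part of mclust, and it assumes that
     agglomeration started with each observation in its own cluster;
     in practice 'hclass' performs this conversion.

```r
# Hypothetical helper (not part of mclust): replay the first n - G merges.
# Each cluster is labelled by the minimum index of its observations.
replayMerges <- function(merges, n, G) {
  labels <- seq_len(n)                 # every observation starts alone
  for (i in seq_len(n - G)) {
    keep <- min(merges[i, ])           # label of the merged cluster
    drop <- max(merges[i, ])           # label that disappears
    labels[labels == drop] <- keep
  }
  labels
}

# Hand-made example with 4 observations: merge {1},{2}; then {3},{4};
# then {1,2},{3,4}.
merges <- rbind(c(1, 2), c(3, 4), c(1, 3))
replayMerges(merges, n = 4, G = 2)     # labels 1 1 3 3
```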

_R_e_f_e_r_e_n_c_e_s:

     J. D. Banfield and A. E. Raftery (1993). Model-based Gaussian and
     non-Gaussian Clustering. _Biometrics 49:803-821_. 

     C. Fraley (1998). Algorithms for model-based Gaussian hierarchical
     clustering. _SIAM Journal on Scientific Computing 20:270-281_. 
     See <URL: http://www.stat.washington.edu/mclust>. 

     C. Fraley and A. E. Raftery (2002). Model-based clustering,
     discriminant analysis, and density estimation. _Journal of the
     American Statistical Association 97:611-631_.  See <URL:
     http://www.stat.washington.edu/mclust>. 

     C. Fraley and A. E. Raftery (2002). MCLUST: Software for
     model-based clustering, density estimation and discriminant
     analysis.  Technical Report, Department of Statistics, University
     of Washington.  See <URL: http://www.stat.washington.edu/mclust>.

_S_e_e _A_l_s_o:

     'hc', 'hclass'

_E_x_a_m_p_l_e_s:

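     library(mclust)
     data(iris)
     irisMatrix <- as.matrix(iris[, 1:4])

     # Agglomerate under the EII model, then extract the 2- and
     # 3-cluster classifications from the merge tree.
     hcTree <- hcEII(data = irisMatrix)
     cl <- hclass(hcTree, c(2, 3))

     # Pairwise scatterplots marked by classification.
     par(pty = "s", mfrow = c(1, 1))
     clPairs(irisMatrix, cl = cl[, "2"])
     clPairs(irisMatrix, cl = cl[, "3"])

     # Side-by-side projections onto the first two variables.
     par(mfrow = c(1, 2))
     dimens <- c(1, 2)
     coordProj(irisMatrix, classification = cl[, "2"], dimens = dimens)
     coordProj(irisMatrix, classification = cl[, "3"], dimens = dimens)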

