coordProj               package:mclust               R Documentation

_C_o_o_r_d_i_n_a_t_e _p_r_o_j_e_c_t_i_o_n_s _o_f _d_a_t_a _i_n _m_o_r_e _t_h_a_n _t_w_o _d_i_m_e_n_s_i_o_n_s _m_o_d_e_l_l_e_d _b_y _a_n _M_V_N
_m_i_x_t_u_r_e.

_D_e_s_c_r_i_p_t_i_o_n:

     Plots coordinate projections given data in more than two
     dimensions and parameters of an MVN mixture model for the data.

_U_s_a_g_e:

     coordProj(data, ..., dimens = c(1, 2),
               type = c("classification","uncertainty","errors"), ask = TRUE,
               quantiles = c(0.75, 0.95), symbols, scale = FALSE,
               identify = FALSE, CEX = 1, PCH = ".", xlim, ylim)

_A_r_g_u_m_e_n_t_s:

    data: A numeric matrix or data frame of observations. Categorical
          variables are not allowed. Rows correspond to observations
          and columns correspond to variables.

  dimens: A vector of length 2 giving the integer dimensions of the
          desired coordinate projections. The default is 'c(1,2)', in
          which the first dimension is plotted against the second. 

     ...: One or more of the following:

          _c_l_a_s_s_i_f_i_c_a_t_i_o_n A numeric or character vector representing a
               classification of observations (rows) of 'data'.

          _u_n_c_e_r_t_a_i_n_t_y A numeric vector of values in _(0,1)_ giving the
               uncertainty of each data point.

          _z A matrix in which the '[i,k]'th entry gives the probability
               of observation _i_ belonging to the _k_th class.  Used
               to compute 'classification' and 'uncertainty' if those
               arguments aren't available.

          _t_r_u_t_h A numeric or character vector giving a known
               classification of each data point. If 'classification'
               or 'z' is also present,  this is used for displaying
               classification errors.

          _m_u A matrix whose columns are the means of each group. 

          _s_i_g_m_a A three dimensional array  in which 'sigma[,,k]' gives
               the covariance for the _k_th group.

          _d_e_c_o_m_p A list with 'scale', 'shape' and 'orientation'
               components giving an alternative form for the covariance
               structure of the mixture model. 

    type: Any subset of 'c("classification","uncertainty","errors")'.
          The function produces each corresponding plot for which it
          has been supplied sufficient information. If more than one
          plot is possible and 'ask=TRUE', the user is asked to choose
          from a menu.

     ask: A logical variable indicating whether or not a menu should be
          produced when more than one plot is possible. The default is
          'ask=TRUE'.  

quantiles: A vector of length 2 giving quantiles used in plotting
          uncertainty. The smallest symbols correspond to the smallest
          quantile (lowest uncertainty), medium-sized (open) symbols to
          points falling between the given quantiles, and large
          (filled) symbols to those in the largest quantile (highest
          uncertainty). The default is _(0.75,0.95)_.  

 symbols: Either an integer or character vector assigning a plotting
          symbol to each unique class in 'classification'. Elements in
          'symbols' correspond to classes in 'classification' in sorted
          order. Default: if _G_ is the number of groups in the
          classification, the first _G_ symbols in '.Mclust$symbols';
          otherwise, if _G_ is less than 27, the first _G_ capital
          letters in the Roman alphabet.

   scale: A logical variable indicating whether or not the two chosen
          dimensions should be plotted on the same scale, thus
          preserving the shape of the distribution. Default:
          'scale=FALSE'

identify: A logical variable indicating whether or not to add a title
          to the plot identifying the dimensions used. 

     CEX: An argument specifying the size of the plotting symbols.  The
          default value is 1. 

     PCH: An argument specifying the symbol to be used when a
          classification has not been specified for the data. The
          default value is a small dot ".".

xlim, ylim: Arguments specifying bounds for the abscissa and ordinate
          of the plot, respectively. This may be useful when comparing
          plots.
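
     When only 'z' is supplied, 'classification' and 'uncertainty' are
     derived from it. The following is a minimal base R sketch of one
     way to do that, assuming each observation is assigned to its most
     probable class and that uncertainty is one minus the largest
     membership probability. The matrix values are hypothetical, and
     this is an illustration, not the package's internal code.

```r
# Hypothetical membership matrix: 4 observations (rows), 3 classes (columns).
z <- matrix(c(0.90, 0.05, 0.05,
              0.20, 0.70, 0.10,
              0.34, 0.33, 0.33,
              0.00, 0.10, 0.90),
            ncol = 3, byrow = TRUE)

# Assign each observation to its most probable class
# (ties go to the first such class).
classification <- max.col(z, ties.method = "first")

# Assumption: uncertainty is one minus the largest membership probability.
uncertainty <- 1 - apply(z, 1, max)
```

     Under this assumption the third observation, whose probabilities
     are nearly equal across classes, receives the highest uncertainty.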

_S_i_d_e _E_f_f_e_c_t_s:

     Produces coordinate projection plots of the data, possibly showing
     the location of the mixture components, classification,
     uncertainty, and/or classification errors.
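
     The uncertainty plot sizes symbols according to where each point's
     uncertainty falls relative to the 'quantiles' argument. A minimal
     base R sketch of such a three-way split follows. The uncertainty
     values and the group labels are hypothetical, and whether a point
     exactly on a cut point belongs to the lower or upper group is an
     assumption of this sketch, not something the text above specifies.

```r
# Hypothetical uncertainties for ten observations.
uncertainty <- c(0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.15, 0.20, 0.35, 0.50)
quantiles <- c(0.75, 0.95)

# Cut points at the requested quantiles of the observed uncertainties.
breaks <- quantile(uncertainty, probs = quantiles)

# Three groups: at or below the lower cut point, between the two,
# and above the upper cut point (labels are illustrative only).
group <- cut(uncertainty, breaks = c(-Inf, breaks, Inf),
             labels = c("small", "medium", "large"))
table(group)
```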

_R_e_f_e_r_e_n_c_e_s:

     C. Fraley and A. E. Raftery (2002). Model-based clustering,
     discriminant analysis, and density estimation. _Journal of the
     American Statistical Association 97:611-631_.  See
     <URL: http://www.stat.washington.edu/mclust>.

     C. Fraley and A. E. Raftery (2002). MCLUST: Software for
     model-based clustering, density estimation and discriminant
     analysis. Technical Report, Department of Statistics, University
     of Washington.  See <URL: http://www.stat.washington.edu/mclust>.

_S_e_e _A_l_s_o:

     'clPairs', 'randProj', 'mclust2Dplot', 'mclustOptions', 'do.call'

_E_x_a_m_p_l_e_s:

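     library(mclust)

     data(iris)
     irisMatrix <- as.matrix(iris[,1:4])   # four numeric measurements
     irisClass <- iris[,5]                 # known species labels

     # M-step for the VVV model, using the known labels converted to
     # an indicator matrix by unmap()
     msEst <- mstepVVV(irisMatrix, unmap(irisClass))

     par(pty = "s", mfrow = c(1,2))
     # Pass the model components explicitly ...
     coordProj(irisMatrix, dimens = c(2,3), truth = irisClass,
               mu = msEst$mu, sigma = msEst$sigma, z = msEst$z)
     # ... or pass the whole list of estimates via do.call
     do.call("coordProj", c(list(data = irisMatrix, dimens = c(2,3), truth = irisClass),
                            msEst))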
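
     The 'do.call' form above works because the components of a named
     list are matched to arguments by name, with unmatched names
     passed through '...'. A minimal base R sketch of that mechanism
     follows. 'showArgs' and 'fit' are hypothetical stand-ins
     introduced only for illustration; they are not part of mclust.

```r
# Hypothetical stand-in for a function that takes its model parameters
# through '...'; it simply reports which arguments it received.
showArgs <- function(data, ..., dimens = c(1, 2)) {
  extras <- list(...)
  c("data", "dimens", names(extras))
}

# Hypothetical fitted-model list with named components.
fit <- list(mu = matrix(0, 2, 3), z = matrix(1/3, 5, 3))

# Spelling out each component by hand ...
byHand <- showArgs(data = 1:5, dimens = c(2, 3), mu = fit$mu, z = fit$z)

# ... is equivalent to concatenating the lists and letting do.call
# match each component to an argument by name.
viaDoCall <- do.call("showArgs", c(list(data = 1:5, dimens = c(2, 3)), fit))

identical(byHand, viaDoCall)
```

     The list form avoids repeating every component name, at the cost
     of passing along any extra components the list happens to hold.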

