mclust2Dplot             package:mclust             R Documentation

_P_l_o_t _t_w_o-_d_i_m_e_n_s_i_o_n_a_l _d_a_t_a _m_o_d_e_l_l_e_d _b_y _a_n _M_V_N _m_i_x_t_u_r_e.

_D_e_s_c_r_i_p_t_i_o_n:

     Plot two-dimensional data given parameters of an MVN mixture model
     for the data.

_U_s_a_g_e:

     mclust2Dplot(data, ...,
                  type = c("classification","uncertainty","errors"), ask = TRUE,
                  quantiles = c(0.75, 0.95), symbols, scale = FALSE,
                  identify = FALSE, CEX = 1, PCH = ".", xlim, ylim,
                  swapAxes = FALSE) 

_A_r_g_u_m_e_n_t_s:

    data: A numeric matrix or data frame of observations. Categorical
          variables are not allowed. If a matrix or data frame, rows
          correspond to observations and columns correspond to
          variables.  In this case the data are two dimensional, so
          there are two columns. 

     ...: One or more of the following:

          _c_l_a_s_s_i_f_i_c_a_t_i_o_n A numeric or character vector representing a
               classification of observations (rows) of 'data'.

          _u_n_c_e_r_t_a_i_n_t_y A numeric vector of values in _(0,1)_ giving the
               uncertainty of each data point.

          _z A matrix in which the _[i,k]_the entry gives the
               probability of observation _i_ belonging to the _k_th
               class.  Used to compute 'classification' and
               'uncertainty' if those arguments aren't available.

          _t_r_u_t_h A numeric or character vector giving a known
               classification of each data point. If 'classification'
               or 'z' is also present, this is used for displaying
               classification errors.

          _m_u A matrix whose columns are the means of each group. 

          _s_i_g_m_a A three dimensional array  in which 'sigma[,,k]' gives
               the covariance for the _k_th group.

          _d_e_c_o_m_p A list with 'scale', 'shape' and  'orientation'
               components giving an alternative form for the covariance
               structure  of the mixture model.
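
          As a rough illustration of how 'classification' and
          'uncertainty' can be derived from 'z', the sketch below uses
          base R only. It assumes the usual convention that each
          observation is assigned to its most probable class and that
          uncertainty is one minus the largest membership probability;
          the matrix 'z' here is hypothetical, and the function's
          internal code may differ.

```r
# Hypothetical membership matrix: 4 observations (rows), 2 classes (columns).
z <- matrix(c(0.90, 0.10,
              0.20, 0.80,
              0.55, 0.45,
              0.05, 0.95), ncol = 2, byrow = TRUE)

# Assign each observation to the class with the largest probability.
classification <- max.col(z, ties.method = "first")

# Assumed convention: uncertainty is one minus the largest probability.
uncertainty <- 1 - apply(z, 1, max)
```

          Observations whose rows of 'z' are close to evenly split,
          such as the third row here, receive the largest uncertainty.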

    type: Any subset of 'c("classification","uncertainty","errors")'.
          The function will produce the corresponding plot if it has
          been supplied sufficient information to do so. If more than
          one plot is possible then users will be asked to choose from
          a menu if 'ask=TRUE'. 

     ask: A logical variable indicating whether or not a menu should be
          produced when more than one plot is possible.  The default is
          'ask=TRUE'. 

quantiles: A vector of length 2 giving quantiles used in plotting
          uncertainty. The smallest symbols correspond to the smallest
          quantile (lowest uncertainty), medium-sized (open) symbols
          to points falling between the given quantiles, and large
          (filled) symbols to those in the largest quantile (highest
          uncertainty). The default is _(0.75,0.95)_. 
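
          The three-way grouping described for 'quantiles' can be
          sketched in base R as follows. The uncertainty values are
          simulated, and the particular 'pch' and 'cex' values are
          illustrative assumptions rather than the ones the function
          uses.

```r
set.seed(1)
uncertainty <- runif(200, 0, 0.5)   # hypothetical uncertainty values

# Cut points at the 75% and 95% quantiles of the uncertainty values.
breaks <- quantile(uncertainty, probs = c(0.75, 0.95))

# 0 = below the first cut point, 1 = between the two, 2 = above the second.
group <- findInterval(uncertainty, breaks)

# Illustrative symbols: small filled, medium open, large filled.
pch <- c(16, 1, 16)[group + 1]
cex <- c(0.3, 1.0, 1.5)[group + 1]
```

          The vectors 'pch' and 'cex' could then be passed to 'plot'
          or 'points' so that symbol size grows with uncertainty.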

 symbols: Either an integer or character vector assigning a plotting
          symbol to each unique class in 'classification'.  Elements in
          'symbols' correspond to classes in 'classification' in order
          of appearance in the observations (the order used by the
          S-PLUS function 'unique').  Default: If _G_ is the number of
          groups in the classification, the first _G_ symbols in
          '.Mclust$symbols', otherwise if _G_ is less than 27 then the
          first _G_ capital letters in the Roman alphabet. 
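
          The order-of-appearance rule matters when class labels are
          not already sorted. A minimal base R sketch, with
          hypothetical labels and symbol codes, shows how elements of
          'symbols' line up with classes:

```r
classification <- c("b", "a", "b", "c", "a")   # hypothetical class labels

# Classes in order of first appearance, not in sorted order.
classes <- unique(classification)

symbols <- c(1, 2, 3)                          # hypothetical plotting symbols

# Each observation takes the symbol of its class.
pch <- symbols[match(classification, classes)]
```

          Here class "b" appears first and so takes the first symbol,
          even though "a" would come first alphabetically.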

   scale: A logical variable indicating whether or not the two chosen
          dimensions should be plotted on the same scale, and thus
          preserve the shape of the distribution. Default:
          'scale=FALSE'  
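
          One way to put both dimensions on the same scale is to give
          the two axes limits of equal width, centred on the data.
          The base R sketch below is an assumption about how this
          could be done, not a description of the function's own
          code; the data matrix is hypothetical.

```r
x <- cbind(c(0, 2, 4), c(-10, 0, 30))   # hypothetical two-column data

rx <- range(x[, 1])
ry <- range(x[, 2])

# Half of the larger of the two spans, applied to both axes.
half <- max(diff(rx), diff(ry)) / 2

xlim <- mean(rx) + c(-half, half)
ylim <- mean(ry) + c(-half, half)
```

          Plotting with these limits on a square plotting region
          (for example after 'par(pty = "s")') keeps one unit the same
          length on both axes.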

identify: A logical variable indicating whether or not to add a title
          to the plot identifying the dimensions used. 

     CEX: An argument specifying the size of the plotting symbols.  The
          default value is 1. 

     PCH: An argument specifying the symbol to be used when a
          classification has not been specified for the data. The
          default value is a small dot ".". 

xlim, ylim : Arguments specifying bounds for the abscissa and ordinate
          of the plot, respectively. This may be useful when comparing
          plots. 

swapAxes: A logical variable indicating whether or not the axes should
          be swapped for the plot. 

_S_i_d_e _E_f_f_e_c_t_s:

     One or more plots showing location of the mixture components,
     classification, uncertainty, and/or classification errors.
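
     The location of a mixture component is commonly shown as an
     ellipse determined by its mean and covariance. The base R sketch
     below computes the outline of such an ellipse from one column of
     'mu' and one slice of 'sigma'; the values are hypothetical, and
     the choice of a unit Mahalanobis-distance contour is an
     assumption, not necessarily what the function draws.

```r
mu    <- c(1, -2)                            # hypothetical component mean
sigma <- matrix(c(4, 1.5, 1.5, 1), 2, 2)     # hypothetical covariance

# Principal axes (eigenvectors) and variances along them (eigenvalues).
ev <- eigen(sigma, symmetric = TRUE)

# Points on the unit circle, one per column.
theta  <- seq(0, 2 * pi, length.out = 100)
circle <- rbind(cos(theta), sin(theta))

# Stretch by the standard deviations, rotate to the principal axes,
# then shift to the mean. The result has one outline point per row.
ellipse <- t(mu + ev$vectors %*% (sqrt(ev$values) * circle))

# To draw it on an existing plot: lines(ellipse)
```

     Every point of 'ellipse' lies at Mahalanobis distance 1 from
     'mu' with respect to 'sigma'.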

_R_e_f_e_r_e_n_c_e_s:

     C. Fraley and A. E. Raftery (2002). Model-based clustering,
     discriminant analysis, and density estimation. _Journal of the
     American Statistical Association 97:611-631_.  See <URL:
     http://www.stat.washington.edu/mclust>.

     C. Fraley and A. E. Raftery (2002). MCLUST: Software for
     model-based clustering, density estimation and discriminant
     analysis. Technical Report, Department of Statistics, University
     of Washington.  See <URL: http://www.stat.washington.edu/mclust>.

_S_e_e _A_l_s_o:

     'surfacePlot', 'clPairs', 'coordProj', 'randProj', 'spinProj',
     'mclustOptions', 'do.call'

_E_x_a_m_p_l_e_s:

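     n <- 250 ## create artificial data
     set.seed(0)
     ## two groups of n points each, elongated along different axes
     x <- rbind(matrix(rnorm(n*2), n, 2) %*% diag(c(1,9)),
                matrix(rnorm(n*2), n, 2) %*% diag(c(1,9))[,2:1])
     xclass <- c(rep(1,n),rep(2,n)) ## known group labels

     ## fit mixture models and summarize the result for these data
     xEMclust <- summary(EMclust(x),x)

     ## supply the needed components of the fit one by one
     mclust2Dplot(x, truth = xclass, z = xEMclust$z, ask=FALSE,
                     mu = xEMclust$mu, sigma = xEMclust$sigma)

     ## pass all components of the fit as arguments via do.call
     do.call("mclust2Dplot", c(list(data = x, truth = xclass, ask=FALSE), xEMclust))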
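
     The 'do.call' form works because every named component of the
     list becomes a named argument of the call. The base R sketch
     below shows the mechanism with a stand-in function; 'show_args'
     and the list 'fit' are hypothetical and are not part of mclust.

```r
# Stand-in for a plotting function that accepts extra named arguments.
show_args <- function(data, ..., ask = TRUE) {
  extras <- list(...)
  names(extras)          # names of the arguments received through '...'
}

# Hypothetical fit summary with named components.
fit <- list(z = matrix(0.5, 2, 2), mu = matrix(0, 2, 2))

# Combine fixed arguments with the components of the fit.
args <- c(list(data = matrix(0, 2, 2), ask = FALSE), fit)

result <- do.call("show_args", args)
```

     Here 'result' holds the names "z" and "mu", the components that
     were routed through '...'.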

