diana2means             package:adSplit             R Documentation

_2-_M_e_a_n_s _w_i_t_h _H_i_e_r_a_r_c_h_i_c_a_l _I_n_i_t_i_a_l_i_z_a_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Split a set of data points into two coherent groups using the
     k-means algorithm. Instead of random initialization, divisive
     hierarchical clustering is used to determine initial groups and
     the corresponding centroids.

_U_s_a_g_e:

     diana2means(mydata, mingroupsize = 5, 
                 ngenes = 50, ignore.genes = 5, 
                 return.cut = FALSE)

_A_r_g_u_m_e_n_t_s:

  mydata: either an expression set as defined by the package 'Biobase'
          or a matrix of expression levels (rows=genes,
          columns=samples).

mingroupsize: report only splits where both groups are larger than this
          size.

  ngenes: number of genes used to compute the DLD-score, which measures
          cluster quality.

ignore.genes: number of best scoring genes to be ignored when computing
          DLD-scores.

return.cut: logical, whether to actually return the assignment of
          samples to groups.

_D_e_t_a_i_l_s:

     This function uses divisive hierarchical clustering (diana) to
     generate a first split of the data. Each column of the data
     matrix is treated as one data element. Centroids are computed
     from the resulting tentative groups and used to initialize the
     k-means clustering algorithm.
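
     The initialization described above can be sketched roughly as
     follows. This is an illustration only, not the package's actual
     implementation: it assumes 'diana' from the package 'cluster' and
     'kmeans' from 'stats', and the toy matrix 'x' as well as the names
     'samples', 'init' and 'centers' are made up for this sketch. The
     DLD-score is not computed here.

```r
library(cluster)  # provides diana()

set.seed(1)
# Toy matrix (assumption): 20 genes (rows) x 12 samples (columns),
# with the last 6 samples shifted upwards
x <- cbind(matrix(rnorm(20 * 6, mean = 0), nrow = 20),
           matrix(rnorm(20 * 6, mean = 3), nrow = 20))

# diana() and kmeans() cluster rows, so put one sample per row
samples <- t(x)

# Divisive hierarchical clustering, cut into two tentative groups
dv <- diana(samples)
init <- cutree(as.hclust(dv), k = 2)

# Centroid of each tentative group (one row per group)
centers <- rbind(colMeans(samples[init == 1, , drop = FALSE]),
                 colMeans(samples[init == 2, , drop = FALSE]))

# k-means started from those centroids instead of random points
km <- kmeans(samples, centers = centers)

# Recode the two groups as 0/1, one value per sample (column of x)
cut <- km$cluster - 1
```

     Starting k-means from the diana centroids makes the outcome
     deterministic for a given data matrix, whereas random
     initialization can yield different splits between runs.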

     For the split optimized by k-means, the DLD-score is determined
     using the 'ngenes' and 'ignore.genes' arguments.

_V_a_l_u_e:

     If the logical 'return.cut' is set to 'FALSE' (the default), a
     single number representing the DLD-score of the generated split
     is returned. Otherwise, an object of class 'split' containing the
     following elements is returned: 

     cut: one number (0 or 1) per column of the original data,
          specifying the group to which that sample is assigned.

   score: the DLD-score achieved by the split.

_A_u_t_h_o_r(_s):

     Joern Toedling, Claudio Lottaz

_S_e_e _A_l_s_o:

     'diana'

_E_x_a_m_p_l_e_s:

     # get golub data
     library(vsn)
     library(golubEsets)
     data(Golub_Merge)

     # keep the 10% of genes with the highest variance
     e <- exprs(Golub_Merge)
     vars <- apply(e, 1, var)
     e <- e[vars > quantile(vars,0.9),]

     # use diana2means to get splits and scores
     diana2means(e)
     diana2means(e, return.cut=TRUE)

