MCRestimate           package:MCRestimate           R Documentation

_E_s_t_i_m_a_t_i_o_n _o_f _m_i_s_c_l_a_s_s_i_f_i_c_a_t_i_o_n _e_r_r_o_r _b_y _c_r_o_s_s-_v_a_l_i_d_a_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Several repetitions of a cross-validation are performed to obtain
     'votes' that show how stable a method is against different
     partitions into training and test sets.

_U_s_a_g_e:

     MCRestimate(eset,
                 class.column,
                 reference.class=NULL,
                 classification.fun,
                 variableSel.fun="identity",
                 cluster.fun="identity",
                 poss.parameters=list(),
                 cross.outer=10,
                 cross.repeat=3,
                 cross.inner=cross.outer,
                 plot.label=NULL,
                 rand=123,
                 stratify=FALSE,
                 information=TRUE,
                 thePreprocessingMethods=c(variableSel.fun,cluster.fun))

_A_r_g_u_m_e_n_t_s:

    eset: an object of class 'exprSet' or class 'exprSetRG-class'

class.column: a number or a character string which indicates the column
          of the expression set's phenodata containing the class label

reference.class: a character string giving the name of one class; if
          specified, this class will form the first class and all the
          other classes will form the second class

classification.fun: character string which names the function that
          should be used for the classification

variableSel.fun: character string which names the function that should
          be used for the variable selection

cluster.fun: character string which names the function that should be
          used for the clustering of variables

thePreprocessingMethods: character vector with the names of all
          preprocessing functions; can be used instead of
          'variableSel.fun' and 'cluster.fun' (see Details)

poss.parameters: a list of possible values for the parameters of the
          classification, variable selection, and cluster methods

cross.outer: integer - the number of nearly equal-sized parts the
          sample set should be divided into (outer cross-validation)

cross.repeat: integer - the number of repetitions of the
          cross-validation procedure

cross.inner: integer - the number of nearly equal-sized parts the
          training set should be divided into (inner cross-validation)

plot.label: name of one column of the phenodata; if specified, the
          content of this column will form the labels of the x-axis
          when the 'votematrix' is plotted with plot.MCRestimate

    rand: integer - the seed used to put the random number generator
          in a reproducible state

stratify: logical - should a stratified version of the
          cross-validation be used?

information: logical - should classifier-specific data be returned
          (depends on the wrapper for the classification method)?
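
     As an illustration of how 'cross.outer', 'cross.repeat', 'rand'
     and 'stratify' fit together, the following base R sketch splits a
     set of samples into nearly equal-sized parts. It is not the
     package's internal code: the helper 'make_folds' and its arguments
     are hypothetical, and it assumes that 'rand' acts as a seed passed
     to set.seed().

```r
# Hypothetical helper, not part of MCRestimate: assign each sample to one of
# 'k' nearly equal-sized parts, optionally stratified by class label.
make_folds <- function(classes, k, stratify = FALSE) {
  n <- length(classes)
  perm <- sample(n)                      # random order of the samples
  if (stratify) {
    # Sort the shuffled samples by class, so that dealing out part numbers
    # in turn spreads every class evenly over the parts.
    perm <- perm[order(classes[perm])]
  }
  folds <- integer(n)
  folds[perm] <- rep_len(seq_len(k), n)  # deal out part numbers 1..k in turn
  folds
}

classes <- factor(rep(c("ALL", "AML"), times = c(20, 14)))
set.seed(123)                            # plays the role of 'rand' (assumption)
# One column of part assignments per repetition
# (cross.repeat = 3, cross.outer = 10)
folds <- replicate(3, make_folds(classes, k = 10, stratify = TRUE))
table(folds[, 1])                        # part sizes differ by at most one
```

     Each column of 'folds' corresponds to one repetition; within a
     repetition, every part would serve once as the test set while the
     remaining parts form the training set.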

_D_e_t_a_i_l_s:

     The argument 'thePreprocessingMethods' can be used instead of
     'variableSel.fun' and 'cluster.fun'. In the first versions of
     MCRestimate it was only possible to have one variable selection
     function and one cluster function. Now it is possible to have more
     than two functions, and the ordering is arbitrary; e.g. you can
     have a variable selection function, then a cluster function, and
     then a second variable selection function.
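
     The idea of a chain of preprocessing functions that are given by
     name and applied in order can be sketched in base R as follows.
     The functions 'select_top_var', 'average_pairs' and 'run_steps'
     are toy stand-ins invented for this illustration; the wrapper
     functions actually used with MCRestimate have their own interface.

```r
# Toy preprocessing steps (hypothetical). Each takes and returns a matrix
# with variables in rows and samples in columns.
select_top_var <- function(m, n = 6) {
  v <- apply(m, 1, var)
  m[order(v, decreasing = TRUE)[seq_len(min(n, nrow(m)))], , drop = FALSE]
}
average_pairs <- function(m) {
  # Crude "clustering": average consecutive pairs of rows
  grp <- ceiling(seq_len(nrow(m)) / 2)
  rowsum(m, grp) / as.vector(table(grp))
}

# Steps are named as character strings and applied in the order listed:
# selection, then clustering, then a second selection.
steps <- c("select_top_var", "average_pairs", "select_top_var")
run_steps <- function(m, steps) {
  for (s in steps) m <- match.fun(s)(m)
  m
}

set.seed(1)
x <- matrix(rnorm(20 * 5), nrow = 20)    # 20 variables, 5 samples
dim(run_steps(x, steps))
```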

     If MCRestimate is used with an object of class 'exprSetRG-class',
     the preprocessing steps can use the green and the red channel
     separately, but the classification methods work with the
     difference green channel minus red channel.

     Note: 'correct prediction' means that a sample was predicted to be
     a member of the correct class at least as often as it was
     predicted to be a member of each other class. So in the two class
     problem a sample is also 'correct' if it has been predicted
     correctly half of the time.
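
     The rule in the note above can be written out in a few lines of
     base R. The vote matrix below is made-up data, and treating votes
     as per-class fractions is an assumption of this sketch; it is not
     taken from the package's code.

```r
# Hypothetical vote matrix: rows are samples, columns are classes; each entry
# is the fraction of repetitions in which the sample was assigned to the class.
votes <- matrix(c(1,   0,
                  0.5, 0.5,
                  1/3, 2/3),
                ncol = 2, byrow = TRUE,
                dimnames = list(c("s1", "s2", "s3"), c("ALL", "AML")))
true_class <- c(s1 = "ALL", s2 = "AML", s3 = "ALL")

# Vote received by the true class of each sample
own_vote <- votes[cbind(seq_len(nrow(votes)),
                        match(true_class, colnames(votes)))]

# 'Correct' if the true class got at least as many votes as every other class
correct <- own_vote >= apply(votes, 1, max)
names(correct) <- rownames(votes)
correct
```

     Here 's2' counts as correct although it was assigned to its true
     class only half of the time, whereas 's3' does not.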

_V_a_l_u_e:

     an object of class 'MCRestimate', which is a 'list' with fourteen
     components:

   votes: 

 classes: the class of each sample

   table: 

correct.prediction: a logical vector - indicates if a sample was
          predicted to be a member of the correct class at least as
          often as it was predicted to be a member of each other class.

correct.class.vote: vector that contains for every sample the vote for
          its correct class

parameter: a list consisting of the estimated 'best' parameter for each
          cross-validation part

class.method: string which names the function used for the
          classification

thePreprocessingMethods: character vector - names of the preprocessing
          functions that have been used

cross.outer: number of blocks for the outer cross-validation

cross.repeat: number of outer cross-validation repetitions

cross.inner: number of blocks for the inner cross-validation

sample.names: names of the samples

information: classifier-specific data (if 'information' is TRUE)

_A_u_t_h_o_r(_s):

     Markus Ruschhaupt <URL: mailto:m.ruschhaupt@dkfz.de>,
     contributions from Patrick Warnat <URL:
     mailto:p.warnat@dkfz-heidelberg.de>

_E_x_a_m_p_l_e_s:

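     library(MCRestimate)
     library(golubEsets)
     data(Golub_Test)
     # Use only the first 500 rows to keep the example fast
     G2 <- Golub_Test[1:500,]
     # 4-fold outer cross-validation, repeated 3 times, with the
     # 'RF.wrap' classification wrapper
     result <- MCRestimate(G2, "ALL.AML", classification.fun="RF.wrap",
                           cross.outer=4, cross.repeat=3)
     result
     # Open a wide plotting device when running interactively
     if (interactive()) {
       x11(width=9, height=4)
     }
     plot(result)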

