printTopGenes        package:iterativeBMAsurv        R Documentation

_W_r_i_t_e _a _t_r_a_i_n_i_n_g _s_e_t _i_n_c_l_u_d_i_n_g _t_h_e _t_o_p-_r_a_n_k_e_d _G _v_a_r_i_a_b_l_e_s _f_r_o_m _a _s_o_r_t_e_d _m_a_t_r_i_x _t_o _f_i_l_e

_D_e_s_c_r_i_p_t_i_o_n:

     This function takes a matrix of rank-ordered variables and writes
     a training set containing the top G variables in the matrix to
     file.

_U_s_a_g_e:

     printTopGenes (retMatrix, numGlist=c(10, 30, 50, 100, 500, 1000, ncol(trainData)), trainData, myPrefix="sorted_topCoxphGenes_")

_A_r_g_u_m_e_n_t_s:

retMatrix: A three-column matrix where the first column contains the
          sorted variable names (the top log-ranked variable appears
          first), the second column contains the original index of the
          variables, and the third column contains the variable ranking
          from 1 to ncol(trainData).

numGlist: A numeric vector giving the desired numbers of top-ranked
          variables to be written to file. A separate file is
          written for each number G in the vector, containing genes 1:G
          (default = c(10, 30, 50, 100, 500, 1000, ncol(trainData))).

trainData: Data matrix where columns are variables and rows are
          observations. In the case of gene expression data, the
          columns (variables) represent genes, while the rows
          (observations) represent patient samples.

myPrefix: A string prefix for the filename (default =
          'sorted_topCoxphGenes_').

_D_e_t_a_i_l_s:

     This function is called by 'iterateBMAsurv.train.predict.assess'.
     It is meant to be used in conjunction with 'singleGeneCoxph', as
     the 'retMatrix' argument is returned by 'singleGeneCoxph'.
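
     The selection step described above can be sketched in base R. This
     is a minimal illustration under stated assumptions, not the
     package's own code: 'writeTopG', 'toyTrain', 'toyRet', and the
     'outDir' argument are hypothetical names, the toy ranking is
     arbitrary, and the tab-delimited output written by 'write.table'
     is an assumed file format. Skipping values of G larger than
     ncol(trainData) is a choice made for this sketch only.

```r
## Toy stand-ins for the real inputs (hypothetical data, not from the package).
set.seed(1)
toyTrain <- matrix(rnorm(20), nrow = 4,
                   dimnames = list(paste0("patient", 1:4), paste0("gene", 1:5)))

## Mimic the documented three-column retMatrix: sorted variable names,
## original column index, and rank. The ordering here is arbitrary.
ord <- c(3, 1, 5, 2, 4)
toyRet <- cbind(colnames(toyTrain)[ord], ord, seq_along(ord))

## Assumed core logic: for each G, take the first G original indices from
## column 2 and write the matching columns of the training data to a file.
writeTopG <- function(retMatrix, numGlist, trainData, myPrefix,
                      outDir = tempdir()) {
  idx <- as.numeric(retMatrix[, 2])
  paths <- character(0)
  for (G in numGlist) {
    if (G > ncol(trainData)) next  # sketch-only choice: skip oversized G
    top <- trainData[, idx[seq_len(G)], drop = FALSE]
    path <- file.path(outDir, paste0(myPrefix, G))
    write.table(top, file = path, sep = "\t", quote = FALSE)
    paths <- c(paths, path)
  }
  paths
}

## Write the top 2 and top 3 variables, then read the first file back.
written <- writeTopG(toyRet, c(2, 3), toyTrain, "toy_topGenes_")
topTwo <- read.table(written[1], header = TRUE, sep = "\t")
```

     Each output file keeps every observation (row) and only the first
     G columns in rank order, so the file for a smaller G is a column
     prefix of the file for a larger G.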

_V_a_l_u_e:

     A file or files consisting of the training data restricted to the
     top-ranked G variables, with columns in rank order (one file for
     each G in 'numGlist').

_R_e_f_e_r_e_n_c_e_s:

     Annest, A., Yeung, K.Y., Bumgarner, R.E., and Raftery, A.E.
     (2008). Iterative Bayesian Model Averaging for Survival Analysis.
     Manuscript in Progress.

     Raftery, A.E. (1995). Bayesian model selection in social research
     (with Discussion). Sociological Methodology 1995 (Peter V.
     Marsden, ed.), pp. 111-196, Cambridge, Mass.: Blackwells.

     Volinsky, C., Madigan, D., Raftery, A., and Kronmal, R. (1997)
     Bayesian Model Averaging in Proportional Hazard Models: Assessing
     the Risk of a Stroke. Applied Statistics 46: 433-448.

     Yeung, K.Y., Bumgarner, R.E. and Raftery, A.E. (2005) Bayesian
     Model Averaging: Development of an improved multi-class, gene
     selection and classification tool for microarray data.
     Bioinformatics 21: 2394-2402.

_S_e_e _A_l_s_o:

     'iterateBMAsurv.train.predict.assess', 'singleGeneCoxph',
     'trainData', 'trainSurv', 'trainCens'

_E_x_a_m_p_l_e_s:

     library(BMA)
     library(iterativeBMAsurv)
     data(trainData)
     data(trainSurv)
     data(trainCens)

     ## Start by ranking and sorting the genes; in this case we use the Cox Proportional Hazards Model
     sorted.genes <- singleGeneCoxph(trainData, trainSurv, trainCens)

     ## Write top 100 genes to file
     sorted.top.genes <- printTopGenes(retMatrix=sorted.genes, numGlist=100, trainData=trainData)

     ## The file 'sorted_topCoxphGenes_100' is now in the R working directory.

