aveProbe                package:pcot2                R Documentation

_T_r_a_n_s_f_o_r_m _A_f_f_y_m_e_t_r_i_x _d_a_t_a _s_o _t_h_a_t _u_n_i_q_u_e _g_e_n_e_s _w_i_t_h _m_u_l_t_i_p_l_e
_p_r_o_b_e_s _a_r_e _r_e_p_r_e_s_e_n_t_e_d _b_y _a _s_i_n_g_l_e _e_x_p_r_e_s_s_i_o_n _v_a_l_u_e _o_n _e_a_c_h _a_r_r_a_y.

_D_e_s_c_r_i_p_t_i_o_n:

     In Affymetrix gene expression data, a single gene is often
     represented by multiple probe sets, which gives that gene a
     greater influence on the analysis (particularly if the gene is
     differentially expressed).  To overcome this problem, the median
     is taken across all probe sets that represent the same gene.
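
     The collapsing step described above can be sketched in base R.
     This is a minimal illustration on made-up data, not the package's
     actual implementation: the helper name 'collapse_by_median', the
     toy probe and gene names, and the choice to skip probe sets with a
     missing identifier are all assumptions made for the example.

```r
# Hypothetical helper (not part of pcot2): collapse rows of x that share
# an identifier by taking the per-sample median.
collapse_by_median <- function(x, ids) {
  stopifnot(is.matrix(x), length(ids) == nrow(x))
  genes <- unique(ids[!is.na(ids)])   # assumption: ignore unmapped probe sets
  out <- matrix(NA_real_, nrow = length(genes), ncol = ncol(x),
                dimnames = list(genes, colnames(x)))
  for (g in genes) {
    rows <- which(ids == g)           # all probe sets mapped to gene g
    out[g, ] <- apply(x[rows, , drop = FALSE], 2, median)
  }
  out
}

# Toy data: three probe sets for GENE_A and one for GENE_B, two samples.
x <- matrix(c(1, 2, 9,  4,    # sample s1
              3, 5, 7, 10),   # sample s2
            nrow = 4,
            dimnames = list(c("p1", "p2", "p3", "p4"), c("s1", "s2")))
ids <- c("GENE_A", "GENE_A", "GENE_A", "GENE_B")

collapsed <- collapse_by_median(x, ids)
print(collapsed)
```

     Each gene ends up with one row, so a gene measured by three probe
     sets carries the same weight as a gene measured by one.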

_U_s_a_g_e:

     aveProbe(x, imat = NULL, ids)

_A_r_g_u_m_e_n_t_s:

       x: A matrix with no missing values; each row represents a gene
          and each column represents a sample.

    imat: A matrix indicating presence or absence of genes in the gene
          sets. The indicator matrix contains rows representing gene
          identifiers of genes present in the expression data and
          columns representing group (gene set) names. 

     ids: A vector of identifiers (e.g., UniGene or LocusLink
          identifiers) representing unique genes which match to the
          probe ids in the expression data. 
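
     To make the expected shapes concrete, the following toy inputs
     are consistent with the argument descriptions above. All names
     and values are invented for illustration. The 0/1 coding of the
     indicator matrix and the assumption that 'ids' follows the row
     order of 'x' are guesses based on the descriptions, not
     confirmed details of the package.

```r
set.seed(1)

# Expression matrix: 4 probe sets (rows) by 3 samples (columns), no NAs.
x <- matrix(rnorm(12), nrow = 4,
            dimnames = list(c("probe1", "probe2", "probe3", "probe4"),
                            c("s1", "s2", "s3")))

# Indicator matrix: rows match the rows of x, columns are gene sets.
# Assumed coding: 1 = the row belongs to the gene set, 0 = it does not.
imat <- matrix(c(1, 1, 0, 0,    # setA
                 0, 1, 1, 1),   # setB
               nrow = 4,
               dimnames = list(rownames(x), c("setA", "setB")))

# One gene identifier per row of x (assumed to be in the same order);
# probe1 and probe2 both map to GENE_A.
ids <- c("GENE_A", "GENE_A", "GENE_B", "GENE_C")

# Basic consistency checks on the toy inputs.
stopifnot(!anyNA(x),
          identical(rownames(imat), rownames(x)),
          length(ids) == nrow(x))
```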

_V_a_l_u_e:

    newx: A data matrix with rows representing the input identifiers
          and columns representing samples.

 newimat: A new imat (indicator matrix) with rows representing the
          unique gene identifiers and columns representing gene sets. 

_A_u_t_h_o_r(_s):

     Sarah Song and Mik Black

_S_e_e _A_l_s_o:

     'pcot2', 'corplot', 'corplot2'

_E_x_a_m_p_l_e_s:

     library(multtest)
     library(hu6800.db)  
     data(golub)
     rownames(golub) <- golub.gnames[,3]
     colnames(golub) <- golub.cl
     KEGG.list <- as.list(hu6800PATH)
     imat <- getImat(golub, KEGG.list, ms=10) 
     colnames(imat) <- paste("KEGG", colnames(imat), sep="")

     pathlist <- as.list(hu6800PATH)
     pathlist <- pathlist[match(rownames(golub), names(pathlist))]
     ids <- unlist(mget(names(pathlist), envir=hu6800SYMBOL))
     #### transform data matrix only ####
     newdat <- aveProbe(x=golub, ids=ids)$newx
     #### transform both data and imat ####
     output <- aveProbe(x=golub, imat=imat, ids=ids)
     newdat <- output$newx
     newimat <- output$newimat
     # keep only gene sets that still contain at least 10 unique genes
     newimat <- newimat[, colSums(newimat) >= 10]

