getRelevantEGenes            package:nem            R Documentation

_A_u_t_o_m_a_t_i_c _s_e_l_e_c_t_i_o_n _o_f _m_o_s_t _r_e_l_e_v_a_n_t _S-_g_e_n_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     1. Selects the E-genes that have the highest likelihood under
     the given network hypothesis. 2. Clusters E-genes and selects one
     E-gene from each cluster to reduce the amount of data on the
     array.

_U_s_a_g_e:

     filterEGenes(Porig, D, ntop=100)

     getRelevantEGenes(Phi, D, nEgenes=min(5*ncol(Phi), nrow(D1)), type="mLL", para=NULL, hyperpara=NULL, Pe=NULL, Pm=NULL, lambda=0)

     selectEGenes(Phi,D1,D0=NULL,para=NULL,hyperpara=NULL,Pe=NULL,Pm=NULL,lambda=0,type="mLL", nEgenes=min(5*ncol(Phi), nrow(D1)))

_A_r_g_u_m_e_n_t_s:

   Porig: matrix of raw p-values, typically from the complete array

       D: data matrix. Columns correspond to the nodes in the silencing
          scheme. Rows are effect reporters. 

    ntop: number of top genes to consider from each knock-down
          experiment

     Phi: adjacency matrix with unit main diagonal 

 nEgenes: no. of E-genes to select

    type: (1.) marginal likelihood "mLL" (only for count matrix D), or
          (2.) full marginal likelihood "FULLmLL" integrated over a and
          b and depending on hyperparameters a0, a1, b0, b1 (only for
          count matrix D), or (3.) "CONTmLL" marginal likelihood for
          probability matrices, or (4.) "CONTmLLDens" marginal
          likelihood for probability density matrices, or (5.)
          "CONTmLLRatio" for log-odds ratio matrices

    para: Vector with parameters 'a' and 'b' (for "mLL" with count
          data)

hyperpara: Vector with hyperparameters 'a0', 'b0', 'a1', 'b1' for
          "FULLmLL"

      Pe: prior position of effect reporters. Default: uniform over
          nodes in silencing scheme

      Pm: prior on model graph (n x n matrix) with entries 0 <=
          Pm[i,j] <= 1 describing the probability of an edge
          between gene i and gene j.

  lambda: regularization parameter to incorporate prior assumptions.

      D1: (i) count matrix for discrete data: phenotypes x genes. How
          often did we see an effect after interventions? (ii) matrix
          describing the probabilities of an effect, or (iii) probability
          density matrix describing the strength of an effect

      D0: count matrix: phenotypes x genes. How often did we NOT see an
          effect after intervention? Not used for continuous data
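
     The count matrices 'D1' and 'D0' can be derived from a binary data
     matrix whose replicate columns share the name of the silenced gene.
     The following is a minimal base-R sketch on made-up toy data; the
     object names 'D_toy' and 'sgenes' are illustrative assumptions and
     not part of the package.

```r
# Toy binary data: rows are E-genes, columns are replicate knock-downs.
# Replicates of the same silenced gene share a column name (assumption).
D_toy <- matrix(c(1, 1, 0, 0,
                  0, 1, 1, 1,
                  0, 0, 0, 1),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("E1", "E2", "E3"),
                                c("S1", "S1", "S2", "S2")))

sgenes <- unique(colnames(D_toy))

# D1: how often an effect was seen per E-gene and silenced gene
D1 <- sapply(sgenes, function(s)
  rowSums(D_toy[, colnames(D_toy) == s, drop = FALSE]))

# D0: how often NO effect was seen per E-gene and silenced gene
D0 <- sapply(sgenes, function(s)
  rowSums(1 - D_toy[, colnames(D_toy) == s, drop = FALSE]))

print(D1)
print(D0)
```

     Both matrices are phenotypes x genes, and 'D1 + D0' equals the
     number of replicates per silenced gene.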

_D_e_t_a_i_l_s:

     Uses 'mLL' or 'FULLmLL' to score each E-gene.
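
     To illustrate what a per-E-gene score under "mLL" could look like,
     here is a base-R sketch on toy data. It is not the package's
     implementation. It assumes a uniform prior over positions, that
     'Phi[k, s] == 1' means silencing gene k affects E-genes attached to
     gene s, and that 'a' and 'b' are the false positive and false
     negative rates. The helper name 'score_egenes' is hypothetical.

```r
# Hypothetical helper: log marginal likelihood of each E-gene,
# summed over all possible attachment positions (uniform prior).
score_egenes <- function(Phi, D1, D0, a, b) {
  # Log-likelihood of the counts if an effect is / is not predicted
  L1 <- D1 * log(1 - b) + D0 * log(b)
  L0 <- D1 * log(a) + D0 * log(1 - a)
  # loglik[i, s]: E-gene i attached to gene s, summed over knock-downs k
  loglik <- L1 %*% Phi + L0 %*% (1 - Phi)
  # Numerically stable log of the mean over positions
  m <- apply(loglik, 1, max)
  m + log(rowMeans(exp(loglik - m)))
}

# Toy hierarchy S1 -> S2 -> S3 (transitively closed, unit diagonal)
Phi <- matrix(c(1, 1, 1,
                0, 1, 1,
                0, 0, 1), nrow = 3, byrow = TRUE,
              dimnames = list(c("S1", "S2", "S3"), c("S1", "S2", "S3")))

# Toy counts from 2 replicates per knock-down
D1 <- matrix(c(2, 0, 0,
               2, 2, 2,
               1, 1, 0,
               0, 0, 0), nrow = 4, byrow = TRUE,
             dimnames = list(paste0("E", 1:4), c("S1", "S2", "S3")))
D0 <- 2 - D1

score <- score_egenes(Phi, D1, D0, a = 0.13, b = 0.05)
top <- names(sort(score, decreasing = TRUE))[1:2]  # keep the 2 best E-genes
print(top)
```

     E-genes whose effect pattern fits some position in the hierarchy
     score high; E-genes with inconsistent or absent effects score low.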

_V_a_l_u_e:

       I: index of selected E-genes

     mLL: marginal likelihood of a phenotypic hierarchy

     pos: posterior distribution of effect positions in the hierarchy

  mappos: Maximum a posteriori estimate of effect positions
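
     The relation between 'pos' and 'mappos' can be sketched in base R
     as follows. The log-likelihood values are made up, the prior 'Pe'
     is taken to be uniform, and the exact structure in which the
     package returns 'mappos' may differ from the plain character vector
     used here.

```r
# Toy log-likelihoods: rows are E-genes, columns are candidate positions
loglik <- log(matrix(c(0.07, 0.02, 0.01,
                       0.01, 0.01, 0.08),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("E1", "E2"),
                                     c("S1", "S2", "S3"))))

# Uniform prior over positions (assumption)
Pe <- matrix(1 / 3, nrow = 2, ncol = 3)

# Posterior over positions: likelihood times prior, normalised per E-gene
w <- exp(loglik - apply(loglik, 1, max)) * Pe
pos <- w / rowSums(w)

# MAP estimate: the most probable position for each E-gene
mappos <- colnames(pos)[apply(pos, 1, which.max)]

print(round(pos, 2))
print(mappos)
```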

_A_u_t_h_o_r(_s):

     Holger Froehlich

_S_e_e _A_l_s_o:

     'nem', 'score', 'mLL', 'FULLmLL', 'enumerate.models'

_E_x_a_m_p_l_e_s:

        # Drosophila RNAi and Microarray Data from Boutros et al., 2002
        data("BoutrosRNAi2002")
        D <- BoutrosRNAiDiscrete[,9:16]

        # enumerate all possible models for 4 genes
        models <- enumerate.models(unique(colnames(D)))  
        
        getRelevantEGenes(models[[64]], D, para=c(.13,.05))

