getRelevantEGenes            package:nem            R Documentation

_A_u_t_o_m_a_t_i_c _s_e_l_e_c_t_i_o_n _o_f _m_o_s_t _r_e_l_e_v_a_n_t _E-_g_e_n_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     1. A-priori filtering of E-genes: select E-genes that show a
     pattern of differential expression across experiments that is
     unlikely to arise by chance.

     2. Automated E-gene subset selection: select the E-genes that
     have the highest likelihood under the given network hypothesis.

_U_s_a_g_e:

     filterEGenes(Porig, D, Padj=NULL, ntop=100, fpr=0.05, adjmethod="bonferroni", cutoff=0.05)

     getRelevantEGenes(Phi, D, para=NULL, hyperpara=NULL, Pe=NULL, Pm=NULL,
                       lambda=0, delta=1, type="CONTmLLDens",
                       nEgenes=min(10*nrow(Phi), nrow(D)))

_A_r_g_u_m_e_n_t_s:

     For method filterEGenes: 

   Porig: matrix of raw p-values, typically from the complete array

       D: data matrix. Columns correspond to the nodes in the silencing
          scheme. Rows are effect reporters. 

    Padj: matrix of adjusted p-values (false discovery rates). If not
          provided, the Benjamini-Hochberg method is used to compute
          them.

    ntop: number of top genes to consider from each knock-down
          experiment

     fpr: significance cutoff for the FDR

adjmethod: adjustment method for pattern p-values

  cutoff: significance cutoff for patterns

     For method getRelevantEGenes ('D' as above):

     Phi: adjacency matrix with unit main diagonal

    type: scoring type: 'mLL', 'FULLmLL', 'CONTmLL', 'CONTmLLBayes' or
          'CONTmLLMAP'. 'CONTmLLDens' and 'CONTmLLRatio' are identical
          to 'CONTmLLBayes' and 'CONTmLLMAP', respectively, and are
          still supported for compatibility reasons; see 'nem'.

    para: Vector with parameters 'a' and 'b' (for "mLL" with count
          data)

hyperpara: Vector with hyperparameters 'a0', 'b0', 'a1', 'b1' for
          "FULLmLL"

      Pe: prior on the position of effect reporters. Default: uniform
          over the nodes in the silencing scheme

      Pm: prior on the model graph (n x n matrix) with entries 0 <=
          Pm[i,j] <= 1 describing the probability of an edge between
          gene i and gene j.

  lambda: regularization parameter to incorporate prior assumptions.

   delta: regularization parameter for automated E-gene subset
          selection (CONTmLLMAP only)

 nEgenes: number of E-genes to select
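
     The shapes expected for 'Phi' and 'Pm' can be sketched in base R
     as follows. The S-gene names and the prior values are hypothetical
     and chosen only for illustration; the sketch does not call any
     function of the package.

```r
# Hypothetical S-gene names, used only for illustration
sgenes <- c("A", "B", "C")
n <- length(sgenes)

# Adjacency matrix with unit main diagonal; an entry of 1 marks an edge
# between the row S-gene and the column S-gene
Phi <- diag(n)
dimnames(Phi) <- list(sgenes, sgenes)
Phi["A", "B"] <- 1
Phi["B", "C"] <- 1

# Edge prior: n x n matrix of probabilities in [0, 1]
# (0.5 everywhere is an assumed uninformative starting point)
Pm <- matrix(0.5, n, n, dimnames = list(sgenes, sgenes))
Pm["A", "B"] <- 0.9   # assumed strong prior belief in the edge A, B

# Basic sanity checks on both matrices
stopifnot(all(diag(Phi) == 1), all(Pm >= 0 & Pm <= 1))
```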

_D_e_t_a_i_l_s:

     The method filterEGenes performs an a-priori filtering of the
     complete microarray. It estimates how often a pattern of
     differential expression across experiments is expected to occur
     purely by chance. Only E-genes whose pattern occurs more often
     than expected by chance are selected.
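
     The idea of pattern-based filtering can be sketched in base R as
     shown here. This is not the implementation used by filterEGenes:
     the simulated p-values, the independence model for the chance
     expectation, the binomial tail test and the exclusion of the
     all-zero pattern are all assumptions made for illustration.

```r
set.seed(1)
n_genes <- 500
n_exp <- 4

# Simulated raw p-values (placeholder data, not from a real array)
P <- matrix(runif(n_genes * n_exp), n_genes, n_exp,
            dimnames = list(paste0("g", 1:n_genes), paste0("kd", 1:n_exp)))
# Plant 30 genes that respond to knock-downs 1 and 2 together
P[1:30, 1:2] <- 1e-6

# Step 1: adjust p-values per experiment and call differential expression
Padj <- apply(P, 2, p.adjust, method = "BH")
calls <- (Padj < 0.05) * 1
dimnames(calls) <- dimnames(P)

# Step 2: encode each gene's row of 0/1 calls as a pattern string
pattern <- apply(calls, 1, paste, collapse = "")
names(pattern) <- rownames(calls)
observed <- table(pattern)

# Step 3: probability of each observed pattern if the experiments
# were independent (assumed chance model)
p_exp <- colMeans(calls)
pat_prob <- sapply(names(observed), function(s) {
  x <- as.integer(strsplit(s, "")[[1]])
  prod(p_exp^x * (1 - p_exp)^(1 - x))
})

# Step 4: binomial tail p-value for seeing at least the observed count,
# adjusted over all observed patterns
p_pat <- pbinom(as.integer(observed) - 1, n_genes, pat_prob,
                lower.tail = FALSE)
p_pat <- p.adjust(p_pat, method = "bonferroni")

# Step 5: keep genes whose pattern is over-represented, ignoring the
# all-zero pattern, which carries no differential expression
zero <- paste(rep(0, n_exp), collapse = "")
sig <- setdiff(names(observed)[p_pat < 0.05], zero)
selected <- names(pattern)[pattern %in% sig]
```

     In this simulation the planted genes share one pattern that is far
     more frequent than the independence model predicts, so they should
     end up in 'selected'.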

     The method getRelevantEGenes looks for the E-genes that have the
     highest likelihood under the given network hypothesis. For the
     scoring type "CONTmLLBayes" these are all E-genes with a positive
     contribution to the total log-likelihood. For type "CONTmLLMAP"
     all E-genes not assigned to the "null" S-gene are returned; this
     involves the prior probability delta / (number of S-genes) for
     leaving out an E-gene. In all other cases ("CONTmLL", "FULLmLL",
     "mLL") the nEgenes E-genes with the highest likelihood under the
     given network hypothesis are returned.
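
     Two of the selection rules above can be illustrated with a small
     base R sketch. The per-gene log-likelihood values are hypothetical,
     and the code only mimics the rules as described here; it does not
     reproduce the package internals.

```r
# Hypothetical per-E-gene log-likelihood contributions (illustrative values)
ll <- c(g1 = 2.3, g2 = -0.7, g3 = 0.4, g4 = -1.9, g5 = 1.1, g6 = 0.0)

# "CONTmLLBayes"-style rule: keep every E-gene with a positive
# contribution to the total log-likelihood
keep_positive <- names(ll)[ll > 0]
# -> "g1" "g3" "g5"

# Rule for "CONTmLL", "FULLmLL" and "mLL": keep the nEgenes E-genes
# with the highest likelihood
nEgenes <- 2
n_keep <- min(nEgenes, length(ll))
keep_top <- names(sort(ll, decreasing = TRUE))[seq_len(n_keep)]
# -> "g1" "g5"
```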

_V_a_l_u_e:

       I: index of selected E-genes

     dat: subset of original data according to I 

patterns: significant patterns

nobserved: number of cases per observed pattern

selected: selected E-genes

     mLL: marginal likelihood of a phenotypic hierarchy

     pos: posterior distribution of effect positions in the hierarchy

  mappos: maximum a posteriori estimate of effect positions

LLperGene: likelihood per selected E-gene

_A_u_t_h_o_r(_s):

     Holger Froehlich

_S_e_e _A_l_s_o:

     'nem', 'score', 'mLL', 'FULLmLL'

_E_x_a_m_p_l_e_s:

        # Drosophila RNAi and microarray data from Boutros et al., 2002
        data("BoutrosRNAi2002")
        D <- BoutrosRNAiDiscrete[,9:16]

        # enumerate all possible models for 4 genes
        models <- enumerate.models(unique(colnames(D)))

        # select the relevant E-genes for one candidate model (number 64);
        # para = c(a, b) is required for type "mLL" with count data
        getRelevantEGenes(models[[64]], D, para=c(.13,.05), type="mLL")

