nudge2                 package:nudge                 R Documentation

_F_u_n_c_t_i_o_n _f_o_r _n_o_r_m_a_l_i_z_i_n_g _d_a_t_a, _f_i_t_t_i_n_g _a _n_o_r_m_a_l-_u_n_i_f_o_r_m _m_i_x_t_u_r_e _a_n_d _e_s_t_i_m_a_t_i_n_g _p_r_o_b_a_b_i_l_i_t_i_e_s _o_f _d_i_f_f_e_r_e_n_t_i_a_l _e_x_p_r_e_s_s_i_o_n _i_n _t_h_e _c_a_s_e _w_h_e_r_e _t_h_e _t_w_o _s_a_m_p_l_e_s _a_r_e _b_e_i_n_g _c_o_m_p_a_r_e_d _i_n_d_i_r_e_c_t_l_y _t_h_r_o_u_g_h _a _c_o_m_m_o_n _r_e_f_e_r_e_n_c_e _s_a_m_p_l_e

_D_e_s_c_r_i_p_t_i_o_n:

     After a mean and variance normalization, a two-component mixture
     model is fitted to the data. The normal component represents the
     genes that are not differentially expressed and the uniform
     component represents the genes that are differentially expressed.
     Posterior probabilities for differential expression are computed
     from the fitted model.
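
     As a rough illustration of the last step, the sketch below computes
     posterior probabilities of differential expression from a
     normal-uniform mixture with given parameters. The helper
     'posterior_diff' is hypothetical and not part of the package; it
     assumes the second argument of the normal component is a variance and
     that 'mixprob' is the prior probability of not being differentially
     expressed, as described under Value.

```r
# Hypothetical helper (not part of the nudge package): posterior probability
# that an observation belongs to the uniform (differentially expressed)
# component of a normal-uniform mixture.
posterior_diff <- function(x, mu, sigma2, mixprob, a, b) {
  # Weighted density of the normal (not differentially expressed) component.
  f_null <- mixprob * dnorm(x, mean = mu, sd = sqrt(sigma2))
  # Weighted density of the uniform (differentially expressed) component.
  f_diff <- (1 - mixprob) * dunif(x, min = a, max = b)
  # Bayes' rule.
  f_diff / (f_null + f_diff)
}

# Made-up normalized log ratio differences and parameter values.
x <- c(-4, -0.5, 0, 0.3, 5)
p <- posterior_diff(x, mu = 0, sigma2 = 1, mixprob = 0.9, a = -6, b = 6)
print(round(p, 3))
```

     Values far from the normal mean receive probabilities close to 1,
     while values near the mean receive probabilities close to 0.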

_U_s_a_g_e:

     nudge2(control.logratio, txt.logratio, control.logintensity, txt.logintensity,
            span1 = 0.2, quant = 0.99, z = NULL, tol = 0.00001, iterlim = 500)

_A_r_g_u_m_e_n_t_s:

control.logratio: A multiple-column matrix of replicates of log (base
          2) ratios of gene expressions for the control versus
          reference slides.

txt.logratio: A multiple-column matrix of replicates of log (base 2)
          ratios of gene expressions for the treatment versus reference
          slides.

control.logintensity: A multiple-column matrix of replicates of log
          (base 2) total intensities (defined as the product) of gene
          expressions for the control versus reference slides.

txt.logintensity: A multiple-column matrix of replicates of log (base
          2) total intensities (defined as the product) of gene
          expressions for the treatment versus reference slides.

   span1: Proportion of data used to fit the loess regression of the
          (average-across-replicates) log ratio differences on the
          (average-across-replicates) log intensities for the mean
          normalization.

   quant: Quantile to be used from the distribution of standard
          deviations of log ratio differences across replicates for all
          genes whose standard deviation was smaller than their
          absolute (mean normalized) average-across-replicates log
          ratio difference.

       z: An optional 2-column matrix with each row giving a starting
          estimate for the probability of the gene (in the
          corresponding row of the log ratio matrix/vector) not being
          differentially expressed and a starting estimate for the
          probability of the gene being differentially expressed. Each
          row should add up to 1.

     tol: A scalar tolerance for relative convergence of the
          loglikelihood.

 iterlim: The maximum number of iterations for which the EM algorithm
          is run.
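
     A starting matrix for 'z' could be built along the following lines.
     This is only a sketch: the simulated vector 'lr' stands in for the
     average log ratio differences, and the cut-off of 2 and the starting
     values 0.9 and 0.1 are arbitrary assumptions, not package defaults.

```r
set.seed(1)

# Stand-in for the average-across-replicates log ratio differences.
n_genes <- 100
lr <- rnorm(n_genes)

# Assumed starting guess: genes with a large absolute value are probably
# differentially expressed.
p_diff0 <- ifelse(abs(lr) > 2, 0.9, 0.1)

# Column 1: probability of NOT being differentially expressed.
# Column 2: probability of being differentially expressed.
z0 <- cbind(1 - p_diff0, p_diff0)

# Each row must add up to 1.
stopifnot(all(abs(rowSums(z0) - 1) < 1e-12))
```

     The resulting matrix has one row per gene and could then be passed as
     'z = z0'.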

_V_a_l_u_e:

     A list including the following components:

  pdiff : A vector with the estimated posterior probabilities of being
          in the group of differentially expressed genes.

 lRnorm : A vector with the normalized (average-across-replicates) log
          ratio differences.

     mu : The estimated mean of the group of genes that are not
          differentially expressed.

  sigma : The estimated variance of the group of genes that are not
          differentially expressed.

 mixprob: The prior/mixing probability of a gene being in the group of
          genes that are not differentially expressed.

      a : The minimum value of the normalized data.

      b : The maximum value of the normalized data.

loglike : The log likelihood for the fitted mixture model.

   iter : The number of iterations run by the EM algorithm until either
          convergence or the iteration limit was reached.
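
     The returned components might be used as sketched below. The list
     'result' here is a mock object with made-up numbers that only mimics
     the documented component names; the gene names and the cut-off of 0.5
     are assumptions for illustration.

```r
# Mock list mimicking two of the documented components; the values are
# invented and do not come from a real fit.
result <- list(
  pdiff  = c(g1 = 0.02, g2 = 0.97, g3 = 0.40, g4 = 0.81),
  lRnorm = c(g1 = 0.1,  g2 = 3.2,  g3 = -1.1, g4 = -2.7)
)

# Assumed threshold on the posterior probability (not prescribed by the
# package).
cutoff <- 0.5

# Genes flagged as differentially expressed.
sig <- names(result$pdiff)[result$pdiff > cutoff]

# Genes ranked from most to least likely to be differentially expressed.
ranked <- sort(result$pdiff, decreasing = TRUE)

print(sig)
print(ranked)
```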

_A_u_t_h_o_r(_s):

     N. Dean and A. E. Raftery

_R_e_f_e_r_e_n_c_e_s:

     N. Dean and A. E. Raftery (2005). Normal uniform mixture
     differential gene expression detection for cDNA microarrays.  BMC
     Bioinformatics. 6, 173-186. 

     <URL: http://www.biomedcentral.com/1471-2105/6/173>

     S. Dudoit, Y. H. Yang, M. Callow and T. Speed (2002). Statistical
     methods for identifying differentially expressed genes in
     replicated cDNA microarray experiments. Stat. Sin. 12, 111-139.

_S_e_e _A_l_s_o:

     'nudge1', 'norm2c', 'norm2d', 'norm1a', 'norm1b', 'norm1c', 'norm1d'

_E_x_a_m_p_l_e_s:

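     # Read the ApoA1 data (requires network access); the first column is
     # used for the row names and then dropped.
     apo <- read.csv("http://www.stat.berkeley.edu/users/terry/zarray/Data/ApoA1/rg_a1ko_morph.txt",
                     header = TRUE)
     rownames(apo) <- apo[, 1]
     apo <- apo[, -1]
     # Add 1 so that zero intensities do not give -Inf on the log scale.
     apo <- apo + 1

     # Columns 1-16 are used for the control slides and columns 17-32 for
     # the treatment slides; each slide occupies a pair of adjacent columns.
     lRctl <- log(apo[, seq(2, 16, 2)], 2) - log(apo[, seq(1, 15, 2)], 2)
     lRtxt <- log(apo[, seq(18, 32, 2)], 2) - log(apo[, seq(17, 31, 2)], 2)
     lIctl <- log(apo[, seq(2, 16, 2)], 2) + log(apo[, seq(1, 15, 2)], 2)
     lItxt <- log(apo[, seq(18, 32, 2)], 2) + log(apo[, seq(17, 31, 2)], 2)

     result <- nudge2(lRctl, lRtxt, lIctl, lItxt)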

