ebayes                 package:limma                 R Documentation

_E_m_p_i_r_i_c_a_l _B_a_y_e_s _S_t_a_t_i_s_t_i_c_s _f_o_r _D_i_f_f_e_r_e_n_t_i_a_l _E_x_p_r_e_s_s_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Given a series of related parameter estimates and standard errors,
     compute moderated t-statistics, moderated F-statistic, and
     log-odds of differential expression by empirical Bayes shrinkage
     of the standard errors towards a common value.

_U_s_a_g_e:

     ebayes(fit,proportion=0.01,stdev.coef.lim=c(0.1,4))
     eBayes(fit,proportion=0.01,stdev.coef.lim=c(0.1,4))

_A_r_g_u_m_e_n_t_s:

     fit: an 'MArrayLM' fitted model object produced by 'lmFit' or
          'contrasts.fit', or an unclassed list produced by
          'lm.series', 'gls.series' or 'mrlm' containing components
          'coefficients', 'stdev.unscaled', 'sigma' and 'df.residual'

proportion: numeric value between 0 and 1, assumed proportion of genes
          which are differentially expressed

stdev.coef.lim: numeric vector of length 2, assumed lower and upper
          limits for the standard deviation of log2 fold changes for
          differentially expressed genes

_D_e_t_a_i_l_s:

     These functions are used to rank genes in order of evidence for
     differential expression. They use an empirical Bayes method to
     shrink the probe-wise sample variances towards a common value and
     to augment the degrees of freedom for the individual variances
     (Smyth, 2004). The functions accept as input argument 'fit' a
     fitted model object from the functions 'lmFit', 'lm.series',
     'mrlm' or 'gls.series'. The fitted model object may have been
     processed by 'contrasts.fit' before being passed to 'eBayes' to
     convert the coefficients of the design matrix into an arbitrary
     number of contrasts. The columns of 'fit' define the set of
     contrasts which are to be tested equal to zero.

     The empirical Bayes moderated t-statistics test each individual
     contrast equal to zero. For each probe (row), the moderated
     F-statistic tests whether all the contrasts are zero. The
     F-statistic is an overall test computed from the set of
     t-statistics for that probe. This is exactly analogous to the
     relationship between t-tests and F-statistics in conventional
     anova, except that the residual mean squares and residual degrees
     of freedom have been moderated between probes.
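
     One way to see how an overall F-statistic can arise from a row of
     t-statistics is the quadratic form sketched below in base R. This
     is an illustration under stated assumptions, not the code of
     'classifyTestsF': it assumes the correlation matrix of the
     contrast estimates is known and of full rank, 'cor.contrasts' and
     'n.contrasts' are hypothetical names, and the t-statistics and
     degrees of freedom are made-up values.

```r
# Hypothetical moderated t-statistics: 2 probes (rows) x 2 contrasts (columns)
tstat <- rbind(probe1 = c(2.0, -1.0),
               probe2 = c(0.5,  3.0))

# Assumed correlation matrix of the contrast estimates
# (identity = uncorrelated contrasts; illustrative only)
cor.contrasts <- diag(2)
n.contrasts <- ncol(tstat)

# Quadratic form t' R^{-1} t / r, computed for each probe (row)
Fstat <- rowSums((tstat %*% solve(cor.contrasts)) * tstat) / n.contrasts

# p-values on (n.contrasts, df.total) degrees of freedom;
# df.total is an assumed value standing in for df.prior + df.residual
df.total <- 10
F.p.value <- pf(Fstat, df1 = n.contrasts, df2 = df.total, lower.tail = FALSE)
```

     With uncorrelated contrasts, as assumed here, the statistic
     reduces to the mean of the squared t-statistics for each probe.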

     The estimates 's2.prior' and 'df.prior' are computed by
     'fitFDist'. 's2.post' is the weighted average of 's2.prior' and
     'sigma^2' with weights proportional to 'df.prior' and
     'df.residual' respectively. The 'lods' is sometimes known as the
     B-statistic. The F-statistics 'F' are computed by 'classifyTestsF'
     with 'fstat.only=TRUE'.
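
     The weighted average described above can be sketched in base R.
     This is a minimal illustration rather than limma's own code: the
     input values are made up, the prior values are fixed by hand
     instead of being estimated by 'fitFDist', and the variable names
     simply mirror the component names used on this page.

```r
# Hypothetical inputs for three probes (illustrative values only)
coefficients   <- c(1.0, 0.2, -0.6)   # estimated log2 fold changes
stdev.unscaled <- rep(sqrt(1/6), 3)   # unscaled standard errors
sigma          <- c(0.2, 0.5, 0.3)    # probe-wise residual standard deviations
df.residual    <- rep(5, 3)           # residual degrees of freedom

# Assumed prior values; in practice these are estimated from all probes
s2.prior <- 0.09
df.prior <- 5

# Posterior variance: weighted average of s2.prior and sigma^2,
# with weights proportional to df.prior and df.residual
df.total <- df.prior + df.residual
s2.post  <- (df.prior * s2.prior + df.residual * sigma^2) / df.total

# Ordinary and moderated t-statistics differ only in the variance used
t.ordinary  <- coefficients / (stdev.unscaled * sigma)
t.moderated <- coefficients / (stdev.unscaled * sqrt(s2.post))

# Two-sided p-values for the moderated t on the augmented degrees of freedom
p.value <- 2 * pt(-abs(t.moderated), df = df.total)
```

     Probes whose 'sigma^2' is below 's2.prior' have their variance
     pulled upwards, so their moderated t-statistic is smaller in
     absolute value than the ordinary one, and vice versa.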

     'eBayes' doesn't compute ordinary (unmoderated) t-statistics by
     default, but these can be easily extracted from the linear model
     output; see the example below.

     'ebayes' is the earlier and leaner function. 'eBayes' is intended
     to have a more object-orientated flavor as it produces objects
     containing all the necessary components for downstream analysis.

_V_a_l_u_e:

     'ebayes' produces an ordinary list with the following components.
     'eBayes' adds the following components to 'fit' to produce an
     augmented object, usually of class 'MArrayLM'. 

       t: numeric vector or matrix of moderated t-statistics

 p.value: numeric vector of p-values corresponding to the t-statistics

s2.prior: estimated prior value for 'sigma^2'

df.prior: degrees of freedom associated with 's2.prior'

 s2.post: vector giving the posterior values for 'sigma^2'

    lods: numeric vector or matrix giving the log-odds of differential
          expression

var.prior: estimated prior value for the variance of the
          log2-fold-change for differentially expressed genes

       F: numeric vector of moderated F-statistics for testing all
          contrasts defined by the columns of 'fit' simultaneously
          equal to zero

F.p.value: numeric vector giving p-values corresponding to 'F'

_A_u_t_h_o_r(_s):

     Gordon Smyth

_R_e_f_e_r_e_n_c_e_s:

     Lönnstedt, I. and Speed, T. P. (2002). Replicated microarray data.
     _Statistica Sinica_ *12*, 31-46.

     Smyth, G. K. (2004). Linear models and empirical Bayes methods for
     assessing differential expression in microarray experiments.
     _Statistical Applications in Genetics and Molecular Biology_, *3*,
     No. 1, Article 3. <URL:
     http://www.bepress.com/sagmb/vol3/iss1/art3>

_S_e_e _A_l_s_o:

     'squeezeVar', 'fitFDist', 'tmixture.matrix'.

     An overview of linear model functions in limma is given by
     06.LinearModels.

_E_x_a_m_p_l_e_s:

     #  See also lmFit examples
     library(limma)

     #  Simulate gene expression data,
     #  6 microarrays and 100 genes with one gene differentially expressed
     set.seed(2004); invisible(runif(100))
     M <- matrix(rnorm(100*6,sd=0.3),100,6)
     M[1,] <- M[1,] + 1
     fit <- lmFit(M)

     #  Ordinary t-statistic
     par(mfrow=c(1,2))
     ordinary.t <- fit$coefficients / fit$stdev.unscaled / fit$sigma
     qqt(ordinary.t,df=fit$df.residual,main="Ordinary t")
     abline(0,1)

     #  Moderated t-statistic
     eb <- eBayes(fit)
     qqt(eb$t,df=eb$df.prior+eb$df.residual,main="Moderated t")
     abline(0,1)
     #  Points off the line may be differentially expressed
     par(mfrow=c(1,1))

