ebam                package:siggenes                R Documentation

_E_m_p_i_r_i_c_a_l _B_a_y_e_s _A_n_a_l_y_s_i_s _o_f _M_i_c_r_o_a_r_r_a_y_s

_D_e_s_c_r_i_p_t_i_o_n:

     Performs an Empirical Bayes Analysis of Microarrays for a
     specified value of the fudge factor a0. Modified versions of the t
     statistics are used.

_U_s_a_g_e:

         ebam(a0.out, a0 = NA, p0 = NA, delta = NA, local.bin = 0.1,
              gene.names = NULL, q.values = TRUE, R.fold = TRUE,
              R.unlog = TRUE, na.rm = FALSE, file.out = NA)

_A_r_g_u_m_e_n_t_s:

  a0.out: the object to which the output of a previous analysis with
          'find.a0' was assigned.

      a0: the fudge factor. If 'NA', the value suggested by 'find.a0'
          will be used.

      p0: prior probability that a gene is not differentially
          expressed. If not specified (i.e. 'NA'), it will be computed
          automatically.

   delta: a gene will be called differentially expressed if its
          posterior probability of being differentially expressed is
          larger than or equal to 'delta'. By default, the same 'delta'
          is used as in 'find.a0'.

local.bin: specifies the interval used in the estimation of the local
          FDR for the expression score z. By default, this interval is
          [z-0.1,z+0.1].

gene.names: a vector containing the names of the genes.

q.values: if 'TRUE' (default), the q-value for each gene will be
          computed.

  R.fold: if 'TRUE' (default), the fold change for each differentially
          expressed gene will be computed.

 R.unlog: if 'TRUE' (default), the anti-log of 'data' will be used in
          the computation of the fold change. This is recommended if
          'data' contains the log2 transformed gene expression levels.

   na.rm: if 'FALSE' (default), the fold change of genes with at least
          one missing value will be set to 'NA'. If 'TRUE', missing
          values will be replaced by the genewise mean.

file.out: if specified, general information (such as the number of
          significant genes and the estimated FDR) and gene-specific
          information (such as the expression scores, the q-values and
          the fold changes of the differentially expressed genes) are
          stored in this file.
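
     As an illustration of how 'R.unlog' and 'na.rm' affect the fold
     change, the following is a minimal sketch in R. It is not the code
     used by 'ebam': the helper 'fold_change', the toy matrix and the
     definition of the fold change as the ratio of the class means
     (class 1 over class 0) are assumptions made for this example.

```r
# Hypothetical helper, not part of siggenes: fold change per gene (row).
fold_change <- function(data, cl, unlog = TRUE, na.rm = FALSE) {
  if (na.rm) {
    # Replace each missing value by the mean of the observed values
    # of the same gene (genewise mean).
    gene_means <- rowMeans(data, na.rm = TRUE)
    missing <- which(is.na(data), arr.ind = TRUE)
    data[missing] <- gene_means[missing[, "row"]]
  }
  # Undo a log2 transformation before averaging.
  if (unlog) data <- 2^data
  # Assumed definition: mean of class 1 divided by mean of class 0.
  rowMeans(data[, cl == 1, drop = FALSE]) /
    rowMeans(data[, cl == 0, drop = FALSE])
}

# Toy log2 data: two genes, four samples, one missing value.
x <- rbind(g1 = c(1, 1, 3, 3), g2 = c(2, NA, 2, 2))
cl <- c(0, 0, 1, 1)

fc_keep_na <- fold_change(x, cl)                # fold change of g2 is NA
fc_imputed <- fold_change(x, cl, na.rm = TRUE)  # g2 is imputed first
```

     With 'na.rm = FALSE' the gene with a missing value gets the fold
     change 'NA'; with 'na.rm = TRUE' it is imputed by its genewise
     mean before the ratio is formed.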

_V_a_l_u_e:

     A plot of the expression scores against their posterior
     probabilities of being differentially expressed and, optionally,
     a file containing general information (such as the estimated FDR
     and the number of differentially expressed genes) and
     gene-specific information about the differentially expressed
     genes (such as their names, expression scores, q-values and fold
     changes).

     FDR: vector containing the estimated p0, the number of significant
          genes, the number of falsely called genes and the estimated
          FDR.

ebam.out: table containing gene-specific information about the
          differentially expressed genes.

row.sig.genes: vector consisting of the row numbers that belong to the
          differentially expressed genes.

     ...: further objects containing additional information
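
     The component 'row.sig.genes' can be pictured as the result of
     applying the 'delta' rule described under Arguments. The sketch
     below uses made-up posterior probabilities and a made-up data
     matrix; it does not call 'ebam', and all variable names are
     hypothetical.

```r
# Made-up posterior probabilities for five genes (not output of ebam).
posterior <- c(0.97, 0.42, 0.90, 0.88, 0.99)
delta <- 0.9

# A gene is called differentially expressed if posterior >= delta.
row_sig_genes <- which(posterior >= delta)

# Toy 5 x 4 expression matrix; keep only the rows of the called genes.
expr <- matrix(seq_len(20), nrow = 5)
sig_expr <- expr[row_sig_genes, , drop = FALSE]
```

     The same row indexing can be applied to the original data matrix
     when the row numbers of the significant genes are needed in
     further analyses.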

_N_o_t_e:

     The number of false positives is computed as p0 times the number
     of falsely called genes.
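
     A small worked example of this computation, with made-up numbers:
     the variable names are hypothetical, and dividing by the number of
     significant genes to obtain the FDR is an assumption based on the
     usual definition of the FDR, not something stated above.

```r
# Made-up inputs for illustration only.
p0 <- 0.6               # estimated prior probability p0
n_falsely_called <- 10  # number of falsely called genes
n_significant <- 40     # number of genes called significant

# Number of false positives as described in the Note.
false_positives <- p0 * n_falsely_called

# Assumed: estimated FDR as false positives over significant genes.
fdr <- false_positives / n_significant
```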

_A_u_t_h_o_r(_s):

     Holger Schwender, holger.schw@gmx.de

_R_e_f_e_r_e_n_c_e_s:

     Efron, B., Tibshirani, R., Storey, J.D., and Tusher, V. (2001).
     Empirical Bayes Analysis of a Microarray Experiment, _JASA_, 96,
     1151-1160.

     Storey, J.D., and Tibshirani, R. (2003). Statistical significance
     for genome-wide experiments, _Technical Report_, Department of
     Statistics, Stanford University.

     Schwender, H. (2003). Assessing the false discovery rate in a
     statistical analysis of gene expression data, Chapter 7, _Diploma
     thesis_, Department of Statistics, University of Dortmund, <URL:
     http://de.geocities.com/holgerschw/thesis.pdf>.

_S_e_e _A_l_s_o:

     'find.a0'   'ebam.wilc'

_E_x_a_m_p_l_e_s:

     ## Not run: 
         library(multtest)
         # Load the data of Golub et al. (1999). data(golub) contains
         # a 3051x38 gene expression matrix called golub, a vector of
         # length 38 called golub.cl that consists of the 38 class
         # labels, and a matrix called golub.gnames whose third column
         # contains the gene names.
         data(golub)
         
         # The optimal value for the fudge factor a0 is computed, where
         # the possible values of a0 are 0 and the 0, 0.05 and 0.1
         # quantiles of the standard deviations of the genes. Setting
         # rand=123 makes the results reproducible.
         
         find.out<-find.a0(golub,golub.cl,alpha=c(0,0.05,0.1),rand=123)
         
         # Now that we have found the optimal value of a0, an empirical
         # Bayes analysis can be performed.
         
         ebam.out<-ebam(find.out,gene.names=golub.gnames[,3])
         
         # For further analyses the row numbers of the differentially expressed
         # genes are obtained by
         
         ebam.out$row.sig.genes
     ## End(Not run)

