FDR                  package:OCplus                  R Documentation

_C_o_m_p_u_t_e _F_D_R _f_o_r _g_e_n_e_r_a_l _s_c_e_n_a_r_i_o_s

_D_e_s_c_r_i_p_t_i_o_n:

     'FDR' computes the false discovery rate for comparing gene
     expression between two groups of subjects when the distributions
     of the test statistic under the null and alternative hypotheses
     are both mixtures of t-distributions. 'CDF' and 'CDFmix' calculate
     the cumulative distribution functions of these mixtures.

_U_s_a_g_e:

     FDR(x, n1, n2, pmix, D0, p0, D1, p1, sigma)

     CDF(x, n1, n2, D, p, sigma)
     CDFmix(x, n1, n2, pmix, D0, p0, D1, p1, sigma)

     FDR.paired(x, n, pmix, D0, p0, D1, p1, sigma)

     CDF.paired(x, n, D, p, sigma)
     CDFmix.paired(x, n, pmix, D0, p0, D1, p1, sigma)

_A_r_g_u_m_e_n_t_s:

       x: vector of quantiles (two-sample t-statistics)

n, n1, n2: vector of sample sizes (as subjects per group)

    pmix: the proportion of non-differentially expressed genes

      D0: vector of effect sizes for the null distribution

      p0: vector of mixing proportions for 'D0'; must be the same
          length as 'D0' and sum to one

      D1: vector of effect sizes for the alternative distribution

      p1: vector of mixing proportions for 'D1'; must be the same
          length as 'D1' and sum to one

    D, p: generic vectors of effect sizes and mixing proportions as
          above

   sigma: the standard deviation of the underlying data

_D_e_t_a_i_l_s:

     These functions are designed for a simple experimental setup,
     where we wish to compare gene expression between two groups of
     subjects of size 'n1' and 'n2' for an unspecified number of genes,
     using an equal-variance t-statistic.

     A proportion 'pmix' of the genes is assumed not to be
     differentially expressed. The corresponding t-statistics follow a
     mixture of t-distributions; this is more general than the usual
     central t-distribution, because we may want to include genes with
     biologically small effects under the null hypothesis (Pawitan et
     al., 2005). The remaining proportion 1-'pmix' of the genes is
     assumed to be differentially expressed; their t-statistics also
     follow a mixture of t-distributions.
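
     The two-component setup above can be illustrated with a minimal
     sketch in base R. It is not the package's implementation:
     'fdr_sketch' is a hypothetical name, and the sketch assumes a
     two-sided rejection region |T| > |x|, scalar 'n1', 'n2' and
     'sigma', and a noncentrality parameter of the form
     D / (sigma * sqrt(1/n1 + 1/n2)). Under these assumptions the FDR
     is the share of null genes among all genes beyond the threshold.

```r
# Hypothetical sketch, not the OCplus code: FDR for a two-sided
# rejection region |T| > |x| (an assumption of this example).
fdr_sketch <- function(x, n1, n2, pmix, D0, p0, D1, p1, sigma) {
  df    <- n1 + n2 - 2
  scale <- sigma * sqrt(1 / n1 + 1 / n2)   # assumed scaling of D to ncp

  # P(|T| > |x|) under a mixture of noncentral t-distributions
  tail_prob <- function(D, p) {
    out <- numeric(length(x))
    for (k in seq_along(D)) {
      ncp <- D[k] / scale
      out <- out + p[k] * (pt(-abs(x), df, ncp) +
                           pt(abs(x), df, ncp, lower.tail = FALSE))
    }
    out
  }

  null_tail <- tail_prob(D0, p0)   # genes that are not differentially expressed
  alt_tail  <- tail_prob(D1, p1)   # differentially expressed genes
  pmix * null_tail / (pmix * null_tail + (1 - pmix) * alt_tail)
}

fdr <- fdr_sketch(1:6, n1 = 10, n2 = 10, pmix = 0.90, D0 = 0, p0 = 1,
                  D1 = c(-1, 1), p1 = c(0.5, 0.5), sigma = 1)

# With no selection at all (threshold 0) the sketch returns 'pmix'
fdr_at_zero <- fdr_sketch(0, n1 = 10, n2 = 10, pmix = 0.90, D0 = 0, p0 = 1,
                          D1 = c(-1, 1), p1 = c(0.5, 0.5), sigma = 1)
```

     At a threshold of zero every gene is selected, so the sketch
     returns 'pmix' itself; stricter thresholds lower the value.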

     The mixture proportions of t-distributions under the null and
     alternative hypotheses are specified via 'p0' and 'p1',
     respectively. The individual t-distributions are specified via the
     means 'D0' and 'D1' and the standard deviation 'sigma' of the
     underlying data (instead of the mathematically more obvious, but
     less intuitive, noncentrality parameters). As the underlying data
     are the logarithmized expression values, 'D0' and 'D1' can be
     interpreted as average log-fold changes between conditions,
     measured in units of 'sigma'. See Examples.
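
     For orientation, the relation between an effect size and a
     noncentrality parameter can be sketched as follows. The helper
     name 't_params_sketch' is hypothetical, and the formula is the
     textbook one for an equal-variance two-sample t-statistic; the
     parametrization used inside the package may differ in detail.

```r
# Hypothetical helper, not part of OCplus: map an effect size D
# (log-fold change) and standard deviation sigma to the degrees of
# freedom and noncentrality parameter of an equal-variance two-sample
# t-statistic (standard textbook form, assumed here).
t_params_sketch <- function(n1, n2, D, sigma) {
  list(
    df  = n1 + n2 - 2,
    ncp = D / (sigma * sqrt(1 / n1 + 1 / n2))
  )
}

par <- t_params_sketch(n1 = 10, n2 = 10, D = c(-1, 1), sigma = 1)
par$df    # 18
par$ncp   # -sqrt(5) and sqrt(5), roughly -2.236 and 2.236
```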

     'CDF' computes the cumulative distribution function for a mixture
     of t-distributions based on means 'D' and standard deviation
     'sigma' with mixture proportions 'p'. This function is the
     workhorse for 'CDFmix'.
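
     A mixture CDF of this kind can be written in a few lines of base
     R. The sketch below is an illustration only: 'mix_cdf_sketch' is
     a hypothetical name, 'n1', 'n2' and 'sigma' are taken to be
     scalars, and the noncentrality parameter is assumed to be
     D / (sigma * sqrt(1/n1 + 1/n2)).

```r
# Hypothetical sketch, not the OCplus code: CDF of a mixture of
# noncentral t-distributions, evaluated with pt().
mix_cdf_sketch <- function(x, n1, n2, D, p, sigma) {
  stopifnot(length(D) == length(p), isTRUE(all.equal(sum(p), 1)))
  df  <- n1 + n2 - 2
  ncp <- D / (sigma * sqrt(1 / n1 + 1 / n2))   # assumed ncp form

  # Weighted sum of noncentral t CDFs, one term per mixture component
  cdf <- numeric(length(x))
  for (k in seq_along(D)) {
    cdf <- cdf + p[k] * pt(x, df = df, ncp = ncp[k])
  }
  cdf
}

# A single central component reproduces the ordinary t CDF
central <- mix_cdf_sketch(0, n1 = 10, n2 = 10, D = 0, p = 1, sigma = 1)
central     # 0.5

# A symmetric two-component mixture is symmetric around zero
symmetric <- mix_cdf_sketch(0, n1 = 10, n2 = 10, D = c(-1, 1),
                            p = c(0.5, 0.5), sigma = 1)
symmetric   # 0.5
```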

     Note that the base functions ('FDR', 'CDFmix', 'CDF') assume two
     groups of experimental units; the '.paired' functions provide the
     same functionality for one group of paired observations. 

     The distribution functions call 'pt' for computation;
     correspondingly, the quantiles 'x' and all arguments that define
     degrees of freedom and noncentrality parameters ('n1', 'n2',
     'D0', 'D1', 'sigma') can be vectors, and will be recycled as
     necessary.
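
     The recycling behaviour of 'pt' itself can be seen in a short
     base R example. The variable names are illustrative, and the
     noncentrality formula is the same assumed textbook form, with
     an effect size of 1 and a standard deviation of 1.

```r
# pt() recycles its arguments, so one call evaluates several designs.
x   <- c(2, 2, 2)                  # same quantile for each design
n1  <- c(5, 10, 20)                # subjects per group, three designs
n2  <- c(5, 10, 20)
df  <- n1 + n2 - 2
ncp <- 1 / (1 * sqrt(1 / n1 + 1 / n2))   # D = 1, sigma = 1 (assumed form)

p_lower <- pt(x, df = df, ncp = ncp)     # one probability per design
length(p_lower)   # 3
```

     Larger groups give a larger noncentrality parameter, so the
     probability of a statistic at or below 2 shrinks from the first
     design to the last.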

_V_a_l_u_e:

     The appropriate vector of FDRs or probabilities.

_A_u_t_h_o_r(_s):

     Y. Pawitan and A. Ploner

_R_e_f_e_r_e_n_c_e_s:

     Pawitan Y, Michiels S, Koscielny S, Gusnanto A, Ploner A. (2005)
     False Discovery Rate, Sensitivity and Sample Size for Microarray
     Studies. _Bioinformatics_, 21, 3017-3024.

_S_e_e _A_l_s_o:

     'TOC', 'samplesize'

_E_x_a_m_p_l_e_s:

     # FDR for H0: 'log fold change is zero'
     #     vs. H1: 'log fold change is -1 or 1'
     #             (i.e. two-fold up- or down-regulation)
     FDR(1:6, n1=10, n2=10, pmix=0.90, D0=0, p0=1, 
         D1=c(-1,1), p1=c(0.5, 0.5), sigma=1)

     # Include small log fold changes in the H0
     # Naturally, this increases the FDR
     FDR(1:6, n1=10, n2=10, pmix=0.90,
         D0=c(-0.25, 0, 0.25), p0=c(1/3, 1/3, 1/3),
         D1=c(-1,1), p1=c(0.5, 0.5), sigma=1)

     # Consider an asymmetric alternative
     # 10 percent of the regulated genes are assumed to be four-fold upregulated
     FDR(1:6, n1=10, n2=10, pmix=0.90, D0=0, p0=1, 
         D1=c(-1,1,2), p1=c(0.45, 0.45, 0.1), sigma=1)

