fdr2d                 package:OCplus                 R Documentation

_C_o_m_p_u_t_e _t_w_o-_d_i_m_e_n_s_i_o_n_a_l _l_o_c_a_l _f_a_l_s_e _d_i_s_c_o_v_e_r_y _r_a_t_e

_D_e_s_c_r_i_p_t_i_o_n:

     This function calculates the local false discovery rate for a
     two-sample problem using a bivariate test statistic, consisting of
     the classical t-statistic and the corresponding logarithmized
     standard error.

_U_s_a_g_e:

     fdr2d(xdat, grp, test, p0, nperm = 100, nr = 15, seed = NULL, null = NULL,
               constrain = TRUE, smooth = 0.2, verb = TRUE, ...)

_A_r_g_u_m_e_n_t_s:

    xdat: the matrix of expression values, with genes as rows and
          samples as columns

     grp: a grouping variable giving the class membership of each
          sample, i.e. each column in 'xdat'

    test: a function that takes 'xdat' and 'grp' as the first two
          arguments and returns the bivariate test statistics as a
          two-column matrix; by default, two-sample t-statistics and
          logarithmized standard errors are calculated.

      p0: if supplied, an estimate for the proportion of
          non-differentially expressed genes; if not supplied, the
          routine will estimate it, see Details.

   nperm: number of permutations for establishing the null distribution
          of the t-statistic

      nr: the number of equidistant breaks for the range of each test
          statistic; fdr values are calculated on the resulting (nr-1)
          x (nr-1) grid of cells.

    seed: if specified, the random seed from which the permutations are
          started

    null: optional argument for passing in a pre-calculated null
          distribution, see Examples.

constrain: logical value indicating whether the estimated fdr should be
          constrained to be monotonically decreasing with the absolute
          size of the t-statistic (more generally, the first test
          statistic).

  smooth: a numerical value between 0.01 and 0.99, indicating what
          proportion of the available degrees of freedom is used for
          smoothing the fdr estimate; larger values indicate more
          smoothing.

    verb: logical value indicating whether to provide extra information.

     ...: extra arguments to function 'test'.

_D_e_t_a_i_l_s:

     This routine computes a bivariate extension of the classical local
     false discovery rate as available through function 'fdr1d'.
     Consequently, many arguments have identical or similar meaning.
     Specifically for 'fdr2d', 'nr' specifies the number of equidistant
     breaks defining a two-dimensional grid of cells on which the
     bivariate test statistics are counted; argument 'constrain' can be
     set to ensure that the estimated fdr is decreasing with increasing
     absolute value of the t-statistic; and argument 'smooth' specifies
     the degree of smoothing when estimating the fdr.
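
     The grid construction described above can be sketched in base R.
     This is only an illustration of counting on an (nr-1) x (nr-1)
     grid, not the code used inside 'fdr2d'; the simulated vectors
     'tstat' and 'logse' and the helper names 'xbreaks', 'ybreaks' and
     'counts' are assumptions made for the example.

```r
# Hypothetical illustration in base R; this is not the OCplus implementation.
set.seed(1)
tstat <- rnorm(1000)             # stand-in for the first test statistic
logse <- rnorm(1000, sd = 0.3)   # stand-in for the second test statistic
nr <- 15                         # number of breaks per statistic

# nr equidistant breaks spanning the observed range of each statistic
xbreaks <- seq(min(tstat), max(tstat), length.out = nr)
ybreaks <- seq(min(logse), max(logse), length.out = nr)

# Count the genes falling into each cell of the two-dimensional grid;
# include.lowest = TRUE keeps the minimum value inside the first cell
counts <- table(cut(tstat, xbreaks, include.lowest = TRUE),
                cut(logse, ybreaks, include.lowest = TRUE))

dim(counts)   # number of cells along each axis, i.e. nr - 1
sum(counts)   # every gene is counted exactly once
```

     With few genes, many cells of such a grid are empty or sparsely
     populated, which is one reason why the estimate is smoothed.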

     Note that while 'fdr2d' might be used for any suitable pair of
     test statistics, it has only been tested for the default pair, and
     the smoothing procedure specifically is optimized for this
     situation.
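
     As a sketch of the interface expected of 'test', the following
     base R function returns a two-column matrix of t-statistics and
     logarithmized standard errors. The name 'my_test' is hypothetical,
     and the use of a pooled-variance standard error is an assumption;
     the default function shipped with the package may differ in
     detail.

```r
# Hypothetical helper; 'my_test' is not part of OCplus.
my_test <- function(xdat, grp, ...) {
  grp <- as.factor(grp)
  stopifnot(nlevels(grp) == 2, length(grp) == ncol(xdat))
  in1 <- grp == levels(grp)[1]
  x1 <- xdat[, in1, drop = FALSE]
  x2 <- xdat[, !in1, drop = FALSE]
  n1 <- ncol(x1)
  n2 <- ncol(x2)

  # Row-wise (per-gene) means and unbiased variances for each group
  m1 <- rowMeans(x1)
  m2 <- rowMeans(x2)
  v1 <- rowSums((x1 - m1)^2) / (n1 - 1)
  v2 <- rowSums((x2 - m2)^2) / (n2 - 1)

  # Pooled-variance standard error of the difference in means (assumption)
  sp2 <- ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
  se <- sqrt(sp2 * (1 / n1 + 1 / n2))

  # One row per gene: t-statistic and logarithmized standard error
  cbind(tstat = (m1 - m2) / se, logse = log(se))
}

set.seed(2000)
xdat <- matrix(rnorm(50000), nrow = 1000)
grp <- rep(c("Sample A", "Sample B"), c(25, 25))
stats <- my_test(xdat, grp)
dim(stats)
```

     A function of this shape could be passed as 'fdr2d(xdat, grp, test
     = my_test)'; as noted above, only the default pair of statistics
     has been tested.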

     Note also that the estimation of the proportion 'p0' directly from
     the data may be quite unstable and dependent on the degree of
     smoothing; overly heavy smoothing may even lead to estimates
     greater than 1. It is usually more stable to use an estimate of
     'p0' provided by 'fdr1d'.

     Note that 'fdr1d' can also be used to check the degree of
     smoothing, see 'average.fdr'.

_V_a_l_u_e:

     A data frame with one row per gene and three columns: 'tstat',
     the test statistic, 'logse', the corresponding logarithmized
     standard error, and 'fdr.local', the local false discovery rate.
     This data frame has the additional class attributes
     'fdr2d.result' and 'fdr.result', see Examples. These S3 classes
     are used to dispatch the plot and summary methods.

     Additional information is provided by a 'param' attribute, which
     is a list with the following entries: 

      p0: the proportion of non-differentially expressed genes used
          when calculating the fdr.

  p0.est: a logical value indicating whether 'p0' was estimated from
          the data or supplied by the user.

     fdr: the matrix of smoothed fdr values calculated on the original
          grid.

 xbreaks: vector of breaks for the first test statistic.

 ybreaks: vector of breaks for the second test statistic.

_A_u_t_h_o_r(_s):

     A Ploner and Y Pawitan

_R_e_f_e_r_e_n_c_e_s:

     Ploner A, Calza S, Gusnanto A, Pawitan Y (2005) Multidimensional
     local false discovery rate for microarray studies. _Submitted
     Manuscript_.

_S_e_e _A_l_s_o:

     'plot.fdr2d.result', 'summary.fdr.result', 'OCshow', 'fdr1d',
     'average.fdr'

_E_x_a_m_p_l_e_s:

     # We simulate a small example with 5 percent regulated genes and
     # a rather large effect size
     set.seed(2000)
     xdat = matrix(rnorm(50000), nrow=1000)
     xdat[1:25, 1:25] = xdat[1:25, 1:25] - 1
     xdat[26:50, 1:25] = xdat[26:50, 1:25] + 1
     grp = rep(c("Sample A","Sample B"), c(25,25))

     # A default run
     res2d = fdr2d(xdat, grp)
     res2d[1:20,]

     # Looking at the results
     summary(res2d)
     plot(res2d)
     res2d[res2d$fdr.local < 0.05, ]

     # Extra information
     class(res2d)
     attr(res2d,"param")

