GlobalAncova          package:GlobalAncova          R Documentation

_G_l_o_b_a_l _t_e_s_t _f_o_r _d_i_f_f_e_r_e_n_t_i_a_l _g_e_n_e _e_x_p_r_e_s_s_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Computation of an F-test for the association between expression
     values and clinical entities. In many cases a two-way layout with
     gene and a dichotomous group as factors will be considered.
     However, adjustment for other covariates and the analysis of
     arbitrary clinical variables, interactions, gene co-expression,
     time series data and so on are also possible. The test is carried
     out by comparing the corresponding full and reduced linear models
     via the extra sum of squares principle. Theoretical p-values,
     permutation p-values and/or asymptotic p-values are reported.
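
     The extra sum of squares principle mentioned above can be
     illustrated with a minimal base R sketch. It is not the package's
     implementation: the toy data and the helper names 'rss_all_genes'
     and 'global_F' are hypothetical, and full-rank design matrices
     are assumed.

```r
# Hypothetical toy data: 5 genes (rows) measured on 12 samples (columns).
set.seed(1)
xx <- matrix(rnorm(5 * 12), nrow = 5,
             dimnames = list(paste0("gene", 1:5), paste0("s", 1:12)))
model.dat <- data.frame(group = factor(rep(c("a", "b"), each = 6)),
                        age   = rnorm(12, mean = 50, sd = 10))

# Residual sum of squares, summed over all genes, for one design matrix.
rss_all_genes <- function(xx, design) {
  fit <- lm.fit(x = design, y = t(xx))  # one least-squares fit per gene
  sum(fit$residuals^2)
}

# Global F-statistic comparing a full and a reduced linear model.
global_F <- function(xx, formula.full, formula.red, model.dat) {
  D.full <- model.matrix(formula.full, data = model.dat)
  D.red  <- model.matrix(formula.red,  data = model.dat)
  # Per-gene degrees of freedom (assumes full-rank designs); multiplying
  # both by the number of genes would cancel in the ratio below.
  df.num <- ncol(D.full) - ncol(D.red)
  df.den <- ncol(xx) - ncol(D.full)
  rss.full <- rss_all_genes(xx, D.full)
  rss.red  <- rss_all_genes(xx, D.red)
  ((rss.red - rss.full) / df.num) / (rss.full / df.den)
}

F.obs <- global_F(xx, ~ group + age, ~ age, model.dat)
```

     For a single gene this quantity reduces to the usual F-statistic
     for comparing two nested linear models.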

     There are three possible ways of using 'GlobalAncova'. The general
     way is to define formulas for the full and the reduced model,
     where the formula terms correspond to variables in 'model.dat'.
     An alternative is to specify the full model and the names of the
     model terms that shall be tested for differential expression. To
     remain compatible with the function call in the first version of
     the package, there is also a method where simply a group variable
     (and possibly covariate information) has to be given. This is
     probably the easiest usage in cases where no 'special' effects
     such as interactions are of interest.

_U_s_a_g_e:

     ## S4 method for signature 'matrix, formula, formula, ANY,
     ##   missing, missing, missing':
     GlobalAncova(xx, formula.full, formula.red, model.dat,
               test.genes, method = c("permutation","approx","both","Fstat"),
               perm = 10000, max.group.size = 2500, eps = 1e-16, acc = 50)

     ## S4 method for signature 'matrix, formula, missing, ANY,
     ##   missing, missing, character':
     GlobalAncova(xx, formula.full, test.terms, model.dat,
               test.genes, method = c("permutation","approx","both","Fstat"),
               perm = 10000, max.group.size = 2500, eps = 1e-16, acc = 50)

     ## S4 method for signature 'matrix, missing, missing,
     ##   missing, ANY, ANY, missing':
     GlobalAncova(xx, group, covars = NULL,
               test.genes, method = c("permutation","approx","both","Fstat"),
               perm = 10000, max.group.size = 2500, eps = 1e-16, acc = 50)

_A_r_g_u_m_e_n_t_s:

      xx: Matrix of gene expression data, where columns correspond to
          samples and rows to genes. The data should be properly
          normalized beforehand (and log- or otherwise transformed).
          Missing values are not allowed. Gene and sample names can be
          included as the row and column names of 'xx'.

formula.full: Model formula for the full model.

formula.red: Model formula for the reduced model (that does not contain
          the terms of interest).

model.dat: Data frame that contains all the variable information for
          each sample.

   group: Vector with the group membership information.

  covars: Vector or matrix which contains the covariate information for
          each sample.

test.terms: Character vector that contains names of the terms of
          interest.

test.genes: Vector of gene names or a list where each element is a
          vector of gene names.

  method: Method for calculating p-values: permutation-based
          ('"permutation"') or by means of an approximation for a
          mixture of chi-square distributions ('"approx"'). Both
          p-values are provided when specifying 'method = "both"'. With
          option '"Fstat"' only the global F-statistics are returned,
          without p-values or further information.

    perm: Number of permutations to be used for the permutation
          approach. The default is 10,000.

max.group.size: Maximum size of a gene set for which the asymptotic
          p-value is calculated. For larger gene sets the permutation
          approach is used instead.

     eps: Resolution of the asymptotic p-value.

     acc: Accuracy parameter needed for the approximation. Higher
          values indicate higher accuracy.
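
     The permutation approach selected by 'method = "permutation"' and
     controlled by 'perm' can be sketched in base R as follows. This
     is only an illustration under stated assumptions, not the
     package's code: sample labels are permuted by shuffling the rows
     of both design matrices, the p-value is taken as the proportion
     of permuted F-statistics at least as large as the observed one,
     and the helpers 'global_F' and 'perm_pvalue' as well as the toy
     data are hypothetical.

```r
# Hypothetical toy data: 4 genes (rows), 10 samples (columns), two groups.
set.seed(42)
xx <- matrix(rnorm(4 * 10), nrow = 4)
model.dat <- data.frame(group = factor(rep(c("a", "b"), each = 5)))

# Global F-statistic from full and reduced design matrices
# (assumes both matrices have full column rank).
global_F <- function(xx, D.full, D.red) {
  rss <- function(D) sum(lm.fit(x = D, y = t(xx))$residuals^2)
  df.num <- ncol(D.full) - ncol(D.red)
  df.den <- ncol(xx) - ncol(D.full)
  ((rss(D.red) - rss(D.full)) / df.num) / (rss(D.full) / df.den)
}

# Permutation p-value: shuffle the sample labels 'perm' times and compare
# each permuted F-statistic with the observed one.
perm_pvalue <- function(xx, formula.full, formula.red, model.dat, perm = 200) {
  D.full <- model.matrix(formula.full, data = model.dat)
  D.red  <- model.matrix(formula.red,  data = model.dat)
  F.obs  <- global_F(xx, D.full, D.red)
  F.perm <- replicate(perm, {
    idx <- sample(nrow(model.dat))  # same permutation for both designs
    global_F(xx, D.full[idx, , drop = FALSE], D.red[idx, , drop = FALSE])
  })
  list(F = F.obs, p.perm = mean(F.perm >= F.obs))
}

res <- perm_pvalue(xx, ~ group, ~ 1, model.dat, perm = 200)
```

     With this definition the smallest nonzero p-value is 1/perm, so a
     larger 'perm' gives finer resolution at the cost of run time.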

_V_a_l_u_e:

     If 'test.genes = NULL', a list with components:

  effect: Name(s) of the tested effect(s)

   ANOVA: ANOVA table

test.result: F-value, theoretical p-value, permutation-based and/or
          asymptotic p-value

   terms: Names of all model terms


     If a collection of gene sets is provided in 'test.genes', a matrix
     is returned whose columns show the number of genes, the value of
     the F-statistic, the theoretical p-value, and the permutation-based
     and/or asymptotic p-value for each of the gene sets.

_M_e_t_h_o_d_s:

     _x_x = "_m_a_t_r_i_x", _f_o_r_m_u_l_a._f_u_l_l = "_f_o_r_m_u_l_a", _f_o_r_m_u_l_a._r_e_d = "_f_o_r_m_u_l_a", _m_o_d_e_l._d_a_t = "_A_N_Y", _g_r_o_u_p = "_m_i_s_s_i_n_g",  _c_o_v_a_r_s = "_m_i_s_s_i_n_g", _t_e_s_t._t_e_r_m_s = "_m_i_s_s_i_n_g" 
          In this method, besides the expression matrix 'xx', model
          formulas for the full and reduced model and a data frame
          'model.dat' specifying corresponding model terms have to be
          given. Terms that are included in the full but not in the
          reduced model are those whose association with differential
          expression will be tested. The arguments 'group', 'covars'
          and 'test.terms' are '"missing"' since they are not needed
          for this method.

     _x_x = "_m_a_t_r_i_x", _f_o_r_m_u_l_a._f_u_l_l = "_f_o_r_m_u_l_a", _f_o_r_m_u_l_a._r_e_d = "_m_i_s_s_i_n_g", _m_o_d_e_l._d_a_t = "_A_N_Y", _g_r_o_u_p = "_m_i_s_s_i_n_g",  _c_o_v_a_r_s = "_m_i_s_s_i_n_g", _t_e_s_t._t_e_r_m_s = "_c_h_a_r_a_c_t_e_r" 
          In this method, besides the expression matrix 'xx', a model
          formula for the full model and a data frame 'model.dat'
          specifying corresponding model terms are required. The
          character argument 'test.terms' names the terms of interest
          whose association with differential expression will be
          tested. The basic idea behind this method is that one can
          select single terms, possibly from the list of terms provided
          by previous 'GlobalAncova' output, and test them without
          having to specify each time a model formula for the reduced
          model. The arguments 'formula.red', 'group' and 'covars' are
          '"missing"' since they are not needed for this method.

     _x_x = "_m_a_t_r_i_x", _f_o_r_m_u_l_a._f_u_l_l = "_m_i_s_s_i_n_g", _f_o_r_m_u_l_a._r_e_d = "_m_i_s_s_i_n_g", _m_o_d_e_l._d_a_t = "_m_i_s_s_i_n_g",  _g_r_o_u_p = "_A_N_Y", _c_o_v_a_r_s = "_A_N_Y", _t_e_s_t._t_e_r_m_s = "_m_i_s_s_i_n_g" 
          Besides the expression matrix 'xx', a clinical variable
          'group' is required. Covariate adjustment is possible via the
          argument 'covars', but more complex models have to be
          specified with the methods described above. This method
          emulates the function call in the first version of the
          package. The arguments 'formula.full', 'formula.red',
          'model.dat' and 'test.terms' are '"missing"' since they are
          not needed for this method.
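
     For the 'test.terms' method described above, the relation between
     the full model, the tested terms and the implied reduced model
     can be sketched in base R. The helper 'reduce_formula' is
     hypothetical and only illustrates the idea; how the package
     derives the reduced model internally is not documented here.

```r
# Build a reduced model formula by dropping the terms of interest
# from the full model formula.
reduce_formula <- function(formula.full, test.terms) {
  labs <- attr(terms(formula.full), "term.labels")
  unknown <- setdiff(test.terms, labs)
  if (length(unknown) > 0) {
    stop("terms not in the full model: ", paste(unknown, collapse = ", "))
  }
  keep <- setdiff(labs, test.terms)
  if (length(keep) == 0) {
    return(~ 1)  # nothing left: intercept-only reduced model
  }
  reformulate(keep)
}

formula.red <- reduce_formula(~ metastases + ERstatus, "metastases")
```

     Here 'formula.red' corresponds to '~ERstatus', the reduced model
     used in the formula-based call with the same full model.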

_N_o_t_e:

     This work was supported by the NGFN project 01 GR 0459, BMBF,
     Germany.

_A_u_t_h_o_r(_s):

     Reinhard Meister meister@tfh-berlin.de
     Ulrich Mansmann mansmann@ibe.med.uni-muenchen.de
     Manuela Hummel hummel@ibe.med.uni-muenchen.de
     with contributions from Sven Knueppel

_R_e_f_e_r_e_n_c_e_s:

     Mansmann, U. and Meister, R., 2005, Testing differential gene
     expression in functional groups, _Methods Inf Med_ 44 (3).

_S_e_e _A_l_s_o:

     'Plot.genes', 'Plot.subjects', 'GlobalAncova.closed', 'GAGO',
     'GlobalAncova.decomp'

_E_x_a_m_p_l_e_s:

     data(vantVeer)
     data(phenodata)
     data(pathways)

     # full and reduced model given as formulas
     GlobalAncova(xx = vantVeer, formula.full = ~metastases + ERstatus,
                  formula.red = ~ERstatus, model.dat = phenodata,
                  test.genes = pathways[1], method = "both", perm = 100)
     # full model plus the name of the term to be tested
     GlobalAncova(xx = vantVeer, formula.full = ~metastases + ERstatus,
                  test.terms = "metastases", model.dat = phenodata,
                  test.genes = pathways[1], method = "both", perm = 100)
     # group variable and covariate given directly
     GlobalAncova(xx = vantVeer, group = phenodata$metastases,
                  covars = phenodata$ERstatus, test.genes = pathways[1],
                  method = "both", perm = 100)

