roast                 package:limma                 R Documentation

_r_o_a_s_t

_D_e_s_c_r_i_p_t_i_o_n:

     Rotation gene set testing for linear models.

_U_s_a_g_e:

     roast(iset=NULL, y, design, contrast=ncol(design), gene.weights=NULL,
           array.weights=NULL, block=NULL, correlation, var.prior=NULL,
           df.prior=NULL, nrot=1000)

_A_r_g_u_m_e_n_t_s:

    iset: vector specifying the rows of 'y' in the test set.  This can
          be a vector of indices, or a logical vector with one element
          per row of 'y', or any vector such that 'y[iset,]' contains
          the values for the gene set to be tested.

       y: numeric matrix giving log-expression values. If either
          'var.prior' or 'df.prior' is 'NULL', then 'y' should contain
          values for
          all genes on the arrays. If both prior parameters are given,
          then only 'y' values for the test set are required.

  design: design matrix

contrast: contrast for which the test is required. Can be an integer
          specifying a column of 'design', or else a contrast vector of
          length equal to the number of columns of 'design'.

gene.weights: optional numeric vector of weights for genes in the set.

array.weights: optional numeric vector of array weights.

   block: optional vector of blocks.

correlation: correlation between blocks.

var.prior: prior value for residual variances. If not provided, this is
          estimated from all the data using 'squeezeVar'.

df.prior: prior degrees of freedom for residual variances. If not
          provided, this is estimated using 'squeezeVar'.

    nrot: number of rotations used to estimate the p-values.
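
     The equivalent forms accepted for 'iset' and 'contrast' can be
     illustrated with base R alone. This is a minimal sketch on
     simulated data; the gene names and object names are hypothetical
     and it does not call 'roast' itself.

```r
# Simulated data: 5 genes (rows) by 4 arrays (columns); names are hypothetical
set.seed(1)
y <- matrix(rnorm(20), 5, 4, dimnames = list(paste0("gene", 1:5), NULL))
design <- cbind(Intercept = 1, Group = c(0, 0, 1, 1))

# Three ways of specifying the same gene set: each selects rows 2 and 4 of y
by.index   <- c(2, 4)
by.logical <- c(FALSE, TRUE, FALSE, TRUE, FALSE)
by.name    <- c("gene2", "gene4")
same.rows <- identical(y[by.index, ], y[by.logical, ]) &&
             identical(y[by.index, ], y[by.name, ])

# Two ways of specifying the same contrast: a column number of the design,
# or a vector with one entry per design column
contrast.index  <- 2
contrast.vector <- c(0, 1)
coef <- qr.coef(qr(design), t(y))            # coefficients, 2 x 5
est.by.index  <- coef[contrast.index, ]
est.by.vector <- drop(contrast.vector %*% coef)
```

     Both contrast forms pick out the 'Group' coefficient for every
     gene, so 'est.by.index' and 'est.by.vector' agree.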

_D_e_t_a_i_l_s:

     This function tests whether any of the genes in the set are
     differentially expressed. It uses rotation, which is a smoothed
     version of permutation suitable for linear models (Langsrud,
     2005). It can be used for any linear model with replication. Test
     statistics that take both positive and negative values are treated
     as t-like; otherwise they are taken to be F-like.

     This is a self-contained test in the sense that genes outside the
     test set do not play a role (Goeman, JJ, and Buhlmann P, 2007). A
     competitive gene set test is performed by 'geneSetTest'.

     p-values are given for four possible alternative hypotheses.
     'alternative=="up"' means the genes in the set tend to be
     up-regulated, with positive t-statistics. 'alternative=="down"'
     means the genes in the set tend to be down-regulated, with
     negative t-statistics. 'alternative=="either"' means the set is
     either up or down-regulated as a whole. 'alternative=="mixed"'
     tests whether the genes in the set tend to be differentially
     expressed, without regard for direction. In this case, the test
     will be significant if the set contains mostly large test
     statistics, even if some are positive and some are negative.

     The first three alternatives are appropriate if you have a prior
     expectation that all the genes in the set will react in the same
     direction. The '"mixed"' alternative is appropriate if you know
     only that the genes are involved in the relevant pathways, without
     knowing the direction of effect for each gene. The '"mixed"'
     alternative is the only one possible with F-like statistics.
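
     The difference between the directional and '"mixed"' alternatives
     can be seen on hypothetical gene-wise z-statistics. The sketch
     below uses the plain mean as a directional summary and the root
     mean square as a direction-free summary; these are illustrative
     choices and are not claimed to be the exact statistics computed
     inside 'roast'.

```r
# Hypothetical z-statistics for two gene sets of four genes each
z.same.direction <- c(3, 2.5, 2, 3.5)     # all genes up-regulated
z.mixed.signs    <- c(3, -3, 2.5, -2.5)   # large effects, opposing signs

# Directional summary: positive and negative values cancel
directional <- function(z) mean(z)
# Direction-free summary: only the magnitude matters
rms <- function(z) sqrt(mean(z^2))

summary.table <- rbind(
  same.direction = c(mean = directional(z.same.direction),
                     rms  = rms(z.same.direction)),
  mixed.signs    = c(mean = directional(z.mixed.signs),
                     rms  = rms(z.mixed.signs))
)
print(summary.table)
```

     The second set has a mean of zero but a large root mean square, so
     only a direction-free alternative would detect it.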

     Note that 'roast' estimates p-values by simulation, specifically
     by random rotations of the orthogonalized residuals. This means
     that the p-values will vary slightly from run to run. To get more
     precise p-values, increase the number of rotations 'nrot'. The
     strategy of random rotations is due to Langsrud (2005).
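
     The idea of rotating orthogonalized residuals can be sketched in
     base R. The code below is a simplified illustration, assuming
     ordinary (unmoderated) t-statistics and a root mean square set
     statistic; it is not the 'roast' implementation, and all object
     names are hypothetical.

```r
set.seed(1)
n.genes <- 5
n.arrays <- 6
design <- cbind(Intercept = 1, Group = rep(c(0, 1), each = 3))
y <- matrix(rnorm(n.genes * n.arrays), n.genes, n.arrays)
y[, 4:6] <- y[, 4:6] + 3                   # add a group effect to every gene

p <- ncol(design)
d <- n.arrays - p                          # residual degrees of freedom

# Orthogonalize: row p of the effects carries the Group signal (adjusted for
# the intercept) and rows p+1..n carry the residuals
effects <- qr.qty(qr(design), t(y))        # n.arrays x n.genes
U <- t(effects[p:n.arrays, , drop = FALSE])  # n.genes x (d + 1)
ss.total <- rowSums(U^2)

# Root mean square of gene-wise t-statistics, given the signal component
set.stat <- function(signal, ss.total, d) {
  s2 <- (ss.total - signal^2) / d          # residual variance per gene
  sqrt(mean(signal^2 / s2))
}
obs <- set.stat(U[, 1], ss.total, d)

# Each rotation is a random unit vector; the same rotation is applied to
# every gene, which preserves the correlation between genes
nrot <- 999
rot <- numeric(nrot)
for (i in seq_len(nrot)) {
  r <- rnorm(d + 1)
  r <- r / sqrt(sum(r^2))
  rot[i] <- set.stat(drop(U %*% r), ss.total, d)
}
p.mixed <- (sum(rot >= obs) + 1) / (nrot + 1)
```

     Because 'rot' is random, 'p.mixed' changes slightly with the seed.
     Its Monte Carlo standard error shrinks roughly in proportion to
     '1/sqrt(nrot)'.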

_V_a_l_u_e:

     data.frame with columns 'Z', 'Active' and 'P.Value'. The 'Z'
     column gives average (root mean square) z-statistics for the genes
     in the set. The 'Active' column gives the proportion of genes in
     the set contributing meaningfully to significance, defined as
     those with squared z-values greater than 2. The 'P.Value' column
     gives estimated p-values. The rows correspond to the alternative
     hypotheses mixed, up, down and either.
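
     As a worked illustration of the 'Z' and 'Active' definitions, the
     following sketch applies them to a hypothetical vector of
     gene-wise z-statistics. The values are made up, and the root mean
     square form is assumed here to correspond to the mixed row.

```r
# Hypothetical z-statistics for a set of five genes
z <- c(2.5, -0.3, 1.8, 0.9, -2.1)

# Root mean square z-statistic for the set
Z <- sqrt(mean(z^2))

# Proportion of genes whose squared z-value exceeds 2
Active <- mean(z^2 > 2)

print(c(Z = Z, Active = Active))
```

     Here three of the five genes have squared z-values above 2, so
     'Active' is 0.6.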

_A_u_t_h_o_r(_s):

     Gordon Smyth and Di Wu

_R_e_f_e_r_e_n_c_e_s:

     Goeman, J. J., and Buhlmann, P. (2007). Analyzing gene expression
     data in terms of gene sets: methodological issues.
     _Bioinformatics_ 23, 980-987.

     Langsrud, O. (2005). Rotation tests. _Statistics and Computing_
     15, 53-60.

_S_e_e _A_l_s_o:

     'geneSetTest'

_E_x_a_m_p_l_e_s:

     y <- matrix(rnorm(100*4),100,4)
     design <- cbind(Intercept=1,Group=c(0,0,1,1))
     iset <- 1:5
     y[iset,3:4] <- y[iset,3:4]+3
     roast(iset,y,design,contrast=2)

     # Alternative approach useful if multiple gene sets are tested:
     fit <- lmFit(y,design)
     sv <- squeezeVar(fit$sigma^2,df=fit$df.residual)
     iset1 <- 1:5
     iset2 <- 6:10
     roast(y=y[iset1,],design=design,contrast=2,var.prior=sv$var.prior,df.prior=sv$df.prior)
     roast(y=y[iset2,],design=design,contrast=2,var.prior=sv$var.prior,df.prior=sv$df.prior)

