pcot2                 package:pcot2                 R Documentation

_P_r_i_n_c_i_p_a_l _C_o_o_r_d_i_n_a_t_e_s _a_n_d _H_o_t_e_l_l_i_n_g'_s _T-_S_q_u_a_r_e

_D_e_s_c_r_i_p_t_i_o_n:

     The 'pcot2' function implements the PCOT2 testing method, which is
     a two-stage permutation-based approach for testing changes in
     activity in pre-specified gene sets.
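
     As a rough illustration of the two stages for a single gene set,
     the sketch below reduces the samples to two principal coordinates
     and then permutes the class labels around a Hotelling's T-squared
     statistic. It is not the package's implementation: it uses base R
     ('dist', 'cmdscale') rather than 'amap', and the toy data and the
     helper 'hotelling_t2' are hypothetical names introduced here.

```r
set.seed(1)

# Hypothetical toy gene set: 10 genes (rows) x 20 samples (columns)
x <- matrix(rnorm(10 * 20), nrow = 10)
labels <- rep(c("Trt", "Ctr"), each = 10)
x[, labels == "Trt"] <- x[, labels == "Trt"] + 1   # shift one class

# Stage 1: principal coordinates of the samples (2 dimensions, Euclidean)
pco <- cmdscale(dist(t(x), method = "euclidean"), k = 2)

# Stage 2: Hotelling's T-squared with a pooled covariance estimate
hotelling_t2 <- function(scores, labels) {
  lev <- unique(labels)
  a <- scores[labels == lev[1], , drop = FALSE]
  b <- scores[labels == lev[2], , drop = FALSE]
  n1 <- nrow(a)
  n2 <- nrow(b)
  d <- colMeans(a) - colMeans(b)
  s_pooled <- ((n1 - 1) * cov(a) + (n2 - 1) * cov(b)) / (n1 + n2 - 2)
  as.numeric((n1 * n2 / (n1 + n2)) * t(d) %*% solve(s_pooled) %*% d)
}

t2_obs <- hotelling_t2(pco, labels)

# Permute the sample labels to build a reference distribution
iter <- 200
t2_perm <- replicate(iter, hotelling_t2(pco, sample(labels)))
p_perm <- mean(t2_perm >= t2_obs)   # raw permutation p-value
```

     Here the statistic is symmetric in the two classes, so the order
     in which permuted labels appear does not affect the result.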

_U_s_a_g_e:

     pcot2(emat, class = NULL, imat, permu = "ByColumn", iter = 1000,
           alpha = 0.05, adjP.method = "BY", var.equal = TRUE, ncomp = 2,
           dist.method = "euclidean")

_A_r_g_u_m_e_n_t_s:

    emat: A gene expression matrix with no missing values. Each row
          represents a gene and each column represents a sample.

   class: Class labels representing two distinct experimental
          conditions (e.g., normal and disease). 

    imat: A gene category indicator matrix recording which genes
          belong to which pre-defined gene sets (e.g., gene pathways).
          Rows correspond to the identifiers of genes present in the
          expression data and columns correspond to the pre-defined
          group names. A value of 1 indicates that the gene is a member
          of the group and 0 indicates that it is not.

   permu: Specifies whether genes or samples are permuted.  By default,
          permutations are performed by sample ("ByColumn").

    iter: The number of permutations performed in the analysis.

   alpha: The significance threshold for the permutation p-values.

adjP.method: The method used to adjust the p-values for multiple
          testing. One of "bonferroni", "holm", "hochberg", "hommel",
          "BH" (Benjamini and Hochberg), or "BY" (Benjamini and
          Yekutieli).

var.equal: Specifies the use of either a pooled estimate of correlation
          for the two classes or an unpooled estimate for calculating
          each T-squared statistic. By default, the pooled estimate is
          used.

   ncomp: The dimensionality to which the data matrix is reduced via
          principal coordinates. The default dimensionality is set as
          'ncomp=2'.

dist.method: The method used to calculate distances in the PCO
          procedure. The available distance methods are "euclidean",
          "maximum", "manhattan", "canberra", "binary", "pearson",
          "correlation" or "spearman". For additional details, see the
          help documentation for the 'Dist' function in the 'amap'
          package.
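
     The layout expected for 'imat' can be sketched as follows. This is
     a minimal illustration that assumes gene sets are available as a
     named list of gene identifiers; 'gene_sets' and 'genes' are
     hypothetical names, not objects supplied by the package.

```r
# Hypothetical gene sets, keyed by set name
gene_sets <- list(PathA = c("Gene1", "Gene2", "Gene3"),
                  PathB = c("Gene3", "Gene4"))

# Gene identifiers, in the same order as the rows of the expression matrix
genes <- paste0("Gene", 1:5)

# One column per gene set: 1 where the gene belongs to the set, 0 otherwise
imat <- sapply(gene_sets, function(g) as.integer(genes %in% g))
rownames(imat) <- genes
```

     A gene may belong to several sets (here 'Gene3'), and a gene that
     belongs to none simply has a row of zeros.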

_D_e_t_a_i_l_s:

     The raw permutation p-values are adjusted for multiple testing by
     a call to 'p.adjust'.
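
     The adjustment step can be tried in isolation with base R. The
     p-values below are made up for illustration, and comparing the
     adjusted values against 'alpha' is an assumption about how
     significance is then decided.

```r
# Hypothetical raw permutation p-values, one per gene set
raw_p <- c(0.001, 0.004, 0.020, 0.300, 0.700)

# Benjamini-Yekutieli adjustment, matching the default adjP.method = "BY"
adj_p <- p.adjust(raw_p, method = "BY")

# Flag gene sets whose adjusted p-value falls below the threshold
alpha <- 0.05
significant <- adj_p < alpha
```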

_V_a_l_u_e:

 res.all: A data frame containing results for all pathways.

 res.sig: A data frame containing results for the pathways that are
          significant at the given alpha level.

comparison: The contrast used in the analysis.

     ...

_A_u_t_h_o_r(_s):

     Sarah Song and Mik Black

_S_e_e _A_l_s_o:

     'corplot', 'corplot2', 'aveProbe'

_E_x_a_m_p_l_e_s:

     ns <- 40  ## 40 samples
     cla <- rep(c("Trt","Ctr"),each=ns/2)
     ngene <- 10  ## 10 genes per group 
     npath <- 10  ## 10 groups

     nreal <- 3  ## number of altered groups
     nnull <- npath-nreal   ## number of null groups
     pname <- c(paste("RealP",1:nreal, sep=""), paste("NullP",1:nnull, sep=""))

     ## Three main inputs in the function ##
     ## [1] Simulate (gene) expression matrix (emat) ##
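     ## Simulate an nr x nc matrix: the first nc/2 columns have mean 'mn',
     ## the remaining nc/2 have mean 0; all gene pairs share covariance 'covm'
     rmv <- function(mn, covm, nr, nc){
        sigma <- diag(nr)            # unit variances on the diagonal
        sigma[sigma==0] <- covm      # common off-diagonal covariance
        x1 <- rmvnorm(nc/2, mean=mn, sigma=sigma)
        x0 <- rmvnorm(nc/2, mean=rep(0,nr), sigma=sigma)
        mat <- t(rbind(x1,x0))       # genes in rows, samples in columns
        return(mat)
     }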

     covm <- 0.9  ## covariance between genes within a group
     ct <- c(6,8,10)  ## "Trt" means for the three altered groups

     library(mvtnorm)
     emat <- c()
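     ## altered pathways: "Trt" samples have mean ct[i], "Ctr" samples mean 0
     for (i in 1:nreal) emat <- rbind(emat, rmv(mn=rep(ct[i],ngene), covm=covm, nr=ngene, nc=ns))
     ## null pathways: both classes have mean 0
     for (i in 1:nnull) emat <- rbind(emat, rmv(mn=rep(0,ngene), covm=covm, nr=ngene, nc=ns))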
     dimnames(emat) <- list(paste("Gene", 1:(ngene*npath),sep=""), cla)

     ## [2] class label ##
     cla

     ## [3] indicator matrix (row: genes and col: pathways)
     imat <- kronecker(diag(npath),rep(1,ngene))
     dimnames(imat) <- list(paste("Gene",1:(ngene*npath), sep=""), pname)

     results.pcot2 <- pcot2(emat, cla, imat)
     results.pcot2$res.sig
     results.pcot2$res.all

