copaPerm                package:copa                R Documentation

_M_e_a_s_u_r_e _S_i_g_n_i_f_i_c_a_n_c_e _o_f _C_O_P_A _b_y _P_e_r_m_u_t_a_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     This function can be used to determine the significance of the
     results that one gets from running 'copa' on a particular dataset,
     based on permuting the class assignments.

_U_s_a_g_e:

     copaPerm(object, copa, outlier.num, gene.pairs, B = 100, pval = FALSE, verbose = TRUE)

_A_r_g_u_m_e_n_t_s:

  object: An 'exprSet', 'ExpressionSet', or a matrix or 'data.frame'.

    copa: An object of class 'copa', produced by running 'copa' on a
          set of microarray data.

outlier.num: The number of outliers to test for. See Details for more
          information.

gene.pairs: The number of gene pairs to test for. See Details for more
          information.

       B: The number of permutations to perform. Defaults to 100. This
          may be too many for interactive use.

    pval: Boolean. Output an estimated p-value and false discovery
          rate? Defaults to 'FALSE'. This result will only be
          reasonable for large numbers of permutations (500 - 1000).
          See details.

 verbose: Boolean. Print the permutation number after every 100
          permutations (100, 200, etc.)? Defaults to 'TRUE'.

_D_e_t_a_i_l_s:

     Running 'copa' on a set of microarray data returns an object of
     class 'copa', a list containing (among other things) an ordered
     vector that lists the number of mutually exclusive outlier samples
     for various gene pairs. This vector is ordered from smallest to
     largest, following the assumption that the gene pairs with the
     most mutually exclusive outliers are the most likely to be
     involved in some sort of recurrent fusion.

     One can see how many pairs of genes resulted in a given number of
     outliers by calling 'tableCopa'. One may then want to determine
     how significant a certain number of pairs is (e.g., how likely is
     it to get that many pairs if there is no recurrent fusion
     occurring). The most straightforward way to estimate the
     significance of a given result is to repeatedly permute the
     class labels and see how many times one gets a result as large or
     larger than what was observed.
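
     The permutation idea described above can be sketched in base R.
     This is a minimal illustration, not the code used by 'copaPerm':
     the helper 'perm_pvalue', the toy statistic 'n_high', the
     simulated data, and the add-one correction in the p-value are all
     assumptions made for the example.

```r
# Hypothetical helper: permutation p-value for any statistic of (data, labels).
perm_pvalue <- function(x, cl, stat, B = 100) {
  observed <- stat(x, cl)
  # Recompute the statistic B times with the class labels shuffled.
  permuted <- vapply(seq_len(B), function(i) stat(x, sample(cl)), numeric(1))
  # Assumed estimator: add one to numerator and denominator so the
  # estimate is never exactly zero.
  p <- (sum(permuted >= observed) + 1) / (B + 1)
  list(observed = observed, permuted = permuted, p.value = p)
}

set.seed(1)
x  <- c(rnorm(20), rnorm(20, mean = 3))  # simulated expression values
cl <- rep(c(1, 2), each = 20)            # class assignments

# Toy statistic (not COPA): number of class-2 samples above a fixed cutoff.
n_high <- function(x, cl) sum(x[cl == 2] > 2)

res <- perm_pvalue(x, cl, n_high, B = 200)
res$p.value
```

     A small 'p.value' here means that few label shuffles produced a
     statistic as large as or larger than the observed one.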

     To get a reasonable estimate of significance and a false discovery
     rate, one would need to permute 500 - 1000 times. However, this
     can take a very long time (best left for an overnight run). To get
     a quick idea of significance, one could simply permute about 10
     times (with 'pval = FALSE') to see how likely it is to get a
     certain number of outliers.
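
     The trade-off between run time and precision can be made concrete
     with a little arithmetic. The sketch below assumes a simple
     empirical estimator (the fraction of permutations at least as
     extreme as the observed result), which may differ from what
     'copaPerm' computes; 'p_step' is a hypothetical helper.

```r
# With B permutations, a simple empirical p-value can only change in
# steps of 1/B, so small p-values cannot be resolved with small B.
p_step <- function(B) 1 / B

B <- c(10, 100, 500, 1000)
resolution <- data.frame(B = B, step = p_step(B))
print(resolution)
```

     Under that assumption, 10 permutations cannot distinguish
     p-values finer than 0.1, while 1000 permutations resolve steps of
     0.001.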

_V_a_l_u_e:

     out: A vector listing the number of gene pairs with at least as
          many outliers as 'outlier.num'.

 p.value: A permuted p-value, only output if 'pval = TRUE'. Note that
          the size of the p-value is determined by both the number of
          outliers >= 'outlier.num' and the number of permutations, so
          too few permutations may result in a p-value that doesn't
          look very significant even if it is.

     fdr: The expected number of gene pairs with at least as many
          outliers as 'outlier.num'. This can be converted to a %FDR by
          dividing by the observed value.
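
     The conversion to a %FDR mentioned above is simple arithmetic. The
     numbers below are made up for illustration, and taking the
     expected count as the mean over permutations is an assumption, not
     a statement about how 'copaPerm' computes 'fdr'.

```r
# Hypothetical values, for illustration only.
observed <- 40                                # pairs observed with >= outlier.num outliers
permuted <- c(3, 5, 2, 4, 6, 3, 4, 5, 2, 6)   # same count in each of 10 permutations

expected <- mean(permuted)                    # assumed estimate of the expected count
fdr.pct  <- 100 * expected / observed         # expected / observed, as a percentage
fdr.pct
```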

_A_u_t_h_o_r(_s):

     James W. MacDonald

_R_e_f_e_r_e_n_c_e_s:

     Tomlins, SA, et al. Recurrent fusion of TMPRSS2 and ETS
     transcription factor genes in prostate cancer. Science. 2005 Oct
     28;310(5748):644-8.

