scoring                package:macat                R Documentation

_C_o_m_p_u_t_e (_r_e_g_u_l_a_r_i_z_e_d) _t-_s_c_o_r_e_s _f_o_r _g_e_n_e _e_x_p_r_e_s_s_i_o_n _d_a_t_a

_D_e_s_c_r_i_p_t_i_o_n:

     This function computes, for all genes in an expression matrix, the
     (regularized) t-scores (statistics) for the given class labels
     and, if requested, for a number of permutations of these labels.
     Each gene is also assigned a p-value, either empirically from the
     permutation scores or from a t-distribution.

_U_s_a_g_e:

     scoring(data, labels, method = "SAM", pcompute = "tdist", 
             nperms = 1000, memory.limit = TRUE, verbose = TRUE)

_A_r_g_u_m_e_n_t_s:

    data: Expression matrix with rows = genes and columns = samples

  labels: Vector or factor of class labels; scoring works only with
          exactly two classes

  method: Either "SAM" to compute regularized t-scores, or "t.test" to
          compute Student's t-statistic

pcompute: Method to compute the p-value for each gene, either
          "empirical" to do permutations and compute p-values from
          them, or "tdist" to compute p-values from the corresponding
          t-distribution

  nperms: Number of permutations of the labels to be evaluated; used
          only if 'pcompute="empirical"'

memory.limit: Logical; if the machine has enough memory (more than 2 GB
          RAM), setting this to FALSE will speed up the computations

 verbose: Logical; whether progress should be reported to STDOUT
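
     The two values of 'method' can be illustrated with a minimal
     sketch in base R. It assumes the usual SAM-style form of the
     regularized score, in which a constant 's0' is added to the
     standard error in the denominator. The toy data, the variable
     names and in particular the choice 's0 = median(se)' are
     assumptions made for illustration; how 'scoring' itself chooses
     the constant is described in the 'macat' vignette, not here.

```r
# Toy data: 5 genes x 8 samples, two classes coded 0/1 (made-up values).
set.seed(3)
expr <- matrix(rnorm(5 * 8), nrow = 5)
labels <- rep(c(0, 1), each = 4)

g1 <- expr[, labels == 1, drop = FALSE]
g0 <- expr[, labels == 0, drop = FALSE]
n1 <- ncol(g1)
n0 <- ncol(g0)

# Difference of class means per gene.
diff <- rowMeans(g1) - rowMeans(g0)

# Pooled standard error of that difference (equal-variance t-test).
ss <- rowSums((g1 - rowMeans(g1))^2) + rowSums((g0 - rowMeans(g0))^2)
se <- sqrt(ss / (n1 + n0 - 2) * (1 / n1 + 1 / n0))

# Student's t-statistic.
t_score <- diff / se

# Regularized score: s0 damps genes with very small variance.
# ASSUMPTION: median(se) is only a placeholder for the fudge constant.
s0 <- median(se)
sam_score <- diff / (se + s0)
```

     Because 's0' is positive, the regularized score is never larger
     in absolute value than the plain t-statistic for the same gene.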

_D_e_t_a_i_l_s:

     If 'pcompute="empirical"', the statistic is first computed for the
     given class labels and then for 'nperms' permutations of the
     labels. The p-value for each gene is the proportion of permutation
     statistics that are greater than or equal to the statistic
     obtained with the real labels. For each gene, the 2.5%- and the
     97.5%-quantile of the permutation statistics are also returned as
     lower and upper 'significance threshold'.
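
     The empirical procedure described above can be sketched in base R
     as follows. This is an illustration under stated assumptions, not
     the code used by 'scoring': the toy data and the helper 't_stat'
     are hypothetical, and Student's t-statistic stands in for
     whichever statistic 'method' selects.

```r
# Toy data: 5 genes x 8 samples, two classes coded 0/1 (made-up values).
set.seed(1)
expr <- matrix(rnorm(5 * 8), nrow = 5)
labels <- rep(c(0, 1), each = 4)

# Hypothetical helper: equal-variance t-statistic for every gene (row).
t_stat <- function(x, cl) {
  apply(x, 1, function(g)
    t.test(g[cl == 1], g[cl == 0], var.equal = TRUE)$statistic)
}

# Statistic for the real labels.
observed <- t_stat(expr, labels)

# Statistics for shuffled labels: one column per permutation.
nperms <- 200
perm <- replicate(nperms, t_stat(expr, sample(labels)))

# Per gene: proportion of permutation statistics >= observed statistic.
pvalues <- rowMeans(perm >= observed)

# Per gene: 2.5% and 97.5% quantiles of the permutation statistics.
expected.lower <- apply(perm, 1, quantile, probs = 0.025)
expected.upper <- apply(perm, 1, quantile, probs = 0.975)
```

     Note that this p-value is one-sided: a strongly negative observed
     score yields a p-value close to 1, not close to 0.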

     If 'pcompute="tdist"', the statistic is computed only for the
     given class labels, and the p-value is computed from the
     t-distribution with (number of samples - 2) degrees of freedom.
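
     A minimal sketch of a p-value taken from the t-distribution with
     (number of samples - 2) degrees of freedom, using the base R
     function 'pt'. The toy data are made up, and whether 'scoring'
     reports a one-sided or a two-sided p-value is not stated above,
     so both variants are shown as assumptions.

```r
# Toy data: 5 genes x 8 samples, two classes coded 0/1 (made-up values).
set.seed(2)
expr <- matrix(rnorm(5 * 8), nrow = 5)
labels <- rep(c(0, 1), each = 4)

# Degrees of freedom: number of samples minus 2.
df <- ncol(expr) - 2

# Equal-variance t-statistic for every gene (row).
tstat <- apply(expr, 1, function(g)
  t.test(g[labels == 1], g[labels == 0], var.equal = TRUE)$statistic)

# ASSUMPTION: two-sided p-value.
p_two_sided <- 2 * pt(abs(tstat), df = df, lower.tail = FALSE)

# ASSUMPTION: one-sided (upper-tail) p-value.
p_upper <- pt(tstat, df = df, lower.tail = FALSE)
```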

_V_a_l_u_e:

     A list with four components:

observed: (Regularized) t-scores for all genes based on the given
          labels

 pvalues: P-values for all genes, either from permutations or
          t-distribution

expected.lower: 2.5%-quantile of the permutation test statistics,
          intended as a lower 'significance border' for the gene; or
          NULL if p-values were computed from the t-distribution

expected.upper: 97.5%-quantile of the permutation test statistics,
          intended as an upper 'significance border' for the gene; or
          NULL if p-values were computed from the t-distribution
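
     As a sketch of how the four components fit together, the snippet
     below flags genes whose observed score falls outside the two
     'significance borders'. The list 'res' is a made-up stand-in with
     the same component names as the return value; its numbers and the
     gene names 'g1' to 'g3' are purely illustrative.

```r
# ASSUMPTION: hand-written stand-in for a list returned by scoring().
res <- list(
  observed       = c(g1 = 3.1,  g2 = -0.4, g3 = -2.9),
  pvalues        = c(g1 = 0.01, g2 = 0.62, g3 = 0.99),
  expected.lower = c(g1 = -2.2, g2 = -2.0, g3 = -2.1),
  expected.upper = c(g1 = 2.3,  g2 = 2.1,  g3 = 2.0)
)

# Genes scoring above the upper or below the lower border.
above <- res$observed > res$expected.upper
below <- res$observed < res$expected.lower
flagged <- names(res$observed)[above | below]
```

     With p-values taken from the t-distribution both border
     components are NULL, so a check such as
     'is.null(res$expected.lower)' is needed before comparing.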

_N_o_t_e:

     In MACAT, this function is only called internally by 'evalScoring'.

_A_u_t_h_o_r(_s):

     MACAT development team

_R_e_f_e_r_e_n_c_e_s:

     For the regularized t-score, please see the 'macat' vignette.

_S_e_e _A_l_s_o:

     'evalScoring'

_E_x_a_m_p_l_e_s:

      ## Not run: 
       data(stjd)
       # compute scores for T- vs. B-lymphocyte ALL:
       isT <- as.numeric(stjd$labels == "T")
       TvsB <- scoring(stjd$expr, isT, method = "SAM",
                       pcompute = "empirical", nperms = 100)
       summary(TvsB$observed)
      
     ## End(Not run)

