farms                  package:xps                  R Documentation

_F_a_c_t_o_r _A_n_a_l_y_s_i_s _f_o_r _R_o_b_u_s_t _M_i_c_r_o_a_r_r_a_y _S_u_m_m_a_r_i_z_a_t_i_o_n _E_x_p_r_e_s_s_i_o_n _M_e_a_s_u_r_e

_D_e_s_c_r_i_p_t_i_o_n:

     This function converts a 'DataTreeSet' into an 'ExprTreeSet' using
     the Factor Analysis for Robust Microarray Summarization (FARMS)
     method.

_U_s_a_g_e:

     farms(xps.data,
           filename   = character(0),
           filedir    = getwd(),
           tmpdir     = "",
           normalize  = TRUE,
           weight     = 0.5,
           mu         = 0.0,
           scale      = 1.0,
           tol        = 0.00001,
           cyc        = 100,
           weighted   = TRUE,
           version    = "1.3.1",
           option     = "transcript",
           exonlevel  = "",
           xps.scheme = NULL,
           add.data   = TRUE,
           verbose    = TRUE)

_A_r_g_u_m_e_n_t_s:

xps.data: object of class 'DataTreeSet'.

filename: file name of ROOT data file.

 filedir: system directory where ROOT data file should be stored.

  tmpdir: optional temporary directory where temporary ROOT files
          should be stored.

normalize: logical. If 'TRUE' normalize data using quantile
          normalization.

  weight: hyperparameter, usually set to 0.5 for 'version="1.3.1"' and
          to 8.0 for 'version="1.3.0"'.

      mu: hyperparameter that allows correcting for potential bias.

   scale: scaling parameter, usually set to 1.0 for 'version="1.3.1"'
          and to 2.0 for 'version="1.3.0"'.

     tol: termination tolerance for EM algorithm.

     cyc: maximum number of cycles of EM algorithm.

weighted: logical, used only with 'version="1.3.1"'. Default is TRUE.

 version: version of original farms package. Currently,
          'version="1.3.1"' and 'version="1.3.0"' are implemented.
          Default is 'version="1.3.1"'.

  option: option determining the grouping of probes for summarization,
          one of 'transcript', 'exon', 'probeset'; exon arrays only.

exonlevel: exon annotation level determining which probes should be
          used for summarization; exon/genome arrays only.

xps.scheme: optional alternative 'SchemeTreeSet'.

add.data: logical. If 'TRUE' expression data will be included as slot
          'data'.

 verbose: logical, if 'TRUE' print status information.

_D_e_t_a_i_l_s:

     This function computes the FARMS (Factor Analysis for Robust
     Microarray Summarization) expression measure described in
     Hochreiter et al. (2006) for both expression arrays and exon
     arrays.

     Parameter 'version' currently allows the user to choose between
     the original FARMS as implemented in package farms_1.3.0 and
     enhanced FARMS as implemented in package farms_1.3.1. By default,
     'version="1.3.1"' is used.

     Parameter 'weight' is a hyperparameter which determines the
     influence of the prior. For 'version="1.3.1"' the value lies in
     the range [0,1].

     Parameter 'mu' is a hyperparameter which allows quantifying
     different aspects of potential prior knowledge. Values near zero
     assume that most genes do not contain a signal and introduce a
     bias toward loading matrix elements near zero.

     Parameter 'weighted' is a logical indicating whether a weighted
     mean or a least-squares fit is used to summarize the loading
     matrix. It applies only to 'version="1.3.1"'.
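
     As a minimal sketch of how the version-specific settings described
     above might be kept together, the hyperparameter values documented
     under Arguments can be stored in a list and passed on with
     'do.call'. The names 'farms.args' and 'data.farms130' and the file
     name "tmp_Test3FARMS130" are illustrative assumptions; the call
     runs only if xps is installed and a 'DataTreeSet' named
     'data.test3' already exists.

```r
## Hyperparameters documented for the two supported FARMS versions.
## 'farms.args' is an illustrative name, not part of xps.
farms.args <- list(
  "1.3.1" = list(version = "1.3.1", weight = 0.5, scale = 1.0, weighted = TRUE),
  "1.3.0" = list(version = "1.3.0", weight = 8.0, scale = 2.0)
)

## Run original FARMS (1.3.0) only if xps is available and the
## DataTreeSet 'data.test3' has been created beforehand.
if (requireNamespace("xps", quietly = TRUE) && exists("data.test3")) {
  data.farms130 <- do.call(
    xps::farms,
    c(list(data.test3, "tmp_Test3FARMS130", verbose = FALSE),
      farms.args[["1.3.0"]])
  )
}
```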

     For exon arrays it is necessary to supply the requested 'option'
     and 'exonlevel'.

     The following 'option' values are valid for exon arrays:

       'transcript':  expression levels are computed for transcript clusters, i.e. probe sets containing the same transcript_cluster_id.
       'exon':        expression levels are computed for exon clusters, i.e. probe sets containing the same exon_id, where each exon cluster consists of one or more 'probeset's.
       'probeset':    expression levels are computed for individual probe sets, i.e. for each probeset_id.

     The following 'exonlevel' annotations are valid for exon arrays:

         'core':          probesets supported by RefSeq and full-length GenBank transcripts.
         'metacore':      core meta-probesets.
         'extended':      probesets with other cDNA support.
         'metaextended':  extended meta-probesets.
         'full':          probesets supported by gene predictions only.
         'metafull':      full meta-probesets.
         'affx':          standard AFFX controls.
         'all':           combination of above (including affx).

     The following 'exonlevel' annotations are valid for whole genome
     arrays:

         'core':      probesets with category unique, similar and mixed.
         'metacore':  probesets with category unique only.
         'affx':      standard AFFX controls.
         'all':       combination of above (including affx).

     Exon levels can also be combined with '+'; the following
     combinations are the most useful:

       'exonlevel="metacore+affx"':       core meta-probesets plus AFFX controls
       'exonlevel="core+extended"':       probesets with cDNA support
       'exonlevel="core+extended+full"':  supported plus predicted probesets
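
     Because combined levels are plain strings, a misspelled level is
     easy to miss. The following sketch checks an 'exonlevel' string
     against the exon-array levels listed above before it is passed to
     'farms'. Both 'valid.exon.levels' and 'check.exonlevel' are
     hypothetical helpers written for illustration; they are not part
     of xps.

```r
## Exon-array levels as listed in this help page.
valid.exon.levels <- c("core", "metacore", "extended", "metaextended",
                       "full", "metafull", "affx", "all")

## Hypothetical helper: split a combined level on "+" and stop if any
## component is not one of the documented levels.
check.exonlevel <- function(exonlevel, valid = valid.exon.levels) {
  parts   <- strsplit(exonlevel, "+", fixed = TRUE)[[1]]
  unknown <- setdiff(parts, valid)
  if (length(unknown) > 0) {
    stop("unknown exon level(s): ", paste(unknown, collapse = ", "))
  }
  invisible(exonlevel)
}

check.exonlevel("core+extended")        # passes silently
check.exonlevel("core+extended+full")   # passes silently
```

     A call such as 'check.exonlevel("core+extnded")' stops with an
     error naming the unknown level.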

     Exon level annotations are described in the Affymetrix whitepaper
     exon_probeset_trans_clust_whitepaper.pdf: 
       Exon Probeset Annotations and Transcript Cluster Groupings.

     To use an alternative 'SchemeTreeSet', pass it as argument
     'xps.scheme'.

_V_a_l_u_e:

     An 'ExprTreeSet'

_N_o_t_e:

     The expression measure obtained with FARMS is given in linear
     scale, analogous to the expression measures computed with
     'mas5' and 'rma'.
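
     Since the values are on a linear scale, downstream analyses that
     expect log-transformed data need an explicit conversion. The
     sketch below uses a small made-up data.frame as a stand-in for the
     expression table and assumes that all of its columns are numeric;
     the names 'expr.toy' and 'log2.toy' are illustrative.

```r
## Toy stand-in for an expression data.frame; the values are made up.
expr.toy <- data.frame(Sample1 = c(64, 128, 1024),
                       Sample2 = c(32, 256, 2048),
                       row.names = c("unit1", "unit2", "unit3"))

## Convert from linear scale to log2 scale (all columns must be numeric).
log2.toy <- log2(expr.toy)
log2.toy
```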

     For the analysis of many exon arrays it may be better to define a
     'tmpdir', since this will store only the results in the main file
     and not, e.g., background and normalized intensities, and thus
     will reduce the file size of the main file. For quantile
     normalization, memory should not be an issue; however, DFW depends
     on RAM unless you are using a temporary file.
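
     A minimal sketch of supplying a temporary directory is shown
     below. It uses the per-session directory returned by base R's
     'tempdir()'; the name 'tmp.dir' is an illustrative assumption, and
     the call runs only if xps is installed and a 'DataTreeSet' named
     'data.test3' already exists.

```r
## Per-session temporary directory provided by base R.
tmp.dir <- tempdir()

## Keep temporary ROOT files out of the main result file; guarded so the
## block is harmless when xps or 'data.test3' is not available.
if (requireNamespace("xps", quietly = TRUE) && exists("data.test3")) {
  data.farms <- xps::farms(data.test3, "tmp_Test3FARMS",
                           filedir = getwd(),
                           tmpdir  = tmp.dir,
                           verbose = FALSE)
}
```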

_A_u_t_h_o_r(_s):

     Christian Stratowa

_R_e_f_e_r_e_n_c_e_s:

     Hochreiter, S., Clevert, D.-A., and Obermayer, K. (2006), A new
     summarization method for Affymetrix probe level data.
     Bioinformatics 22(8):943-949.

_S_e_e _A_l_s_o:

     'express'

_E_x_a_m_p_l_e_s:

     ## first, load ROOT scheme file and ROOT data file
     ## (system.file() replaces the defunct .path.package())
     scheme.test3 <- root.scheme(system.file("schemes", "SchemeTest3.root", package="xps"))
     data.test3 <- root.data(scheme.test3, system.file("rootdata", "DataTest3_cel.root", package="xps"))

     data.farms <- farms(data.test3,"tmp_Test3FARMS",verbose=FALSE)

     ## get data.frame
     expr.farms <- validData(data.farms)
     head(expr.farms)

