normalizeChannels          package:cellHTS          R Documentation

_N_o_r_m_a_l_i_z_a_t_i_o_n _o_f _d_u_a_l-_c_h_a_n_n_e_l _d_a_t_a _a_n_d _d_a_t_a _t_r_a_n_s_f_o_r_m_a_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Normalizes and/or transforms dual-channel data 'xraw' of a given
     'cellHTS' object by applying the function defined in 'fun'. The
     default is to take the ratio between the second and first channels
     (r2/r1). Correction of plate-to-plate variations may also be
     performed.

_U_s_a_g_e:

     normalizeChannels(x, fun = function(r1,r2) r2/r1, log = FALSE, adjustPlates, zscore, posControls, negControls, ...)

_A_r_g_u_m_e_n_t_s:

       x: a cellHTS object that has already been configured. See
          details.

     fun: a function defined by the user to relate the signal in the
          two channels 'r1' and 'r2'. 'fun' takes two numeric vectors
          and returns a numeric vector of the same length. The default
          is to take the ratio between the second and first channels.

     log: a logical value indicating whether the result obtained after
          applying  'fun' should be 'log2' transformed. The default is
          'log = FALSE',  and the data is not 'log2' transformed.

adjustPlates: character string indicating the correction method to
          apply to adjust for plate-to-plate variations (and
          possibly well-to-well variations), after applying 'fun' and
          optionally log transforming the values. Allowed values are
          '"median"', '"mean"', '"shorth"', '"POC"', '"NPI"',
          '"negatives"' and '"Bscore"'. If 'adjustPlates' is missing
          (the default), no plate-wise correction will be performed.
          See details.

  zscore: indicates if the _z_-scores should be determined after
          normalization and transformation. If missing (default), the
          data will not be scored. Otherwise, it should be a character
          string, either "+" or "-", specifying the sign to use for the
          calculated  _z_-scores. See details.

posControls: a vector of regular expressions giving the name of the
          positive control(s). See details.

negControls: a vector of regular expressions giving the name of the
          negative control(s). See details.

     ...: Further arguments that get passed on to the function
          implementing the normalization method chosen by
          'adjustPlates'. Currently, this is only used for  'Bscore'. 

_D_e_t_a_i_l_s:

     For each plate and replicate of a two-color experiment, the
     function defined in 'fun' is applied to relate the intensity
     values in the two channels of the 'cellHTS' object. The default is
     to calculate the ratio between the second and the first channels,
     but other options can be defined.

     If 'log = TRUE', the data obtained after applying 'fun' is 'log2'
     transformed.  The default is 'log = FALSE'.
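
     As a minimal sketch of these two steps, the lines below apply the
     default ratio and a 'log2' transform to plain numeric vectors. The
     intensities and the names 'ratio' and 'difference' are made up
     for illustration; this is not the package's internal code.

```r
## Toy intensities for four wells in each channel (made-up values).
r1 <- c(100, 200, 400, 800)    # channel 1
r2 <- c(200, 200, 100, 1600)   # channel 2

## Default combination: ratio of the second to the first channel.
ratio <- function(r1, r2) r2 / r1
combined <- ratio(r1, r2)

## What log = TRUE adds on top: a log2 transform of the combined values.
combined_log <- log2(combined)

## A user-defined alternative (hypothetical): difference of the channels.
difference <- function(r1, r2) r2 - r1
combined_diff <- difference(r1, r2)
```

     Any function that takes two numeric vectors and returns a numeric
     vector of the same length can be used in the same way.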

     If 'adjustPlates' is not missing, the obtained values will be
     further corrected for plate effects by considering the chosen
     normalization method. The available options are:

        *  If 'adjustPlates="median"' (median scaling), plate effects
           are corrected by dividing each measurement  by the median
           value across wells annotated as 'sample' in 'x$wellAnno',
           for each plate and replicate. If the data values are in
           'log2' scale ('log=TRUE'), the per-plate factor is
           subtracted from each measurement, instead.

        *  If 'adjustPlates="mean"' (mean scaling), the mean of the
           'sample' wells is used instead of the median. If the data
           values are in 'log2' scale ('log=TRUE'), the per-plate
           factor is subtracted from each measurement, instead.

        *  If 'adjustPlates="shorth"' (scaling by the midpoint of the
           shorth), for each plate and replicate, the midpoint of the
           'shorth' of the distribution of values in the wells
           annotated as 'sample' is calculated. Then, every
           measurement is divided by this value (if 'log=FALSE'), or
           this value is subtracted from it (if 'log=TRUE', meaning
           that the data have been log transformed).

        *  If 'adjustPlates="POC"' (percent of control), for each plate
           and replicate, each measurement is divided by the average of
           the measurements on the plate positive controls, and
           multiplied by 100.

        *  If 'adjustPlates="negatives"', for each plate and replicate,
           each measurement is divided  by the median of the
           measurements on the plate negative controls. If the data
           values are in 'log2' scale ('log=TRUE'), the per-plate
           factor is subtracted from each measurement, instead.

        *  If 'adjustPlates="NPI"' (normalized percent inhibition),
           each measurement is subtracted from the average of the
           intensities on the plate positive controls, and this result
           is divided by the difference between  the means of the
           measurements on the positive and the negative controls.

        *  If 'adjustPlates="Bscore"' (Bscore), for each plate and
           replicate, the B score method is applied to remove plate
           effects and row and column biases.

     By default, 'adjustPlates' is missing.
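
     The arithmetic behind several of the options above can be sketched
     on a single toy plate. The values and annotations are made up, the
     helper names ('is_sample', 'poc', 'npi', ...) are hypothetical, and
     the code only illustrates the formulas as described here, not the
     package's implementation. The "shorth" and "Bscore" methods are not
     covered.

```r
## One toy plate/replicate: values after applying 'fun' (made-up numbers).
values   <- c(50, 100, 150, 200, 20, 180)
wellAnno <- c("sample", "sample", "sample", "sample", "pos", "neg")

is_sample <- wellAnno == "sample"
is_pos    <- wellAnno == "pos"
is_neg    <- wellAnno == "neg"

## Median scaling on the raw scale: divide by the median of the sample wells.
median_scaled <- values / median(values[is_sample])

## Median scaling on the log2 scale: subtract the per-plate median instead.
log_values        <- log2(values)
median_scaled_log <- log_values - median(log_values[is_sample])

## Percent of control: divide by the mean of the positive controls, times 100.
poc <- 100 * values / mean(values[is_pos])

## Normalized percent inhibition:
## (mean of positives - value) / (mean of positives - mean of negatives).
npi <- (mean(values[is_pos]) - values) /
  (mean(values[is_pos]) - mean(values[is_neg]))

## "negatives": divide by the median of the negative controls.
neg_scaled <- values / median(values[is_neg])
```

     After scaling, the sample wells have median 1 on the raw scale and
     median 0 on the log2 scale; with "NPI", the positive controls map
     to 0 and the negative controls to 1.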

     If 'zscore' is not missing, a robust _z_-score for each individual
     measurement will be determined for each plate and each well by
     subtracting the overall median and dividing by the overall mad.
     The overall median and mad are taken by considering the
     distribution of intensities (over all plates) in the wells whose
     content is annotated as 'sample'. The allowed values for 'zscore'
     ("+" or "-") are used to set the sign of  the calculated
     _z_-scores. For example, with 'zscore="-"' a strong decrease in
     the signal will be represented by a positive _z_-score, whereas
     with 'zscore="+"' such a phenotype will be represented by a
     negative _z_-score. Use this option to make the sign of the
     results match the commonly used convention.
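
     A minimal sketch of this scoring step follows, assuming that the
     "mad" is computed with R's 'mad' function and its default scaling
     constant. The data and the helper 'robust_z' are hypothetical and
     only illustrate the description above.

```r
## Normalized values pooled over all plates (made-up numbers).
xnorm    <- c(-0.2, 0.1, 0.0, 0.3, -0.1, -2.5, 0.2)
wellAnno <- c(rep("sample", 5), "pos", "sample")

## Overall centre and spread are estimated from the sample wells only.
samples <- xnorm[wellAnno == "sample"]
centre  <- median(samples)
spread  <- mad(samples)   # assumption: default constant (1.4826)

## Hypothetical helper: robust z-score with a user-chosen sign.
robust_z <- function(values, direction = c("+", "-")) {
  direction <- match.arg(direction)
  s <- if (direction == "+") 1 else -1
  s * (values - centre) / spread
}

z_plus  <- robust_z(xnorm, "+")
z_minus <- robust_z(xnorm, "-")
```

     Here the sixth well has a strongly decreased signal, so it gets a
     negative score in 'z_plus' and a positive score in 'z_minus'.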

     The arguments 'posControls' and/or 'negControls' are required for
     applying the normalization methods based on the control
     measurements (that is, when 'adjustPlates="POC"', or
     'adjustPlates="NPI"' or 'adjustPlates="negatives"'). 'posControls'
     and 'negControls' should be given as vectors of regular
     expression patterns specifying the names of the positive and
     negative controls, respectively, as provided in the plate
     configuration file (and stored in 'x$wellAnno'). The length of
     these vectors should be equal to the final number of reporters,
     which in this case is always one. By default, if 'posControls' is
     not given, "pos" will be taken as the name for the wells
     containing positive controls. Similarly, if 'negControls' is
     missing, by default "neg" will be considered as the name used to
     annotate the negative controls. The content of 'posControls' and
     'negControls' will be passed to 'regexpr' for pattern matching
     within the well annotation given in 'x$wellAnno' (see examples). 
     The arguments 'posControls' and 'negControls' are particularly
     useful for multi-channel data, since the controls might be
     reporter-specific, and after multi-channel data have been
     normalized to a single channel.
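
     The pattern matching can be sketched with 'regexpr' on a made-up
     annotation vector. The use of 'perl = TRUE' is an assumption here:
     it is needed for the inline '(?i)' flag to work, but how the
     package itself calls 'regexpr' is not shown on this page.

```r
## Made-up well annotation, standing in for x$wellAnno.
wellAnno <- c("sample", "geneA", "GeneA", "geneAB", "geneB", "geneC", "pos")

## Negative controls: geneA (any case) or geneB, matched as whole names.
negControls <- "(?i)^geneA$|^geneB$"
is_neg <- regexpr(negControls, wellAnno, perl = TRUE) > 0
neg_wells <- wellAnno[is_neg]

## Default positive-control pattern when 'posControls' is not given.
is_pos <- regexpr("pos", wellAnno, perl = TRUE) > 0
pos_wells <- wellAnno[is_pos]
```

     The anchors keep "geneAB" out of the negative controls, which an
     unanchored pattern such as "geneA" would not.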

_V_a_l_u_e:

     An object of class 'cellHTS', which is a copy of the argument 'x',
     plus an additional slot 'xnorm' containing the normalized data.
     This is an array of the same dimensions as 'xraw', except in the
     dimension corresponding to the number of channels, since the
     two-channel intensities have been combined into one intensity
     value.
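
     The change in dimensions can be sketched with a toy array. The
     layout (wells x plates x replicates x channels) is an assumption
     for illustration; only the position of the channel dimension
     (fourth) is taken from the examples on this page.

```r
## Toy raw data: 4 wells, 2 plates, 2 replicates, 2 channels (assumed order).
xraw <- array(seq_len(4 * 2 * 2 * 2), dim = c(4, 2, 2, 2))

## Combine the two channels with the default ratio r2/r1.
fun <- function(r1, r2) r2 / r1
combined <- fun(xraw[, , , 1], xraw[, , , 2])

## Store the result with a channel dimension of length one.
xnorm <- array(combined, dim = c(dim(xraw)[1:3], 1))
dim(xnorm)
```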

     Moreover, the processing status of the 'cellHTS' object is updated
     in the slot 'state' to 'x$state["normalized"]=TRUE'.  

     Additional outputs may be given if 'adjustPlates="Bscore"'. Please
     refer to the help page of the 'Bscore' function.

_A_u_t_h_o_r(_s):

     Ligia Braz ligia@ebi.ac.uk, Wolfgang Huber huber@ebi.ac.uk

_S_e_e _A_l_s_o:

     'normalizePlates', 'summarizeChannels', 'Bscore'

_E_x_a_m_p_l_e_s:

      ## Not run: 
         datadir <- system.file("DualChannelScreen", package = "cellHTS")
         x <- readPlateData("Platelist.txt", "TwoColorData", path=datadir)
         x <- configure(x, "Plateconf.txt", "Screenlog.txt", "Description.txt", path=datadir)
         table(x$wellAnno)

         ## Define the controls for the different channels:
         negControls=vector("character", length=dim(x$xraw)[4])

         ## channel 1 - geneA
         ## The pattern is case-insensitive, and the anchors ^ and $
         ## match the start and end of the annotation (to distinguish
         ## between "geneA" and "geneAB", for example, although this is
         ## not a problem for the well annotation in this example)
         negControls[1]= "(?i)^geneA$"
         ## channel 2 - geneA and geneB
         negControls[2]= "(?i)^geneA$|^geneB$"
         posControls = vector("character", length=dim(x$xraw)[4])
         ## channel 1 - no controls
         ## channel 2 - geneC and geneD
         posControls[2]="(?i)^geneC$|^geneD$"

         writeReport(x, posControls=posControls, negControls=negControls)
         x = normalizeChannels(x, fun=function(x,y) y/x, log=TRUE, adjustPlates="median")
         ## Define the controls for the normalized intensities (only one channel):
         negControls = vector("character", length=dim(x$xnorm)[4])
         ## For the single channel, the negative controls are geneA and geneB 
         negControls[1]= "(?i)^geneA$|^geneB$" 
         posControls = vector("character", length=dim(x$xnorm)[4])
         ## For the single channel, the positive controls are geneC and geneD
         posControls[1]="(?i)^geneC$|^geneD$"
         writeReport(x, force=TRUE, plotPlateArgs=list(xrange=c(-3,3)), 
              posControls=posControls, negControls=negControls)
      ## End(Not run)

