qualityParameters         package:arrayMagic         R Documentation

_C_a_l_c_u_l_a_t_i_o_n _o_f _q_u_a_l_i_t_y _c_h_a_r_a_c_t_e_r_i_s_t_i_c_s _f_o_r
_D_N_A _m_i_c_r_o_a_r_r_a_y _h_y_b_r_i_d_i_s_a_t_i_o_n_s

_D_e_s_c_r_i_p_t_i_o_n:

     Several quality measures are calculated. The return value, i.e. a
     list of quality scores, should be used as input argument for the
     function 'qualityDiagnostics'. For details on the quality measures
     read the value section.

_U_s_a_g_e:

     qualityParameters(arrayDataObject, exprSetRGObject, spotIdentifier = "Name", slideNameColumn = "slideName", identifiersToBeSkipped=NA, resultFileName, verbose = TRUE)

_A_r_g_u_m_e_n_t_s:

arrayDataObject: object of type 'arrayData'; required; default: missing

exprSetRGObject: object of type 'exprSetRG'; required; default: missing

spotIdentifier: character string; required; specifies a column of
          'getSpotAttr(arrayDataObject)'; the column is used to
          determine spot replicas; default: "Name"

slideNameColumn: character string; required; specifies a column of
          'getHybAttr(arrayDataObject)'; the column is used to extract
          the names of the hybridisations; if not found the
          hybridisations are consecutively numbered; default:
          "slideName"

identifiersToBeSkipped: vector of character strings of spot identifiers
          to be excluded from calculations; required; default: 'NA'

resultFileName: character string; results are stored in a
          tab-deliminated file if supplied; default: missing

 verbose: logical; default 'TRUE'

_D_e_t_a_i_l_s:

     For details on the quality measures read the value section.

_V_a_l_u_e:

     returns a list of results, i.e. a 'data.frame' 'qualityParameters'
     containing several scores for each hybridisation, as well as
     pairwise comparisons, i.e. a matrix 'slideDistance', a matrix
     'slideDistanceLogRaw', a matrix 'slideDistanceGreen', a matrix
     'slideDistanceGreenLogRaw', a matrix 'slideDistanceRed', a matrix
     'slideDistanceRedLogRaw', and an integer 'replicateSpots', i.e.
     the number of detected spot replicas.

     The matrix 'slideDistanceLogRaw' contains a  calculated distance
     (similarity) for each pair of slides$_{ij}$, i.e. the median
     absolute deviation (mad) taken over all spots of the log-ratio of
     the raw data; alike the matrix 'slideDistance' the mad taken over
     all spots of the difference of the log-ratios  (here: the
     difference of the normalised and transformed expression values of
     the two channels on the slide). Similarly the matrices 
     'slideDistanceGreen', 'slideDistanceGreenLogRaw',
     'slideDistanceRed', and 'slideDistanceRedLogRaw' contain
     calculated distances for each pair of slides$_{ij}$ based on the
     mad of the difference of the same channel (normalised or logged)
     taken over all spots.

     A brief summary of all parameters given in the 'data.frame'
     'qualityParameters':

     'width' a robust estimate of the noise, i.e. the median absolute
     deviation of the difference of the normalised channels taken over
     all spots, i.e. the "width" of the scatterplot

     'medianDistance' a robust measure for the typical distance
     (similarity) of one slide with all other slides, i.e. the median
     of the "distances" between slides (c.f. 'slideDistance'))

     'correlation(LogRaw)' of the expression values between the two
     normalised (log raw) channels of the slide taken over all spots

     'meanSignalGreen' the mean taken over all spots of the green raw
     data channel 

     'meanSignalRed' the mean taken over all spots of the red raw data
     channel

     'meanSignal' mean taken over all spots of the raw data of both
     channels,

     'signalRangeGreen' the range between the 10th and 95th percentile
     of the signal intensities given in the green raw data channel

     'signalRangeRed' the range between the 10th and 95th percentile of
     the signal intensities given in the red raw data channel

     'backgroundRangeGreen' the range between the 10th and 95th
     percentile of the background intensities given in the green raw
     data channel

     'backgroundRangeRed' the range between the 10th and 95th
     percentile of the background intensities given in the red raw data
     channel

     'signalToBackgroundGreen' the ratio of the median signal intensity
     and the median background intensity given in the green raw data
     channel

     'signalToBackgroundRed' the ratio of the median signal intensity
     and the median background intensity given in the red raw data
     channel

     'spotReplicatesConcordanceGreen(LogRaw)' the median of the
     standard deviations of all spot replicas for each unique
     identifier of the normalised (log raw) green channel is
     calculated; in case of duplicates, i.e. 'replicateSpots == 2', the
     Pearson and Spearman correlation is calculated instead

     'spotReplicatesConcordanceGreen(LogRaw)' the median of the
     standard deviations of all spot replicas for each unique
     identifier of the normalised (log raw) green channel is
     calculated; in case of duplicates, i.e. 'replicateSpots == 2', the
     Pearson and Spearman correlation is calculated instead

     'greenvsAllGreen' and 'redvsAllRed' the correlation between each
     channel is measured against the averaged (median) channel over all
     hybridisations (like a virtual reference) separately for each
     channel

_A_u_t_h_o_r(_s):

     Andreas Buness <a.buness@dkfz.de>

_S_e_e _A_l_s_o:

     'qualityDiagnostics'

_E_x_a_m_p_l_e_s:

           spotIdentifierVec <- c("A","A","Blank","B","B","Blank")
           hybNames <- "H1"
           R1 <- N1 <- c(1,1,9,2,2,10)
           R2 <- N2 <- c(2,2,7,4,4,8)
           rawDataIntensityValues <- array(0, dim=c(6,2,1))
           rawDataIntensityValues[,1,] <- R1
           rawDataIntensityValues[,2,] <- R2
           dimnames(rawDataIntensityValues) <- list(NULL, c("green","red"), NULL)
           spotAttr <- data.frame(Name=I(spotIdentifierVec))
           hybAttr <- data.frame(slideName=I(hybNames))
           arrayDataObject <- new("arrayData", intensities=rawDataIntensityValues, hybAttrList=list(red=hybAttr,green=hybAttr), spotAttr=spotAttr)
           indGreen <- 1
           indRed <- 2
           channels <- matrix( c(indGreen,indRed), nrow=length(indGreen), byrow=FALSE )
           colnames(channels) <- c("green","red")
           exprSetRGObject <- new("exprSetRG",       
           exprs <- matrix(c(R1,R2), nrow=6, byrow=FALSE), phenoData=        
               new("phenoData", pData=data.frame(matrix(0,nrow=2,ncol=1)),
                   varLabels=list(rep("varLabel1",1))), channels=channels)           
           Re1 <- qualityParameters(arrayDataObject=arrayDataObject, exprSetRGObject=exprSetRGObject, identifiersToBeSkipped= "Blank")
           stopifnot(all.equal.numeric(as.numeric(Re1$qualityParameters["H1",c("correlation")]),c(1)))
           stopifnot(Re1$replicateSpots==2)

        

             

