BASH                package:beadarray                R Documentation

_B_A_S_H - _B_e_a_d_A_r_r_a_y _S_u_b_v_e_r_s_i_o_n _o_f _H_a_r_s_h_l_i_g_h_t

_D_e_s_c_r_i_p_t_i_o_n:

     BASH is an automatic detector of physical defects on an array. It
     is designed to detect three types of defect - COMPACT, DIFFUSE and
     EXTENDED.

_U_s_a_g_e:

     BASH(BLData, array, compact = TRUE, diffuse = TRUE, extended = TRUE,
          log = TRUE, cinvasions = 10, dinvasions = 15, einvasions = 20,
          bgcorr = "median", maxiter = 10, compcutoff = 8,
          compdiscard = TRUE, diffcutoff = 10, diffsig = 0.0001, diffn = 3,
          difftwotail = FALSE)

_A_r_g_u_m_e_n_t_s:

  BLData: A 'BeadLevelList' object.

   array: integer specifying which strip/array to analyse.
          Alternatively you can supply a vector of strip/array IDs, and
          BASH will analyse each in turn.

 compact: Logical - Perform compact analysis?

 diffuse: Logical - Perform diffuse analysis?

extended: Logical - Perform extended analysis?

     log: Logical - Perform analyses on the log scale? (recommended)

cinvasions: Integer - number of invasions used whenever closing the
          image - see 'BASHCompact'

dinvasions: Integer - number of invasions used in diffuse analysis, to
          find the kernel - see 'BASHDiffuse'

einvasions: Integer - number of invasions used when filtering the error
          image - see 'BGFilter'.

  bgcorr: One of "none", "median", "medianMAD" - Used in diffuse
          analysis, this determines how we attempt to compensate for
          the background varying across an array. For example, on a SAM
          array this should be left at "median", or maybe even switched
          to "none", but if analysing a large beadchip then you might
          consider setting this to "medianMAD". (This value is passed
          to the 'method' argument of 'BGFilter'.)

 maxiter: Integer - Used in compact analysis - the max number of
          iterations allowed. (Exceeding this results in a warning.)

compcutoff: Integer - the threshold used to determine whether a group
          of outliers is in a compact defect. In other words, if a
          group of at least this many connected outliers is found, then
          it is labelled as a compact defect.

compdiscard: Logical - should we discard compact defect beads before
          doing the diffuse analysis?

diffcutoff: Integer - this is the threshold used to determine the
          minimum size that clusters of diffuse defects must be.

 diffsig: Probability - The significance level of the binomial test
          performed in the diffuse analysis.

   diffn: Numerical - when finding outliers on the diffuse error image,
          how many MADs away from the median an intensity must be for
          it to be labelled an outlier.

difftwotail: Logical - If TRUE, then in the diffuse analysis, we
          consider the high outlier and low outlier images separately.
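
     As a rough illustration of the 'diffn' and 'difftwotail'
     arguments, the sketch below flags values lying more than 'diffn'
     MADs from the median, and also splits them into high and low
     outliers. The intensities are made up, and the use of R's 'mad()'
     (which applies a consistency constant of 1.4826 by default) is an
     assumption; 'BASHDiffuse' may compute its MAD differently.

```r
# Toy error-image intensities (assumed values, not real data).
intensity <- c(10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 14.0)
diffn <- 3                      # the BASH default

centre <- median(intensity)
spread <- mad(intensity)        # assumption: scaled MAD as returned by stats::mad

# Single outlier image (as with difftwotail = FALSE)
isOutlier <- abs(intensity - centre) > diffn * spread

# Separate high and low outlier images (as with difftwotail = TRUE)
isHigh <- (intensity - centre) > diffn * spread
isLow  <- (centre - intensity) > diffn * spread
```

     Lowering 'diffn' widens the set of beads treated as outliers,
     which is why a smaller value feeds more beads into the diffuse
     analysis.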

_D_e_t_a_i_l_s:

     The 'BASH' pipeline function performs three types of defect
     analysis on an image.

     The first, COMPACT DEFECTS, finds large clusters of outliers, as
     per 'BASHCompact'. The outliers are found using
     'findAllOutliers()'. We then find which outliers are clustered
     together. This process is iterative - having found a compact
     defect, we remove it, and then see if any more defects are found.
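
     The clustering step can be pictured with a minimal base-R sketch.
     This is not the code used by 'BASHCompact': the helper name
     'findClusters', the toy neighbour list and the outlier flags are
     all hypothetical. It groups connected outliers by flood fill and
     keeps only groups of at least 'cutoff' beads (the role played by
     'compcutoff'). It shows a single pass, not the iteration.

```r
# Hypothetical helper (not part of beadarray): label connected groups of
# outlier beads and keep only groups with at least `cutoff` members.
findClusters <- function(neighbours, outlier, cutoff) {
  label <- integer(length(outlier))   # 0 = not yet assigned to a group
  current <- 0L
  for (start in which(outlier)) {
    if (label[start] != 0L) next      # already reached from an earlier bead
    current <- current + 1L
    label[start] <- current
    queue <- start
    while (length(queue) > 0) {
      bead <- queue[1]
      queue <- queue[-1]
      for (nb in neighbours[[bead]]) {
        if (outlier[nb] && label[nb] == 0L) {
          label[nb] <- current
          queue <- c(queue, nb)
        }
      }
    }
  }
  sizes <- tabulate(label[label > 0L], nbins = current)
  label %in% which(sizes >= cutoff)   # TRUE for beads inside a large group
}

# Toy data: six beads in a chain, 1-2-3-4-5-6
neighbours <- list(2L, c(1L, 3L), c(2L, 4L), c(3L, 5L), c(4L, 6L), 5L)
outlier <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE)
inDefect <- findClusters(neighbours, outlier, cutoff = 3)
```

     Here beads 1 to 3 form a connected group of three outliers and
     are kept, while the isolated outlier at bead 5 is not.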

     The second, DIFFUSE DEFECTS, finds areas which are densely
     populated with outliers (which are not necessarily connected), as
     per 'BASHDiffuse'. To make this type of defect more obvious, we
     first generate an ERROR IMAGE, and then find outliers based on
     this image. (The error image is calculated by using 'method =
     "median"' and 'bgfilter = "medianMAD"' in 'generateE', unless
     'ebgcorr = FALSE' in which case we use 'bgfilter = "median"'.) Now
     we consider a neighbourhood around each bead and count the number
     of outlier beads in this region. Using a binomial test we
     determine whether this is more than we would expect if the
     outliers were evenly spread over the entire array. If so, we mark
     it as a diffuse defect. (A clustering algorithm similar to the
     compact defect analysis is run to reduce false positives.)
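
     The binomial test for a single neighbourhood can be sketched as
     follows. All counts are invented for illustration, and the use of
     a one-sided test via 'binom.test' is an assumption about how the
     comparison might be done, not a description of the internals of
     'BASHDiffuse'.

```r
# Assumed toy numbers, not taken from any real array.
totalBeads    <- 10000   # beads on the array
totalOutliers <- 500     # outliers found on the error image
p0 <- totalOutliers / totalBeads   # outlier rate expected if evenly spread

nbhdSize     <- 40       # beads in the neighbourhood of one bead
nbhdOutliers <- 15       # outliers counted in that neighbourhood
diffsig      <- 0.0001   # significance level (the BASH default)

# One-sided test: does this neighbourhood hold more outliers than the
# array-wide rate predicts?
pval <- binom.test(nbhdOutliers, nbhdSize, p = p0,
                   alternative = "greater")$p.value
isDiffuse <- pval < diffsig
```

     With an expected count of 2 outliers in 40 beads, observing 15 is
     far beyond the 'diffsig' threshold, so this neighbourhood would
     be marked.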

     After each of these two analyses, we "close" the image, filling in
     gaps.
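
     "Closing" can be pictured as growing the flagged region by a
     number of invasions and then shrinking it by the same number. The
     sketch below is a generic morphological closing on a neighbour
     list; the helpers 'dilate', 'erode' and 'closeImage' are
     hypothetical and the closing performed by BASH may differ in
     detail.

```r
# Hypothetical helpers (not beadarray functions).
dilate <- function(flag, neighbours) {
  grown <- flag
  for (i in which(flag)) grown[neighbours[[i]]] <- TRUE
  grown
}
# Eroding the flagged set is the same as dilating the unflagged set.
erode <- function(flag, neighbours) !dilate(!flag, neighbours)

closeImage <- function(flag, neighbours, invasions = 1) {
  for (k in seq_len(invasions)) flag <- dilate(flag, neighbours)
  for (k in seq_len(invasions)) flag <- erode(flag, neighbours)
  flag
}

# Toy chain of nine beads; beads 3, 4, 6 and 7 are flagged, bead 5 is a gap.
neighbours <- lapply(1:9, function(i) setdiff(c(i - 1L, i + 1L), c(0L, 10L)))
flag <- c(FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE)
closed <- closeImage(flag, neighbours, invasions = 1)
```

     After closing, the gap at bead 5 is filled while the outer
     boundary of the flagged region is unchanged.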

     The third, EXTENDED DEFECTS, returns a score estimating how much
     the background is changing across an array, as per 'BASHExtended'.
     To estimate the background intensity, we generate an error image
     using the median filter (i.e. 'generateE' with 'method = "median"'
     and 'bgfilter = "median"'). We divide the variance of this by the
     variance of an error image without using the median filter, to
     obtain our extended score.
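
     The idea behind the variance ratio can be shown with a
     one-dimensional analogue. This is only a sketch: it uses
     simulated values and 'runmed' as a stand-in for the median
     filter, and the helper 'extScore' is hypothetical. It does not
     call 'generateE' or reproduce the exact score returned by
     'BASHExtended'.

```r
set.seed(1)                                      # reproducible toy data
n <- 500
noise <- rnorm(n, sd = 0.5)
flatE     <- noise                               # no background trend
gradientE <- seq(-2, 2, length.out = n) + noise  # background drifts along the row

# Hypothetical score: variance of the median-smoothed values divided by
# the variance of the unsmoothed values.
extScore <- function(E, k = 51) var(as.numeric(runmed(E, k))) / var(E)

scoreFlat     <- extScore(flatE)
scoreGradient <- extScore(gradientE)
```

     When the background is flat, smoothing removes most of the
     variance and the ratio is small. When the background drifts, much
     of the variance survives smoothing and the ratio is larger.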

     It should be noted that to avoid repeated computation of distance,
     a "neighbours" matrix is used in the analysis. This matrix
     describes which beads are close to other beads. If a large number
     of beads are missing (for example, if beads with ProbeID = 0 were
     discarded) then this algorithm may be affected.
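
     To show what a precomputed neighbour structure buys, the sketch
     below computes all pairwise distances once and stores which beads
     are close to each other. The coordinates and the 'radius'
     threshold are invented, and the layout of the matrix produced by
     'generateNeighbours' may well differ from this logical matrix.

```r
# Toy bead centres (assumed coordinates).
x <- c(0, 1, 2, 0, 10)
y <- c(0, 0, 0, 1, 10)

d <- as.matrix(dist(cbind(x, y)))    # pairwise distances, computed once
radius <- 1.5                        # hypothetical closeness threshold
isNeighbour <- d <= radius & d > 0   # a bead is not its own neighbour

neighbourCount <- rowSums(isNeighbour)
```

     Bead 5 sits far from the others and ends up with no neighbours.
     In the same way, removing many beads from an array leaves the
     remaining beads with sparse neighbourhoods.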

     For more detailed descriptions of the algorithms, read the help
     files of the respective functions listed under "See Also".

_V_a_l_u_e:

     The output is a list with three components:

     wts: A list, where the ith object in the list corresponds to the
     weights for array i.

     ext: A vector of extended scores ('NULL' if the extended analysis
     was disabled).

     call: The function call used to run BASH.

_A_u_t_h_o_r(_s):

     Jonathan Cairns

_R_e_f_e_r_e_n_c_e_s:

     Mayte Suarez-Farinas, Maurizio Pellegrino, Knut M. Wittkowski and
     Marcelo O. Magnasco (2007). Harshlight: A "corrective make-up"
     program for microarray chips. R package version 1.8.0.
     http://asterion.rockefeller.edu/Harshlight/

_S_e_e _A_l_s_o:

     'BASHCompact', 'BASHDiffuse', 'BASHExtended', 'generateE',
     'generateNeighbours'

_E_x_a_m_p_l_e_s:

     data(BLData)
     output <- BASH(BLData, array = 1:4)
     boxplot(output$ext)  # view spread of extended scores
     for (i in 1:4) {
       # apply BASH weights to BLData
       BLData <- setWeights(BLData, output$wts[[i]], i)
     }

     # diffuse test is stricter
     output <- BASH(BLData, diffsig = 0.00001, array = 1)

     # more outliers on the error image are used in the diffuse analysis
     output <- BASH(BLData, diffn = 2, array = 1)

     # only perform compact & diffuse analyses (we will only get weights)
     output <- BASH(BLData, extended = FALSE, array = 1)

     # attempt to correct for background
     output <- BASH(BLData, bgcorr = "median", array = 1)

