Agi4x44PreProcess-package package:Agi4x44PreProcess R Documentation

_P_r_e_P_r_o_c_e_s_s_i_n_g _o_f _A_g_i_l_e_n_t _4_x_4_4 _a_r_r_a_y _d_a_t_a

_D_e_s_c_r_i_p_t_i_o_n:

     Agi4x44PreProcess Package Overview

_D_e_t_a_i_l_s:

     The package preprocesses Agilent 4x44 array data produced by the
     Agilent Feature Extraction (AFE) image analysis software. The AFE
     extracts foreground and background signals, as well as some
     quality flags. All the extracted information is assembled into
     the components of an 'RGList' object (see the 'limma' package).

     The preprocessing includes background correction, normalization,
     and filtering of probes according to the different quality flags
     that are produced by the AFE.
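
     As a rough illustration of flag-based filtering, the base R sketch
     below keeps a probe only when a quality flag equals 1 in at least
     a given percentage of the arrays. The matrix 'flag', the helper
     'keep_by_flag' and the 75 percent limit are assumptions made for
     this example. It is not the package's own filtering routine, which
     may evaluate the flags differently (for instance within
     experimental conditions).

```r
# Hypothetical 0/1 flag matrix: rows are probes, columns are arrays.
flag <- matrix(c(1, 1, 1, 1,
                 1, 0, 0, 0,
                 1, 1, 1, 0),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("p1", "p2", "p3"), paste0("array", 1:4)))

# Hypothetical helper: TRUE for probes whose flag is 1 in at least
# `limit` percent of the arrays.
keep_by_flag <- function(flag, limit = 75) {
  pct <- 100 * rowMeans(flag == 1)
  pct >= limit
}

keep <- keep_by_flag(flag, limit = 75)
filtered <- flag[keep, , drop = FALSE]
print(rownames(filtered))
```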

     A 'targets' file and the corresponding data files produced by the
     AFE image analysis software are required as inputs.
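
     The exact layout of the targets file is not described here. As a
     hedged sketch only, the base R code below writes and re-reads a
     small tab-delimited table of the kind commonly used with 'limma',
     where a 'FileName' column lists the data files. The 'Treatment'
     column, the file names and the object names are hypothetical and
     may not match what the package expects.

```r
# Hypothetical targets table; only the FileName column follows the limma
# convention, the other column and all values are made up for illustration.
targets_example <- data.frame(
  FileName  = c("sample1.txt", "sample2.txt", "sample3.txt", "sample4.txt"),
  Treatment = c("control", "control", "treated", "treated"),
  stringsAsFactors = FALSE
)

# Write it as a tab-delimited text file and read it back.
path <- tempfile(fileext = ".txt")
write.table(targets_example, file = path, sep = "\t",
            quote = FALSE, row.names = FALSE)
targets_read <- read.delim(path, stringsAsFactors = FALSE)
print(targets_read)
```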

     The preprocessing steps are the following:

        - Reading the targets file
        - Reading the array data samples obtained with AFE
        - Background correction
        - Normalization between samples
        - Filtering probes by their quality flag
        - Summarizing replicated probes
        - Creating an ExpressionSet object with the processed data
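
     To make the background correction and normalization steps more
     concrete, here is a minimal base R sketch. It assumes a 'half'
     style correction (differences below 0.5 are reset to 0.5 before
     an offset is added) followed by quantile normalization of the
     log2 signals. The matrices 'fg' and 'bg' and the helper
     'quantile_normalize' are hypothetical, and the code is not the
     implementation used by the package or by 'limma'.

```r
# Hypothetical raw signals: rows are probes, columns are arrays.
fg <- matrix(c(120, 300, 40, 1500,
               150, 280, 60, 1700,
               100, 350, 90, 1400),
             nrow = 4,
             dimnames = list(paste0("p", 1:4), paste0("array", 1:3)))
bg <- matrix(50, nrow = 4, ncol = 3, dimnames = dimnames(fg))

# 'half' style correction: subtract the background, reset anything
# below 0.5 to 0.5, then add a constant offset.
offset <- 50
corrected <- pmax(fg - bg, 0.5) + offset

# Quantile normalization: every array gets the same distribution,
# namely the mean of the sorted columns.
quantile_normalize <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")
  sorted <- apply(x, 2, sort)
  target <- rowMeans(sorted)
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}

normalized <- quantile_normalize(log2(corrected))
print(round(normalized, 3))
```

     After this step every column of 'normalized' holds the same set
     of values, only assigned to different probes.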

     The package also contains two functions that let users explore
     the architecture of the chip in terms of probe replication and
     gene replication. The first identifies non-control replicated
     probes (Probe Sets) that are spread over the chip, with the
     purpose of evaluating its reproducibility. The second picks those
     genes (according to the ACCNUM code obtained from the
     corresponding Bioconductor annotation package) that are
     interrogated by different probes in different locations. These
     groups of genes are termed 'Gene Sets'.
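
     One common way to assess the reproducibility of replicated probes
     is the coefficient of variation (CV) of their signals. The base R
     sketch below computes a CV per replicated probe on one array. The
     vectors 'probe' and 'signal' are hypothetical, and the calculation
     is an assumption about what such a summary could look like, not
     the package's own code.

```r
# Hypothetical signals from one array; probes "A" and "B" are replicated,
# probe "C" appears only once.
probe  <- c("A", "A", "A", "B", "B", "C")
signal <- c(100, 110, 90, 200, 220, 50)

# Keep only probes that occur more than once.
counts <- table(probe)
replicated <- probe %in% names(counts[counts > 1])

# Coefficient of variation (in percent) within each replicated probe.
cv_percent <- tapply(signal[replicated], probe[replicated],
                     function(x) 100 * sd(x) / mean(x))
print(round(cv_percent, 2))
```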

     The package also contains standard graphical microarray utilities
     that let users evaluate the quality of the data. These graphics
     also help to decide which of the foreground and background
     signals provided by the AFE should be used in the analysis. A
     graphical inspection of the data might also help to decide which
     background correction and between-sample normalization are most
     suitable.

     There are also utility functions that write files at different
     stages of the processing protocol. These files include the probe
     list, with information such as the quality flag and normalized
     intensity of each probe and the corresponding information
     obtained from the annotation package.

_A_u_t_h_o_r(_s):

     Pedro Lopez-Romero  plopez@cnic.es

_R_e_f_e_r_e_n_c_e_s:

     Agilent Feature Extraction Reference Guide. <URL:
     http://www.Agilent.com>

     Gordon K. Smyth, M. Ritchie, N. Thorne, J. Wettenhall (2007).
     limma: Linear Models for Microarray Data User's Guide.

     Bolstad, B. M. (2001), Probe level quantile normalization of high
     density oligonucleotide array data. Unpublished Manuscript: <URL:
     http://bmbolstad.com/stuff/qnorm.pdf>

     Bolstad, B. M., Irizarry R. A., Astrand, M., and Speed, T. P.
     (2003), A comparison of normalization methods for high density
     oligonucleotide array data based on bias and variance.
     Bioinformatics 19, 185-193.

     Smyth, G. K. (2005). Limma: linear models for microarray data. In:
     'Bioinformatics and Computational Biology Solutions Using R and
     Bioconductor'. R. Gentleman, V. Carey, S. Dudoit, R. Irizarry, W.
     Huber (eds), Springer, New York, pages 397-420.

_E_x_a_m_p_l_e_s:

     ## Reading the targets file and the Agilent Feature Extraction
     ## data files
     ## Not run:
     targets <- read.targets(infile = "targets.txt")
     dd <- read.AgilentFE(targets, makePLOT = TRUE)
     ## End(Not run)

     ## Loading the 'dd' and 'targets' data sets
     ## Not run:
     data(dd)
     data(targets)
     ## End(Not run)

     ## Non-control replicated probes
     ## Not run:
     CV.rep.probes(dd, "hgug4112a.db", foreground = "MeanSignal",
                   raw.data = TRUE, writeR = TRUE, targets)
     ## End(Not run)

     ## Replicated genes - ensembl
     ## Not run:
     genes.rpt.agi(dd, annotation.package = "hgug4112a.db",
                   raw.data = TRUE, WRITE.html = TRUE, REPORT = TRUE)
     ## End(Not run)

     ## Normalization (here the foreground and background signals are
     ## chosen)
     ## Not run:
     ddNORM <- BGandNorm(dd, BGmethod = 'half', NORMmethod = 'quantile',
                         foreground = 'MeanSignal',
                         background = 'BGMedianSignal',
                         offset = 50,
                         makePLOTpre = TRUE, makePLOTpost = TRUE)
     ## End(Not run)

     ## Filtering probes
     ## Not run:
     ddFILT <- filter.probes(ddNORM,
                             control = TRUE,
                             wellaboveBG = TRUE,
                             isfound = TRUE,
                             wellaboveNEG = TRUE,
                             sat = TRUE,
                             PopnOL = TRUE,
                             NonUnifOL = TRUE,
                             nas = TRUE,
                             limWellAbove = 75,
                             limISF = 75,
                             limNEG = 75,
                             limSAT = 75,
                             limPopnOL = 75,
                             limNonUnifOL = 75,
                             limNAS = 100,
                             makePLOT = TRUE,
                             annotation.package = "hgug4112a.db",
                             flag.counts = TRUE,
                             targets)
     ## End(Not run)

     ## Summarizing probes
     ## Not run:
     ddPROC <- summarize.probe(ddFILT, makePLOT = TRUE, targets)
     ## End(Not run)

     ## Creating the ExpressionSet object
     ## Not run:
     esetPROC <- build.eset(ddPROC, targets, makePLOT = TRUE,
                            annotation.package = "hgug4112a.db")
     dim(esetPROC)
     ## End(Not run)

     ## Writing the ExpressionSet object: ProcessedData.txt
     ## Not run:
     write.eset(esetPROC, ddPROC, "hgug4112a.db", targets)
     ## End(Not run)

     ## Mapping variable
     ## Not run:
     mappings <- build.mappings(esetPROC,
                                annotation.package = "hgug4112a.db")
     names(mappings)
     ## End(Not run)

     ## Gene Set Enrichment Analysis at: http://www.broad.mit.edu/gsea
     ## Not run:
     gsea.files(esetPROC, targets, annotation.package = "hgug4112a.db")
     ## End(Not run)

