readCel              package:affxparser              R Documentation

_R_e_a_d_s _a_n _A_f_f_y_m_e_t_r_i_x _C_E_L _f_i_l_e

_D_e_s_c_r_i_p_t_i_o_n:

     This function reads all or a subset of the data in an Affymetrix 
     CEL file.

_U_s_a_g_e:

     readCel(filename, 
             indices = NULL, 
             readHeader = TRUE, 
             readXY = FALSE, readIntensities = TRUE,
             readStdvs = FALSE, readPixels = FALSE,
             readOutliers = TRUE, readMasked = TRUE, 
             readMap = NULL,
             verbose = 0,
             .checkArgs = TRUE)

_A_r_g_u_m_e_n_t_s:

filename: the name of the CEL file.

 indices: a vector of indices indicating which features to read. If the
          argument is 'NULL' all features will be returned.

  readXY: a logical: will the (x,y) coordinates be returned.

readIntensities: a logical: will the intensities be returned.

readStdvs: a logical: will the standard deviations be returned.

readPixels: a logical: will the number of pixels be returned.

readOutliers: a logical: will the outliers be returned.

readMasked: a logical: will the masked features be returned.

readHeader: a logical: will the header of the file be returned.

 readMap: A 'vector' remapping cell indices to file indices.  If
          'NULL', no mapping is used.

 verbose: how verbose do we want to be. 0 is no verbosity, higher
          numbers mean more verbose output. At the moment the values 0,
          1 and 2 are supported.

.checkArgs: If 'TRUE', the arguments will be validated, otherwise not.
          _Warning: Set this to 'FALSE' only if the arguments have been
          validated elsewhere!_

_V_a_l_u_e:

     A CEL file consists of a _header_, a set of _cell values_, and
     information about _outliers_ and 'masked' cells.

     The cell values, which are values extracted for each cell (aka
     feature or probe), are the (x,y) coordinate, intensity and
     standard deviation estimates, and the number of pixels in the
     cell.  If 'indices=NULL', cell values for all cells are
     returned; otherwise, only cell values specified by argument
     'indices' are returned.

     This function returns a named list with components described below:

'header': The header of the CEL file. Equivalent to the  output from
          'readCelHeader', see the documentation for that function.

     x,y: (cell values) Two 'integer' vectors containing the x and y
          coordinates associated with each feature.

'intensities': (cell value) A  'numeric' vector containing the
          intensity associated with each feature.

   stdvs: (cell value) A 'numeric' vector containing  the standard
          deviation associated with each feature.

  pixels: (cell value) An 'integer' vector containing  the number of
          pixels associated with each feature.

outliers: An 'integer' vector of indices specifying which of the
          queried cells are flagged as outliers. Note that there
          is a difference between 'outliers=NULL' and
          'outliers=integer(0)'; the latter case happens when
          'readOutliers=TRUE' but there are no outliers.

  masked: An 'integer' vector of indices specifying which of the
          queried cells are flagged as masked. Note that there is
          a difference between 'masked=NULL' and 'masked=integer(0)';
          the latter case happens when 'readMasked=TRUE' but there are
          no masked features.

     The elements of the cell values are ordered according to argument
     'indices'.  The length of each cell-value element equals the
     number of cells read.

     Which of the above elements are returned is controlled by the
     'readNnn' arguments.  If an argument is 'FALSE', the corresponding
     element above is 'NULL', e.g. if 'readStdvs=FALSE' then 'stdvs' is
     'NULL'.
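
     The difference between a 'NULL' element and a zero-length one can
     be checked explicitly. The sketch below is only an illustration:
     it does not read a CEL file, the two lists are hypothetical
     stand-ins for results of 'readCel()', and 'describeOutliers' is a
     helper name made up for this example.

```r
# Hypothetical stand-ins for readCel() results; no CEL file is read here.
# As if called with readOutliers=FALSE: the element is NULL.
celNotRead <- list(intensities = c(120, 98.5, 133), outliers = NULL)
# As if called with readOutliers=TRUE and no cell was flagged.
celNoneFound <- list(intensities = c(120, 98.5, 133), outliers = integer(0))

# Illustrative helper (not part of affxparser): report what the
# 'outliers' element of a result list says.
describeOutliers <- function(cel) {
  if (is.null(cel$outliers)) {
    "outliers were not read"
  } else if (length(cel$outliers) == 0) {
    "outliers were read, none flagged"
  } else {
    sprintf("%d outlier(s) flagged", length(cel$outliers))
  }
}

print(describeOutliers(celNotRead))
print(describeOutliers(celNoneFound))
```

     The same pattern applies to the 'masked' element.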

_O_u_t_l_i_e_r_s _a_n_d _m_a_s_k_e_d _c_e_l_l_s:

     The Affymetrix image analysis software flags cells as outliers and
     masked. This method does not return these flags, but instead
     vectors of cell indices listing which cells _of the queried
     cells_ are outliers and masked, respectively. The current
     community view seems to be that outlier detection should be
     based on statistical modelling of the actual probe intensities and
     on the choice of preprocessing algorithm.  Most algorithms
     use only the intensities from the CEL file.

_M_e_m_o_r_y _u_s_a_g_e:

     The Fusion SDK allocates memory for the entire CEL file when the
     file is accessed (but does not actually read the file into
     memory). Using the 'indices' argument will therefore only affect
     the memory use of the final object (as well as speed), not the
     memory allocated in the C function used to parse the file. This
     should be a minor problem, however.

_T_r_o_u_b_l_e_s_h_o_o_t_i_n_g:

     It is considered a bug if the file contains information not
     accessible by this function; please report it.

_A_u_t_h_o_r(_s):

     James Bullard, bullard@stat.berkeley.edu and Kasper Daniel Hansen,
     khansen@stat.berkeley.edu

_S_e_e _A_l_s_o:

     'readCelHeader()' for a description of the header output. Often a
     user only wants to read the intensities; see
     'readCelIntensities()' for a function specialized for that use.

_E_x_a_m_p_l_e_s:

       for (zzz in 0) {  # Only so that 'break' can be used

       # Scan current directory for CEL files
       celFiles <- list.files(pattern="[.](c|C)(e|E)(l|L)$")
       if (length(celFiles) == 0)
         break

       celFile <- celFiles[1]

       # Read a subset of cells; results follow the order of 'idxs'
       idxs <- c(1:5, 1250:1500, 450:440)
       cel <- readCel(celFile, indices=idxs, readOutliers=TRUE)
       str(cel)

       # Clean up
       rm(celFiles, celFile, idxs, cel)

       } # for (zzz in 0)
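
     A further sketch, using only the arguments listed under Usage,
     shows how the 'readNnn' arguments select which elements are
     returned. It assumes that the affxparser package is installed and
     that a CEL file exists in the current directory; if either is
     missing, nothing is read. The checks restate what the Value
     section describes and are not verified output.

```r
# Same indices as in the example above; the last range is descending.
idxs <- c(1:5, 1250:1500, 450:440)

# Scan current directory for CEL files
celFiles <- list.files(pattern = "[.](c|C)(e|E)(l|L)$")

# Only read if the package and at least one CEL file are available
if (requireNamespace("affxparser", quietly = TRUE) && length(celFiles) > 0) {
  cel <- affxparser::readCel(celFiles[1], indices = idxs,
                             readHeader = FALSE, readXY = TRUE,
                             readStdvs = TRUE, readPixels = TRUE,
                             readOutliers = FALSE, readMasked = FALSE)

  # Elements that were switched off are expected to be NULL
  stopifnot(is.null(cel$header), is.null(cel$outliers), is.null(cel$masked))

  # Cell-value vectors are expected to have one entry per requested index
  stopifnot(length(cel$intensities) == length(idxs))

  str(cel)
}
```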

