readCdfUnits           package:affxparser           R Documentation

_R_e_a_d_s _u_n_i_t_s (_p_r_o_b_e_s_e_t_s) _f_r_o_m _a_n _A_f_f_y_m_e_t_r_i_x _C_D_F _f_i_l_e

_D_e_s_c_r_i_p_t_i_o_n:

     Gets all, or a subset of, the units (probesets) in an Affymetrix
     CDF file.

_U_s_a_g_e:

     readCdfUnits(filename, units=NULL, readXY=TRUE, readBases=TRUE,
                  readExpos=TRUE, readType=TRUE, readDirection=TRUE,
                  stratifyBy=c("nothing", "pmmm", "pm", "mm"),
                  readIndices=FALSE, verbose=0)

_A_r_g_u_m_e_n_t_s:

filename: The filename of the CDF file.

   units: An 'integer' 'vector' of unit indices specifying which units
          to read.  If 'NULL', all units are read.

  readXY: If 'TRUE', cell (x,y) coordinates, i.e. column and row
          positions, are retrieved, otherwise not.

readBases: If 'TRUE', cell probe and target bases ('pbase' and 'tbase')
          are retrieved, otherwise not.

readExpos: If 'TRUE', cell "expos" values are retrieved, otherwise not.

readType: If 'TRUE', unit types are retrieved, otherwise not.

readDirection: If 'TRUE', unit directions are retrieved, otherwise not.

stratifyBy: A 'character' string specifying which elements in the group
          fields are returned, and how. If '"nothing"', elements are
          returned as is, i.e. as 'vector's. If '"pm"'/'"mm"', only
          elements corresponding to perfect-match (PM) / mismatch (MM)
          probes are returned (as 'vector's). If '"pmmm"', elements are
          returned as a matrix where the first row holds elements
          corresponding to PM probes and the second row those
          corresponding to MM probes.  Note that in this case it is
          assumed that there is an equal number of PMs and MMs; if
          not, an error is generated.  Moreover, the PMs and MMs may
          not even be paired, i.e. there is no guarantee that the two
          elements in a column correspond to a PM-MM pair.

readIndices: If 'TRUE', cell indices calculated from the (x,y)
          coordinates are retrieved, otherwise not.

 verbose: An 'integer' specifying the verbose level. If 0, the file is
          parsed quietly.  The higher the number, the more details are
          reported.
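
     The '"pmmm"' layout described for 'stratifyBy' can be pictured
     with a small base R sketch.  It does not call affxparser: the
     helper 'stratifyPmMm()', the vector 'x', and the mask 'isPm' are
     hypothetical names with made-up values, and the row names of the
     matrix are an assumption made for readability.

```r
# Hypothetical helper (not part of affxparser): arrange the elements of one
# group field as the "pmmm" description reads, i.e. PMs in the first row
# and MMs in the second row.
stratifyPmMm <- function(values, isPm) {
  pm <- values[isPm]
  mm <- values[!isPm]
  # The documentation states that unequal numbers of PMs and MMs is an error
  if (length(pm) != length(mm)) {
    stop("Number of PM and MM probes differ: ", length(pm), " != ", length(mm))
  }
  # Row names are an assumption, added only to label the two rows
  rbind(pm = pm, mm = mm)
}

# Made-up x coordinates for a group with three PM and three MM cells
x <- c(10L, 10L, 11L, 11L, 12L, 12L)
isPm <- c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)

xPmMm <- stratifyPmMm(x, isPm)  # analogous to stratifyBy="pmmm"
xPm <- x[isPm]                  # analogous to stratifyBy="pm"
xMm <- x[!isPm]                 # analogous to stratifyBy="mm"
print(xPmMm)
```

     Because pairing is not guaranteed, a column of 'xPmMm' should not
     be read as a PM-MM pair without checking the coordinates.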

_V_a_l_u_e:

     A named 'list' where the names correspond to the names of the
     units read.  Each element of the list is in turn a 'list'
     structure with three components:

  groups: A 'list' with one component for each group (also called
          block). The information on each group is a 'list' with up
          to six components: 'x', 'y', 'pbase', 'tbase', 'expos', and
          'indices', depending on which of the corresponding 'read'
          arguments are 'TRUE'.

    type: An 'integer' specifying the type of the unit, where 1 is
          "expression", 2 is "genotyping", 3 is "CustomSeq", and 4 is
          "tag".

direction: An 'integer' specifying the direction of the unit, which
          defines if the probes are interrogating the sense or the
          anti-sense target, where 0 is "no direction", 1 is "sense",
          and 2 is "anti-sense".
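
     The integer codes for 'type' and 'direction' can be translated
     into labels with plain lookup vectors.  The following is a minimal
     base R sketch: the list 'cdf' is a hypothetical mock of a return
     value with made-up units, and the lookup vectors are transcribed
     from the code lists above.

```r
# Lookup tables transcribed from the code lists in the Value section
unitTypes <- c("expression", "genotyping", "CustomSeq", "tag")  # codes 1..4
unitDirections <- c("no direction", "sense", "anti-sense")      # codes 0..2

# Hypothetical mock of a return value with two units (all values made up)
cdf <- list(
  unitA = list(groups = list(), type = 1L, direction = 1L),
  unitB = list(groups = list(), type = 2L, direction = 0L)
)

typeCodes <- sapply(cdf, function(unit) unit$type)
directionCodes <- sapply(cdf, function(unit) unit$direction)

# Type codes start at 1 and can index the lookup vector directly
typeLabels <- unitTypes[typeCodes]
# Direction codes start at 0, so shift by one before indexing
directionLabels <- unitDirections[directionCodes + 1L]

print(data.frame(unit = names(cdf), type = typeLabels,
                 direction = directionLabels))
```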

_A_u_t_h_o_r(_s):

     James Bullard, bullard@stat.berkeley.edu and Kasper Daniel Hansen,
     khansen@stat.berkeley.edu. Modified by Henrik Bengtsson (<URL:
     http://www.braju.com/R/>) to read any subset of units and/or
     subset of parameters, to stratify by PM/MM, and to return cell
     indices.

_R_e_f_e_r_e_n_c_e_s:

     [1] Affymetrix Inc, Affymetrix GCOS 1.x compatible file formats,
     June 14, 2005. <URL: http://www.affymetrix.com/support/developer/>

_S_e_e _A_l_s_o:

     To read unit names only, see 'readCdfUnitNames'(). To find out
     which cells are perfect-match probes and which are not, see
     'readCdfIsPm'().

_E_x_a_m_p_l_e_s:

     for (zzz in 0) {

     # Find any CDF file
     cdfFile <- findCdf()
     if (is.null(cdfFile))
       break

     # Read all units in a CDF file [~20s => 0.34ms/unit]
     cdf0 <- readCdfUnits(cdfFile)

     # Read a subset of units in a CDF file [~6ms => 0.06ms/unit]
     units1 <- c(5, 100:109, 34)
     cdf1 <- readCdfUnits(cdfFile, units=units1)
     stopifnot(identical(cdf1, cdf0[units1]))
     rm(cdf0)

     # Create a unit name to index map
     names <- readCdfUnitNames(cdfFile)
     units2 <- match(names(cdf1), names)
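     # match() returns integers whereas units1 is of type double, so
     # compare values with all.equal() rather than identical()
     stopifnot(isTRUE(all.equal(units1, units2)))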
     cdf2 <- readCdfUnits(cdfFile, units=units2)

     stopifnot(identical(cdf1, cdf2))

     # Clean up
     rm(cdfFile, units1, cdf1, names, units2, cdf2)
     } # for (zzz in 0)
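
     The 'stratifyBy' and 'readIndices' arguments are not exercised
     above.  The following sketch shows one way they might be combined.
     It only calls functions and arguments documented on this page,
     assumes that the CDF file found has at least five units, and
     leaves the hypothetical variable 'cdfPm' as 'NULL' when the
     package or a CDF file is not available.

```r
# Sketch only: runs the affxparser calls when the package and a CDF file
# are available, and otherwise leaves 'cdfPm' as NULL.
cdfPm <- NULL
if (requireNamespace("affxparser", quietly = TRUE)) {
  cdfFile <- affxparser::findCdf()
  if (!is.null(cdfFile)) {
    # Read the first five units (assumes the file has at least five),
    # keeping PM elements only and adding cell indices
    cdfPm <- affxparser::readCdfUnits(cdfFile, units = 1:5,
                                      stratifyBy = "pm", readIndices = TRUE)
    # Inspect the cell indices of the first group of the first unit
    firstGroup <- cdfPm[[1]]$groups[[1]]
    str(firstGroup$indices)
  }
}
```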

