readCdfUnits           package:affxparser           R Documentation

_R_e_a_d_s _u_n_i_t_s (_p_r_o_b_e_s_e_t_s) _f_r_o_m _a_n _A_f_f_y_m_e_t_r_i_x _C_D_F _f_i_l_e

_D_e_s_c_r_i_p_t_i_o_n:

     Reads all or a subset of the units (probesets) in an Affymetrix
     CDF file.

_U_s_a_g_e:

     readCdfUnits(filename, units=NULL, readXY=TRUE, readBases=TRUE,
                  readExpos=TRUE, readType=TRUE, readDirection=TRUE,
                  stratifyBy=c("nothing", "pmmm", "pm", "mm"),
                  readIndices=FALSE, verbose=0)

_A_r_g_u_m_e_n_t_s:

filename: The filename of the CDF file.

   units: An 'integer' 'vector' of unit indices specifying which units
          to read.  If 'NULL', all units are read.

  readXY: If 'TRUE', cell row and column (x,y) coordinates are
          retrieved, otherwise not.

readBases: If 'TRUE', cell P and T bases are retrieved, otherwise not.

readExpos: If 'TRUE', cell "expos" values are retrieved, otherwise not.

readType: If 'TRUE', unit types are retrieved, otherwise not.

readDirection: If 'TRUE', unit _and_ group directions are retrieved,
          otherwise not.

stratifyBy: A 'character' string specifying which elements in group
          fields are returned, and how. If '"nothing"', elements are
          returned as is, i.e. as 'vector's. If '"pm"'/'"mm"', only
          elements corresponding to perfect-match (PM) / mismatch (MM)
          probes are returned (as 'vector's). If '"pmmm"', elements are
          returned as a matrix where the first row holds elements
          corresponding to PM probes and the second row holds elements
          corresponding to MM probes.  Note that in this case, it is
          assumed that there is an equal number of PMs and MMs; if not,
          an error is generated. Moreover, the PMs and MMs may not even
          be paired, i.e. there is no guarantee that the two elements
          in a column correspond to a PM-MM pair.

readIndices: If 'TRUE', cell indices calculated from the row and column
          (X,Y) coordinates are retrieved, otherwise not.

 verbose: An 'integer' specifying the verbose level. If 0, the file is
          parsed quietly.  The higher the number, the more details are
          reported.
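
     The reshaping described for 'stratifyBy' can be sketched in base
     R. This is only an illustration under stated assumptions, not the
     package's implementation: the helper 'stratifyField' and the
     logical vector 'isPm' (flagging which cells are PM probes) are
     hypothetical names introduced here, and the row names "pm"/"mm"
     are added for readability only.

```r
# Hypothetical helper: mimics how one group field might be returned for
# each 'stratifyBy' setting, given a logical vector flagging PM cells.
stratifyField <- function(values, isPm,
                          stratifyBy = c("nothing", "pmmm", "pm", "mm")) {
  stratifyBy <- match.arg(stratifyBy)
  switch(stratifyBy,
    nothing = values,
    pm = values[isPm],
    mm = values[!isPm],
    pmmm = {
      pm <- values[isPm]
      mm <- values[!isPm]
      if (length(pm) != length(mm)) {
        stop("Unequal number of PMs and MMs: ",
             length(pm), " != ", length(mm))
      }
      # Row 1 holds PM elements, row 2 holds MM elements. Columns follow
      # cell order, so a column is not necessarily a true PM-MM pair.
      rbind(pm = pm, mm = mm)
    }
  )
}

# Made-up x coordinates for six cells, alternating PM and MM
x <- c(10L, 10L, 11L, 11L, 12L, 12L)
isPm <- rep(c(TRUE, FALSE), times = 3)

xPm <- stratifyField(x, isPm, "pm")      # vector of PM elements only
xPmMm <- stratifyField(x, isPm, "pmmm")  # 2-by-3 matrix, PM row then MM row
```

     In this sketch, a field with an unequal number of PM and MM cells
     stops with an error under "pmmm", mirroring the behaviour
     described above.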

_V_a_l_u_e:

     A named 'list' where the names correspond to the names of the
     units read.  Each element of the list is in turn a 'list'
     structure with three components:

  groups: A 'list' with one component for each group (also called
          block). The information on each group is a 'list' of up to
          seven components: 'x', 'y', 'pbase', 'tbase', 'expos',
          'indices', and 'direction'. All fields but the last have
          the same number of values as there are cells in the group.
          The last field, 'direction', has only one value indicating
          the direction for the whole group.

    type: An 'integer' specifying the type of the unit, where 1 is
          "expression", 2 is "genotyping", 3 is "CustomSeq", and 4 is
          "tag".

direction: An 'integer' specifying the direction of the unit, which
          defines if the probes are interrogating the sense or the
          anti-sense target, where 0 is "no direction", 1 is "sense",
          and 2 is "anti-sense".
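
     The integer codes above can be turned into labels with a small
     lookup. The following is a minimal sketch, assuming only the codes
     listed in this section; the helper 'describeUnit' and the mock
     'unit' object are hypothetical and are built by hand, not read
     from a CDF file.

```r
# Lookup tables built from the codes listed in this section
unitTypes <- c("expression", "genotyping", "CustomSeq", "tag")  # codes 1..4
unitDirections <- c("no direction", "sense", "anti-sense")      # codes 0..2

# Hypothetical helper: translate one unit's integer codes into labels.
# Codes outside the ranges above yield NA.
describeUnit <- function(unit) {
  list(
    type = unitTypes[unit$type],
    # Direction codes start at 0, so shift by one for 1-based indexing
    direction = unitDirections[unit$direction + 1L],
    nbrOfGroups = length(unit$groups)
  )
}

# A made-up unit with the same layout as one element of the returned list
unit <- list(
  groups = list(
    groupA = list(pbase = c("a", "t"), tbase = c("t", "t"), direction = 1L)
  ),
  type = 1L,
  direction = 1L
)

info <- describeUnit(unit)
```

     Applied to a real result, the same helper could be mapped over the
     returned list with 'lapply'().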

_A_u_t_h_o_r(_s):

     James Bullard, bullard@stat.berkeley.edu and Kasper Daniel Hansen,
     khansen@stat.berkeley.edu. Modified by Henrik Bengtsson (<URL:
     http://www.braju.com/R/>) to read any subset of units and/or
     subset of parameters, to stratify by PM/MM, and to return cell
     indices.

_R_e_f_e_r_e_n_c_e_s:

     [1] Affymetrix Inc, Affymetrix GCOS 1.x compatible file formats,
     June 14, 2005. <URL: http://www.affymetrix.com/support/developer/>

_S_e_e _A_l_s_o:

     'readCdfCellIndices'().

_E_x_a_m_p_l_e_s:

     ##############################################################
     if (require("AffymetrixDataTestFiles")) {            # START #
     ##############################################################

     # Find any CDF file
     cdfFile <- findCdf()

     # Read all units in a CDF file [~20s => 0.34ms/unit]
     cdf0 <- readCdfUnits(cdfFile, readXY=FALSE, readExpos=FALSE)

     # Read a subset of units in a CDF file [~6ms => 0.06ms/unit]
     units1 <- c(5, 100:109, 34)
     cdf1 <- readCdfUnits(cdfFile, units=units1, readXY=FALSE, readExpos=FALSE)
     stopifnot(identical(cdf1, cdf0[units1]))
     rm(cdf0)

     # Create a unit name to index map
     unitNames <- readCdfUnitNames(cdfFile)  # avoids masking base::names()
     units2 <- match(names(cdf1), unitNames)
     stopifnot(all.equal(units1, units2))
     cdf2 <- readCdfUnits(cdfFile, units=units2, readXY=FALSE, readExpos=FALSE)

     stopifnot(identical(cdf1, cdf2))

     ##############################################################
     }                                                     # STOP #
     ##############################################################

