readXStringColumns         package:ShortRead         R Documentation

_R_e_a_d _o_n_e _o_r _m_o_r_e _c_o_l_u_m_n_s _i_n_t_o _X_S_t_r_i_n_g_S_e_t (_e._g., _D_N_A_S_t_r_i_n_g_S_e_t) _o_b_j_e_c_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     This function reads short read data components, such as DNA
     sequences, quality scores, and read names, into 'XStringSet'
     (e.g., 'DNAStringSet', 'BStringSet') objects. One or more files
     of identical layout can be specified.

_U_s_a_g_e:

     readXStringColumns(dirPath, pattern=character(0),
                        colClasses=list(NULL), sep = "\t",
                        header = FALSE, comment.char="#")

_A_r_g_u_m_e_n_t_s:

 dirPath: A character vector giving the directory path (relative or
          absolute) of files to be read.

 pattern: The ('grep'-style) pattern describing file names to be read.
          The default ('character(0)') reads all files in 'dirPath'.
          All files are expected to have identical numbers of columns.

colClasses: A list of length equal to the number of columns in a file.
          Columns with corresponding 'colClasses' equal to 'NULL' are
          ignored. Other entries in 'colClasses' are expected to be
          character strings describing the base class for the
          'XStringSet'. For instance, a column of DNA sequences would
          be specified as '"DNAString"' and parsed into a
          'DNAStringSet' object.

     sep: A length 1 character vector describing the column separator.

  header: A length 1 logical vector indicating whether files include a
          header line identifying columns. If present, the header of
          the first file is used to name the returned values.

comment.char: A length 1 character vector containing a single
          character that, when it appears at the start of a line,
          indicates that the entire line should be ignored. Comment
          characters are currently recognized only in the first
          position of a line.
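
     The 'colClasses' and 'pattern' arguments described above can be
     sketched in base R as follows. This is a minimal illustration
     only: the helper 'makeColClasses', the 4-column layout, and the
     file names are assumptions made for the example and are not part
     of ShortRead.

```r
## Hypothetical helper (not part of ShortRead): build a 'colClasses'
## list of length nCols that is NULL (column ignored) everywhere
## except at positions 'cols', which receive the class names.
makeColClasses <- function(nCols, cols, classes) {
    stopifnot(length(cols) == length(classes),
              all(cols >= 1), all(cols <= nCols))
    colClasses <- rep(list(NULL), nCols)
    colClasses[cols] <- classes
    colClasses
}

## Assumed layout: 4 columns, read name in column 1, sequence in 3
colClasses <- makeColClasses(4, c(1, 3), c("BString", "DNAString"))

## Which columns would be skipped?
skipped <- vapply(colClasses, is.null, logical(1))

## Illustrative file names; the regular expression keeps the first two
files <- c("out.aln.1.txt", "out.aln.2.txt", "notes.md")
selected <- grep("^out\\.aln\\..*\\.txt$", files, value = TRUE)
```

     A list built this way could then be passed as 'colClasses',
     together with a regular expression like the one above as
     'pattern'. How the function applies 'pattern' internally is not
     shown here.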

_V_a_l_u_e:

     A list with one element for each non-NULL entry of 'colClasses'.
     Each element is an 'XStringSet' object of the corresponding type.
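
     As a hedged sketch of working with the returned list, the code
     below reads the example files shipped under 'extdata/maq' in
     ShortRead, using the same column choices as the Examples section,
     and inspects each element. It assumes the ShortRead package is
     installed. Positional indexing is used because the files are read
     without a header, and the comment about element order is an
     assumption rather than documented behaviour.

```r
library(ShortRead)

dirPath <- system.file("extdata", "maq", package = "ShortRead")
colClasses <- rep(list(NULL), 16)
colClasses[c(1, 15, 16)] <- c("BString", "DNAString", "BString")
res <- readXStringColumns(dirPath, colClasses = colClasses)

## One element per non-NULL entry of colClasses
length(res)

## Class of each element
vapply(res, function(x) class(x)[1], character(1))

## Assumption: elements follow column order, so the second element
## would hold the sequences from column 15
reads <- res[[2]]
head(width(reads))
```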

_A_u_t_h_o_r(_s):

     Martin Morgan <mtmorgan@fhcrc.org>

_E_x_a_m_p_l_e_s:

     ## valid character strings for colClasses
     names(slot(getClass("XString"), "subclasses"))

     dirPath <- system.file('extdata', 'maq', package='ShortRead')

     colClasses <- rep(list(NULL), 16)
     colClasses[c(1, 15, 16)] <- c("BString", "DNAString", "BString")

     ## read one file
     readXStringColumns(dirPath, "out.aln.1.txt", colClasses=colClasses)

     ## read all files into a single object for each column
     res <- readXStringColumns(dirPath, colClasses=colClasses)

