matchprobes           package:matchprobes           R Documentation

_A _f_u_n_c_t_i_o_n _t_o _m_a_t_c_h _a _q_u_e_r_y _s_e_q_u_e_n_c_e _t_o _t_h_e _s_e_q_u_e_n_c_e_s _o_f _a _s_e_t _o_f
_p_r_o_b_e_s.

_D_e_s_c_r_i_p_t_i_o_n:

     Each element of 'query', a character vector (typically
     representing transcripts of interest), is scanned for exact
     matches to the sequences in the character vector 'records'. For
     each query element, the indices of the matching records are
     returned.

_U_s_a_g_e:

     matchprobes(query, records, probepos=FALSE)

_A_r_g_u_m_e_n_t_s:

   query: A character vector. For example, each element may represent a
          gene (transcript) of interest. See Details.

 records: A character vector. For example, each element may represent
          the probes on a DNA array.

probepos: A logical value. If TRUE, also return the start positions of
          the matches in the query sequence.

_D_e_t_a_i_l_s:

     'toupper' is applied to the arguments 'query' and 'records' before
     matching, which makes the matching case-insensitive. The matching
     is done using the C library function 'strstr', which looks for
     exact substring occurrences. Other matching methods might be worth
     exploring.
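
     The matching described above can be approximated in base R. The
     following is only a sketch, not the package's implementation: the
     helper name 'match_sketch' and the element names 'match' and 'pos'
     are assumptions made for illustration, and 'regexpr' with
     'fixed=TRUE' stands in for 'strstr' (both report the first exact
     occurrence of a substring).

```r
# Hypothetical helper (not part of the package): a base-R approximation
# of the case-insensitive exact matching described in Details.
match_sketch <- function(query, records, probepos = FALSE) {
  # Upper-case both inputs so that matching ignores case
  query   <- toupper(query)
  records <- toupper(records)

  # For every query, find the first position of each record as a literal
  # substring; regexpr() returns -1 when there is no match
  pos <- lapply(query, function(q) {
    vapply(records,
           function(r) as.integer(regexpr(r, q, fixed = TRUE)),
           integer(1), USE.NAMES = FALSE)
  })

  # Indices of the records that were found in each query
  hit <- lapply(pos, function(p) which(p > 0))

  res <- list(match = hit)  # element name 'match' is an assumption
  if (probepos) {
    # Start positions of the matches, in the same shape as 'hit'
    res$pos <- Map(function(p, h) p[h], pos, hit)
  }
  res
}

res <- match_sketch(c("acgtacgt", "TTTT"), c("GTA", "tt", "CCC"),
                    probepos = TRUE)
```

     Here the first query contains only the first record (starting at
     position 3), and the second query contains only the second record
     (starting at position 1).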

_V_a_l_u_e:

     A list. Its first element is a list of the same length as
     'query'. Each element of that list is a numeric vector containing
     the indices (into 'records') of the probes that have a perfect
     match in the corresponding query sequence.

     If 'probepos' is TRUE, the returned list has a second element: it
     is of the same shape as described above, and gives the respective
     positions of the matches.
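
     A result of this shape can be flattened into one row per match.
     The sketch below uses a hand-built stand-in list with made-up
     values rather than real output, and addresses the two elements by
     position, since only their order is described here.

```r
# Hand-built stand-in for a result obtained with probepos = TRUE.
# The values are made up for illustration only.
res <- list(
  list(20L, c(3L, 7L), integer(0)),   # probe indices, one entry per query
  list(1L,  c(5L, 12L), integer(0))   # matching start positions
)

# One row per match: which query, which probe, and where it starts
hits <- data.frame(
  query = rep(seq_along(res[[1]]), lengths(res[[1]])),
  probe = unlist(res[[1]]),
  pos   = unlist(res[[2]])
)
```

     Queries without any match (the third one here) simply contribute
     no rows.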

_A_u_t_h_o_r(_s):

     R. Gentleman, Laurent Gautier, Wolfgang Huber

_E_x_a_m_p_l_e_s:

       ## This function is mainly intended for use with the probe
       ## tables from the "probe" data packages, e.g.:
       ## > library(hgu95av2probe)
       ## > data(probe)
       ## > seq <- probe$sequence
       ##
       ## Since we do not want to be dependent on the presence of this 
       ## data package, for the sake of example we simply simulate some
       ## probe sequences:

       bases <- c("A", "C", "G", "T")
       seq   <- sapply(1:1000, function(x) paste(sample(bases, 256, replace=TRUE), collapse=""))

       w1 <- seq[20:22]
       w2 <- complementSeq(w1, start=13, stop=13)
       w  <- c(w1, w2)

       matchprobes(w, seq)
       matchprobes(w, seq, probepos=TRUE)

