reverseComplement         package:Biostrings         R Documentation

_S_e_q_u_e_n_c_e _r_e_v_e_r_s_i_n_g _a_n_d _c_o_m_p_l_e_m_e_n_t_i_n_g

_D_e_s_c_r_i_p_t_i_o_n:

     These functions can reverse a BString, DNAString or RNAString
     object and complement each base of a DNAString object.

_U_s_a_g_e:

       reverse(x, ...)
       complement(x, ...)
       reverseComplement(x, ...)

_A_r_g_u_m_e_n_t_s:

       x: A 'BString' (or derived) object or a 'BStringViews' object
          for 'reverse'. A 'DNAString' object or a 'BStringViews'
          object with a 'DNAString' subject for 'complement' and
          'reverseComplement'. 

     ...: Additional arguments to be passed to or from methods. 

_D_e_t_a_i_l_s:

     Given an object 'x' of class BString, DNAString or RNAString,
     'reverse(x)' returns an object of the same class with the letters
     of 'x' in reverse order. If 'x' is a DNAString
     object, 'complement(x)' returns an object where each base in 'x'
     is "complemented" i.e. A, C, G, T are replaced by T, G, C, A
     respectively. Letters belonging to the "IUPAC extended genetic
     alphabet" are also replaced by their complement (M <-> K, R <-> Y,
     S <-> S, V <-> B, W <-> W, H <-> D, N <-> N) and the gap symbol
     (-) is unchanged. 'reverseComplement(x)' is equivalent to
     'reverse(complement(x))' but is faster and more memory efficient.

_V_a_l_u_e:

     An object of the same class and length as the original object.

_S_e_e _A_l_s_o:

     'findPalindromes'

_E_x_a_m_p_l_e_s:

       reverseComplement(DNAString("ACGT-YN-"))
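
       ## For illustration only: a base-R sketch of the letter mapping described
       ## in Details, applied to a plain character string. revcompSketch is a
       ## hypothetical helper, not part of Biostrings, and this is not how the
       ## package implements reverseComplement().
       revcompSketch <- function(s) {
         ## complement every letter (IUPAC codes included; '-' is left unchanged)
         comp <- chartr("ACGTMRWSYKVHDBN", "TGCAKYWSRMBDHVN", s)
         ## then reverse the order of the letters
         paste(rev(strsplit(comp, "")[[1]]), collapse = "")
       }
       revcompSketch("ACGT-YN-") # "-NR-ACGT"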

       ## Applying reverseComplement() to the pattern before calling matchPattern()
       ## is the standard way to search for hits on the reverse strand of a chromosome:
       library(BSgenome.Dmelanogaster.FlyBase.r51)
       chrX <- Dmelanogaster[["X"]]
       pattern <- DNAString("GAACGGTGTCT")
       matchPattern(pattern, chrX) # 1 hit on strand +
       m0 <- matchPattern(reverseComplement(pattern), chrX) # 2 hits on strand -

       ## Applying reverseComplement() to the subject instead of the pattern is not
       ## a good idea for 2 reasons:
       ## (1) Chromosome sequences are generally huge so it's going to be a lot of
       ##     work and require a lot of memory to compute reverseComplement(subject).
       ## (2) Chromosome locations are generally given relative to the positive
       ##     strand, even for features located in the negative strand, so after
       ##     doing this:
       m1 <- matchPattern(pattern, reverseComplement(chrX))
       ##     the start/end of the matches are now relative to the negative strand.
       ##     You need to apply reverseComplement() again on the result if you want
       ##     them to be relative to the positive strand:
       m2 <- reverseComplement(m1)
       ##     and finally apply rev() to sort the matches from left to right
       ##     (5' to 3' direction) like in m0:
       m3 <- rev(m2) # same as m0, finally!
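
       ## The coordinate conversion behind the m1 -> m3 steps can be sketched in
       ## base R. This assumes 1-based, inclusive start/end positions and a
       ## subject of length L; toPlusStrand is a hypothetical helper, not a
       ## Biostrings function.
       toPlusStrand <- function(start, end, L) {
         ## position i on the reverse-complemented subject is L - i + 1 on the original
         plus <- data.frame(start = L - end + 1, end = L - start + 1)
         ## sort from left to right, as rev() does for m2 above
         plus[order(plus$start), , drop = FALSE]
       }
       toPlusStrand(start = c(3, 40), end = c(13, 50), L = 100)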

       ## Don't try the above example on human chromosome 1 since your computer
       ## would need to allocate about 250 MB of memory for this:
       if (FALSE) {
         library(BSgenome.Hsapiens.UCSC.hg18)
         chr1 <- Hsapiens$chr1
         matchPattern(pattern, reverseComplement(chr1)) # DON'T DO THIS!
         matchPattern(reverseComplement(pattern), chr1) # DO THIS INSTEAD
       }

