pairwiseAlignment         package:Biostrings         R Documentation

_O_p_t_i_m_a_l _P_a_i_r_w_i_s_e _A_l_i_g_n_m_e_n_t

_D_e_s_c_r_i_p_t_i_o_n:

     Solves the (Needleman-Wunsch) global alignment, (Smith-Waterman)
     local alignment, and overlap alignment problems.

_U_s_a_g_e:

     pairwiseAlignment(pattern, subject,
                       patternQuality = 22L, subjectQuality = 22L,
                       type = "global", substitutionMatrix = NULL,
                       gapOpening = -10, gapExtension = -4,
                       scoreOnly = FALSE)

_A_r_g_u_m_e_n_t_s:

 pattern: a character vector of length 1, an 'XString', or an
          'XStringSet' object.

 subject: a character vector of length 1 or an 'XString' object.

patternQuality, subjectQuality: respective quality scores for 'pattern'
          and 'subject', used by the quality-based method for
          generating a substitution matrix. The scores must be
          supplied as integer vectors with values in [0, 99], as
          character vectors, as 'BString' objects, or, for
          'patternQuality' only, as 'BStringSet' objects. Characters
          are converted to [0, 99] quality scores by subtracting 33
          from their ASCII decimal code (e.g. ! = 0, " = 1, # = 2,
          ...). Both arguments are ignored if
          '!is.null(substitutionMatrix)'.

    type: type of alignment; one of '"global"', '"local"', or
          '"overlap"'.

substitutionMatrix: constant substitution matrix for the alignment. Do
          not use 'substitutionMatrix' in conjunction with
          'patternQuality' and 'subjectQuality' arguments.

gapOpening: penalty for opening a gap in the alignment.

gapExtension: penalty for extending a gap in the alignment.

scoreOnly: logical indicating whether to return only the scores of the
          optimal pairwise alignments. (See Value section below.)
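
     The character encoding of quality scores described above for
     'patternQuality' and 'subjectQuality' can be checked in base R.
     The sketch below only illustrates the "subtract 33 from the ASCII
     code" rule; the helper name 'decodeQuality' is hypothetical and is
     not a Biostrings function.

```r
# Decode a quality string into integer scores by subtracting 33 from
# each character's ASCII code, as described for the quality arguments.
# 'decodeQuality' is a hypothetical helper, not a Biostrings function.
decodeQuality <- function(x) {
  stopifnot(is.character(x), length(x) == 1L)
  utf8ToInt(x) - 33L
}

decodeQuality("!\"#")  # 0 1 2, matching the examples in the text
decodeQuality("7")     # 22, the default value of both quality arguments
```

     Under this rule the character "7" corresponds to the default
     quality score of 22.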

_D_e_t_a_i_l_s:

     The general implementation is based on Chapter 2 of Haubold and
     Wiehe (2006). The quality-based method for generating a
     substitution matrix is based on the Bioinformatics article by
     Ketil Malde (2008) listed in the References section.
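
     As a rough illustration of the dynamic-programming recurrence
     behind global alignment, the following base-R sketch computes
     only the optimal score. It assumes a single linear gap penalty
     instead of separate 'gapOpening' and 'gapExtension' penalties,
     and the helper name 'nwScore' is hypothetical. It is not the
     implementation used by 'pairwiseAlignment'.

```r
# Global alignment score by dynamic programming (Needleman-Wunsch).
# Simplifying assumption: every gap position costs the same 'gap'
# penalty; separate opening/extension penalties are not modelled.
# 'nwScore' is a hypothetical helper, not part of Biostrings.
nwScore <- function(pattern, subject, mat, gap = -2) {
  p <- strsplit(pattern, "")[[1]]
  s <- strsplit(subject, "")[[1]]
  n <- length(p)
  m <- length(s)
  # score[i + 1, j + 1] holds the best score for aligning the first i
  # letters of the pattern with the first j letters of the subject.
  score <- matrix(0, nrow = n + 1, ncol = m + 1)
  score[, 1] <- gap * (0:n)
  score[1, ] <- gap * (0:m)
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      score[i + 1, j + 1] <- max(
        score[i, j] + mat[p[i], s[j]],  # align p[i] with s[j]
        score[i, j + 1] + gap,          # p[i] aligned with a gap
        score[i + 1, j] + gap           # s[j] aligned with a gap
      )
    }
  }
  score[n + 1, m + 1]
}

# Constant matrix: match = 1, mismatch = -3
nucs <- c("A", "C", "G", "T")
mat <- matrix(-3, nrow = 4, ncol = 4, dimnames = list(nucs, nucs))
diag(mat) <- 1

nwScore("ACGT", "ACGT", mat)  # 4: four matches
nwScore("ACGT", "AGT", mat)   # 1: three matches and one gap
```

     Recovering the alignment itself would additionally require a
     traceback through the score matrix, which this sketch omits.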

_V_a_l_u_e:

     If 'scoreOnly == FALSE', an instance of class 'XStringAlign' is
     returned. If 'scoreOnly == TRUE', a numeric vector containing the
     scores for the optimal pairwise alignments is returned.

_A_u_t_h_o_r(_s):

     Patrick Aboyoun and Herve Pages.

_R_e_f_e_r_e_n_c_e_s:

     B. Haubold, T. Wiehe, Introduction to Computational Biology,
     Birkhauser Verlag 2006, Chapter 2.

     K. Malde, The effect of sequence quality on sequence alignment,
     Bioinformatics, Feb 23, 2008.

_S_e_e _A_l_s_o:

     XStringAlign-class, substitution.matrices

_E_x_a_m_p_l_e_s:

       ## Nucleotide global, local, and overlap alignments
       s1 <- 
         DNAString("ACTTCACCAGCTCCCTGGCGGTAAGTTGATCAAAGGAAACGCAAAGTTTTCAAG")
       s2 <-
         DNAString("GTTTCACTACTTCCTTTCGGGTAAGTAAATATATAAATATATAAAAATATAATTTTCATC")

       # First use a constant substitution matrix
       mat <- matrix(-3, nrow = 4, ncol = 4)
       diag(mat) <- 1
       rownames(mat) <- colnames(mat) <- DNA_ALPHABET[1:4]
       globalAlign <-
         pairwiseAlignment(s1, s2, substitutionMatrix = mat,
                           gapOpening = -5, gapExtension = -2)
       localAlign <-
         pairwiseAlignment(s1, s2, type = "local",
                           substitutionMatrix = mat,
                           gapOpening = -5, gapExtension = -2)
       overlapAlign <-
         pairwiseAlignment(s1, s2, type = "overlap",
                           substitutionMatrix = mat,
                           gapOpening = -5, gapExtension = -2)

       # Then use quality-based method for generating a substitution matrix
       pairwiseAlignment(s1, s2,
                         patternQuality = rep(c(22L, 12L), times = c(36, 18)),
                         subjectQuality = rep(c(22L, 12L), times = c(40, 20)),
                         scoreOnly = TRUE)

       ## Amino acid global alignment
       pairwiseAlignment(AAString("PAWHEAE"), AAString("HEAGAWGHEE"),
                         substitutionMatrix = "BLOSUM50",
                         gapOpening = 0, gapExtension = -8)

