| align-utils {Biostrings} | R Documentation |
A variety of functions for working with sequence alignments.
mismatchTable(x, shiftLeft=0L, shiftRight=0L, ...)
mismatchSummary(x, ...)

## S4 method for signature 'AlignedXStringSet':
coverage(x, start=NA, end=NA, weight=1L)

## S4 method for signature 'PairwiseAlignedFixedSubject':
coverage(x, start=NA, end=NA, weight=1L)

compareStrings(pattern, subject)

## S4 method for signature 'character':
consensusMatrix(x, freq=FALSE)

## S4 method for signature 'XStringSet':
consensusMatrix(x, baseOnly=FALSE, freq=FALSE)

consensusString(x)
x |
A character vector or matrix, XStringSet, XStringViews,
PairwiseAlignedFixedSubject, or list of FASTA records containing
equal-length strings.
|
shiftLeft, shiftRight |
A non-positive integer (shiftLeft) and a non-negative integer (shiftRight) that specify how many characters before and after the mismatch position to include in the mismatch substrings. |
... |
Further arguments to be passed to or from other methods. |
start, end |
See ?coverage.
|
weight |
An integer vector specifying how much each element in x counts.
|
pattern, subject |
The strings to compare. Can be of type character, XString,
XStringSet, AlignedXStringSet, or, in the case of
pattern, PairwiseAlignedFixedSubject. If pattern is a
PairwiseAlignedFixedSubject object, then subject must be missing.
|
baseOnly |
TRUE or FALSE.
If TRUE, the returned matrix only contains frequencies for the
letters in the "base" alphabet, i.e. "A", "C", "G", "T" if x
is a "DNA input", and "A", "C", "G", "U" if x is an "RNA input".
When x is a BString object (or an XStringViews
object with a BString subject, or a BStringSet object),
then the baseOnly argument is ignored.
|
freq |
If TRUE, letter frequencies (per position) are reported; otherwise, counts are reported.
|
mismatchTable: a data.frame containing the positions and substrings
of the mismatches for the AlignedXStringSet or PairwiseAlignedFixedSubject
object.
mismatchSummary: a list of data.frame objects containing counts and
frequencies of the mismatches for the AlignedXStringSet or
PairwiseAlignedFixedSubject object.
compareStrings combines two equal-length strings that are assumed to be aligned
into a single character string that replaces mismatches with "?",
insertions with "+", and deletions with "-".
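The marking rule can be sketched in base R. This is only an illustration, not the Biostrings implementation: compare_aligned is a hypothetical helper, and the sketch assumes that a gap in the subject counts as an insertion and a gap in the pattern counts as a deletion.

```r
# Hypothetical helper, not part of Biostrings: a base-R sketch of the marking rule.
compare_aligned <- function(pattern, subject) {
  p <- strsplit(pattern, "")[[1]]
  s <- strsplit(subject, "")[[1]]
  stopifnot(length(p) == length(s))        # the strings must have equal length
  out <- p                                 # start from the pattern; matches are kept
  out[p != "-" & s == "-"] <- "+"          # assumption: a gap in the subject is an insertion
  out[p != "-" & s != "-" & p != s] <- "?" # two differing letters are a mismatch
  paste(out, collapse = "")                # a gap in the pattern stays "-" (assumed deletion)
}

compare_aligned("AC-TG", "AGTT-")  # "A?-T+"
```

Matching positions keep their letter, so the result can be read position by position against the two inputs.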
consensusMatrix computes a consensus matrix for a set of equal-length strings that
are assumed to be aligned.
consensusString creates a consensus string by a 50% + 1 vote at each position of
the consensus matrix, labeling positions without such a majority as "?".
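To make the counting and voting concrete, here is a minimal base-R sketch. The names consensus_counts and consensus_vote are hypothetical and not part of Biostrings, and the sketch reads "50% + 1" as a strict majority (more than half of the strings), which is an assumption.

```r
# Hypothetical helpers, not part of Biostrings.
consensus_counts <- function(strings, freq = FALSE) {
  stopifnot(length(unique(nchar(strings))) == 1L)   # equal-length strings only
  chars <- do.call(rbind, strsplit(strings, ""))    # one row per string, one column per position
  alphabet <- sort(unique(as.vector(chars)))
  counts <- matrix(0L, nrow = length(alphabet), ncol = ncol(chars),
                   dimnames = list(alphabet, NULL))
  for (j in seq_len(ncol(chars))) {
    for (a in alphabet) counts[a, j] <- sum(chars[, j] == a)
  }
  if (freq) counts / nrow(chars) else counts        # per-position frequencies or raw counts
}

consensus_vote <- function(counts) {
  n <- sum(counts[, 1])                             # number of strings (counts input assumed)
  winner <- apply(counts, 2, function(col) {
    top <- which.max(col)
    # assumption: "50% + 1" means strictly more than half of the strings
    if (col[top] > n / 2) rownames(counts)[top] else "?"
  })
  paste(winner, collapse = "")
}

aligned <- c("ACGT", "ACGA", "ATGC")
consensus_counts(aligned)
consensus_vote(consensus_counts(aligned))  # "ACG?"
```

In the last position each letter occurs once, so no letter reaches a majority and the sketch reports "?".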
pairwiseAlignment,
XString-class, XStringSet-class, XStringViews-class,
AlignedXStringSet-class, PairwiseAlignedFixedSubject-class,
match-utils
## Compare two globally aligned strings
string1 <- "ACTTCACCAGCTCCCTGGCGGTAAGTTGATC---AAAGG---AAACGCAAAGTTTTCAAG"
string2 <- "GTTTCACTACTTCCTTTCGGGTAAGTAAATATATAAATATATAAAAATATAATTTTCATC"
compareStrings(string1, string2)
## Create a consensus matrix
nw1 <-
  pairwiseAlignment(AAStringSet(c("HLDNLKGTF", "HVDDMPNAL")), AAString("SMDDTEKMSMKL"),
                    substitutionMatrix = "BLOSUM50", gapOpening = -3, gapExtension = -1)
consensusMatrix(nw1)
## Examine the consensus between the bacteriophage phi X174 genomes
data(phiX174Phage)
phageConsmat <- consensusMatrix(phiX174Phage, baseOnly = TRUE)
phageDiffs <- which(apply(phageConsmat, 2, max) < length(phiX174Phage))
phageDiffs
phageConsmat[,phageDiffs]
## Read in ORF data
file <- system.file("extdata", "someORF.fa", package="Biostrings")
orf <- read.DNAStringSet(file, "fasta")
## To illustrate, the following example assumes the ORF data
## to be aligned for the first 10 positions (patently false):
orf10 <- DNAStringSet(orf, end=10)
consensusMatrix(orf10, baseOnly=TRUE, freq=TRUE)
consensusString(sort(orf10)[1:5])
## For the character matrix containing the "exploded" representation
## of the strings, do:
as.matrix(orf10, use.names=FALSE)