| srFilter {ShortRead} | R Documentation |
These functions create user-defined (srFilter) or built-in
instances of SRFilter objects. Filters can be
applied to objects from ShortRead, returning a logical vector
to be used to subset the objects to include only those components
satisfying the filter.
srFilter(fun, name = NA_character_, ...)
## S4 method for signature 'missing':
srFilter(fun, name=NA_character_)
## S4 method for signature 'function':
srFilter(fun, name=NA_character_)
compose(filt, ..., .name)
chromosomeFilter(regex=character(0), .name="ChromosomeFilter")
strandFilter(strandLevels=character(0), .name="StrandFilter")
nFilter(threshold=0L, .name="CleanNFilter")
polynFilter(threshold=0L, nuc=c("A", "C", "T", "G", "other"),
.name="PolyNFilter")
srdistanceFilter(subject=character(0), threshold=0L,
.name="SRDistanceFilter")
alignQualityFilter(threshold=0L, .name="AlignQualityFilter")
alignDataFilter(expr=expression(), .name="AlignDataFilter")
fun |
An object of class function to be used as a
filter. fun must accept a single named argument x, and
is expected to return a logical vector such that x[fun(x)]
selects only those elements of x satisfying the conditions of
fun.
|
name |
A character(1) object to be used as the name of the
filter. The name is useful for debugging and reference. |
filt |
An SRFilter object, to be used with
additional arguments to create a composite filter. |
.name |
An optional character(1) object used to override
the name applied to default filters. |
regex |
Either character(0) or a character(1)
regular expression used as grep(regex, chromosome(x)) to
filter based on chromosome. The default (character(0))
performs no filtering. |
strandLevels |
Either character(0) or character(1)
containing strand levels to be selected. ShortRead objects
have standard strand levels NA, "+", "-", "*", with NA
meaning strand information not available and "*" meaning
strand information not relevant. |
threshold |
A numeric(1) value representing a minimum
(srdistanceFilter, alignQualityFilter) or maximum
(nFilter, polynFilter) criterion for the filter. The
minima and maxima are closed-interval (i.e., x >= threshold,
x <= threshold for some property x of the object being
filtered). |
nuc |
A character vector containing IUPAC symbols for
nucleotides or the value "other" corresponding to all
non-nucleotide symbols, e.g., N. |
subject |
A character() of any length, to be used as the
corresponding argument to srdistance. |
expr |
An expression to be evaluated with
pData(alignData(x)). |
... |
Additional arguments for subsequent methods; these arguments are not currently used. |
srFilter allows users to construct their own filters. The
fun argument to srFilter must be a function accepting a
single argument x and returning a logical vector that can be
used to select elements of x satisfying the filter with
x[fun(x)].
The signature(fun="missing") method creates a default filter
that returns a vector of TRUE values with length equal to
length(x).
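The contract described above can be illustrated without ShortRead. The following is a minimal base R sketch in which a plain character vector stands in for a ShortRead object; `reads`, `longEnough`, and `passAll` are hypothetical names, and the plain functions are only stand-ins for what srFilter would wrap.

```r
# Stand-in data: a character vector instead of a ShortRead object (assumption)
reads <- c("ACGT", "AC", "ACGTACGT")

# A filter is a function of x returning a logical vector along x
longEnough <- function(x) nchar(x) >= 4

# Analogue of the default filter (fun missing): every element passes
passAll <- function(x) rep(TRUE, length(x))

reads[longEnough(reads)]   # only the elements satisfying the filter
reads[passAll(reads)]      # all elements
```

With ShortRead loaded, the same function body could presumably be passed as the fun argument of srFilter, as in the custom filter shown in the examples.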
compose constructs a new filter from one or more existing
filters. The result is a filter that returns a logical vector with
indices corresponding to components of x that pass all
filters. If not provided, the name of the filter consists of the names
of all component filters, each separated by " o ".
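The behaviour described for compose amounts to an element-wise logical AND over the results of the component filters. The sketch below shows that idea in base R; `composeSketch` and `composedName` are hypothetical helpers written for illustration, not the ShortRead implementation.

```r
# Hypothetical stand-in for compose(): AND together the results of all filters
composeSketch <- function(...) {
    filters <- list(...)
    function(x) {
        results <- lapply(filters, function(f) f(x))
        Reduce(`&`, results, rep(TRUE, length(x)))
    }
}

# Default naming convention described above: names joined by " o "
composedName <- function(filterNames) paste(filterNames, collapse = " o ")

reads <- c("ACGT", "ACNT", "AC")                   # stand-in data (assumption)
noN <- function(x) !grepl("N", x, fixed = TRUE)    # no 'N' symbols
long <- function(x) nchar(x) >= 4                  # at least 4 characters

both <- composeSketch(noN, long)
reads[both(reads)]                 # elements passing both filters
composedName(c("NoN", "Long"))     # name built from the component names
```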
The remaining functions documented on this page are built-in filters
that accept an argument x and return a logical vector of
length(x) indicating which components of x satisfy the
filter.
chromosomeFilter selects elements satisfying
grep(regex, chromosome(x)).
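Because regex is treated as a regular expression, an unanchored pattern can match more chromosome names than intended. The base R sketch below uses a hypothetical character vector `chrom` in place of chromosome(x); it also contrasts grep, which returns integer positions, with grepl, which returns a logical vector of length(chrom). How the built-in filter converts between the two is not stated on this page.

```r
# Stand-in for chromosome(x): chromosome labels of four reads (assumption)
chrom <- c("chr5.fa", "chr1.fa", "chr15.fa", "chr5.fa")

grep("chr1", chrom)             # integer positions of matching elements
grepl("chr1", chrom)            # logical vector; also matches "chr15.fa"

# Anchoring and escaping the '.' restricts the match to one chromosome
grepl("^chr1\\.fa$", chrom)
```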
strandFilter selects elements satisfying
match(strand(x), strandLevels, nomatch=0) > 0.
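The match expression can be tried on a plain vector to see how missing strand information is handled. In this base R sketch, `strandValues` is a hypothetical stand-in for strand(x), and `strandLevels` plays the role of the argument of the same name.

```r
# Stand-in for strand(x), including a read with no strand information
strandValues <- c("+", "-", NA, "*", "+")
strandLevels <- "+"

# nomatch = 0 turns non-matches (including NA) into 0, so the comparison
# always yields TRUE or FALSE rather than NA
keep <- match(strandValues, strandLevels, nomatch = 0) > 0
keep
```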
nFilter selects elements with no more than threshold
'N' symbols in each element of sread(x).
polynFilter selects elements with no more than threshold
copies of any nucleotide indicated by nuc.
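The counting behind these two filters can be sketched in base R. The example follows the closed-interval convention given for the threshold argument (count <= threshold); `countSymbol` is a hypothetical helper, the reads are made-up strings, and counting total occurrences per read is an assumption about how polynFilter interprets "copies".

```r
# Hypothetical helper: number of occurrences of one symbol in each string
countSymbol <- function(x, symbol) {
    lengths(regmatches(x, gregexpr(symbol, x, fixed = TRUE)))
}

reads <- c("ACGT", "ANNT", "AAAAAC", "NACG")   # stand-in for sread(x)

# nFilter-like rule: keep reads with at most 1 'N'
nCount <- countSymbol(reads, "N")
keepN <- nCount <= 1

# polynFilter-like rule for nuc = "A": keep reads with at most 3 'A'
aCount <- countSymbol(reads, "A")
keepA <- aCount <= 3
```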
srdistanceFilter selects elements at an edit distance of at least
threshold from all sequences in subject.
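The "all sequences in subject" rule can be sketched with base R's adist, which computes a generalized Levenshtein distance. Whether adist agrees exactly with srdistance is an assumption, and the reads and subjects below are made up. The comparison uses the closed-interval convention given for the threshold argument (distance >= threshold).

```r
reads <- c("ACGTACGT", "ACGTACGA", "TTTTTTTT")   # stand-in for sread(x)
subject <- c("ACGTACGT", "ACGAACGT")
threshold <- 2L

# Distance matrix: one row per read, one column per subject sequence
d <- utils::adist(reads, subject)

# Keep a read only if it is far enough from every subject sequence
keep <- unname(apply(d >= threshold, 1, all))
reads[keep]
```

A read is dropped as soon as it is too close to any one subject, which makes this kind of filter useful for removing adapter-like or otherwise known contaminant sequences.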
alignQualityFilter selects elements with alignQuality(x)
greater than or equal to threshold.
alignDataFilter selects elements with
pData(alignData(x)) satisfying expr. expr should
be formulated as though it were to be evaluated as
eval(expr, pData(alignData(x))).
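The evaluation model can be tried on an ordinary data frame. In this base R sketch, `alignDataSketch` is a hypothetical stand-in for pData(alignData(x)) with made-up x and y coordinates; the expression refers to columns by name, exactly as it would inside eval.

```r
# Stand-in for pData(alignData(x)): one row per read (assumption)
alignDataSketch <- data.frame(x = c(100, 500, 900),
                              y = c(900, 510, 100))

# Column names are resolved inside the data frame when the expression runs
expr <- expression(abs(x - 500) > 200 & abs(y - 500) > 200)
keep <- eval(expr, alignDataSketch)
keep
```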
srFilter returns an object of class SRFilter.
Built-in filters return a logical vector of length(x), with
TRUE indicating components that pass the filter.
Martin Morgan <mtmorgan@fhcrc.org>
sp <- SolexaPath(system.file("extdata", package="ShortRead"))
aln <- readAligned(sp, "s_2_export.txt") # Solexa export file, as example
# a 'chromosome 5' filter
filt <- chromosomeFilter("chr5.fa")
aln[filt(aln)]
# filter during input
readAligned(sp, "s_2_export.txt", filter=filt)
# x- and y- coordinates stored in alignData, when source is SolexaExport
xy <- alignDataFilter(expression(abs(x-500) > 200 & abs(y-500) > 200))
aln[xy(aln)]
# both filters
chr5xy <- compose(filt, xy)
aln[chr5xy(aln)]
# custom filter: minimum calibrated base call quality >20
goodq <- srFilter(function(x) {
    apply(as(quality(x), "matrix"), 1, min) > 20
}, name="GoodQualityBases")
goodq
aln[goodq(aln)]