| derepFastq {dada2} | R Documentation |
Read in and dereplicate a fastq file.
Description
A custom interface to FastqStreamer
for dereplicating amplicon sequences from fastq or compressed fastq files,
while also controlling peak memory requirement to support large files.
Usage
derepFastq(fls, n = 1e+06, verbose = FALSE, qualityType = "Auto")
Arguments
fls |
(Required). character.
The file path(s) to the fastq file(s), or a directory containing fastq file(s).
Compressed file formats such as .fastq.gz and .fastq.bz2 are supported.
|
n |
(Optional). numeric(1).
The maximum number of records (reads) to parse and dereplicate
at any one time. This controls the peak memory requirement
so that large fastq files are supported.
Default is 1e6, one-million reads.
See FastqStreamer for details on this parameter,
which is passed on.
|
verbose |
(Optional). Default FALSE.
If TRUE, throw standard R messages
on the intermittent and final status of the dereplication.
|
qualityType |
(Optional). character(1).
The quality encoding of the fastq file(s). "Auto" (the default) means to
attempt to auto-detect the encoding. This may fail for PacBio files with
uniformly high quality scores, in which case use "FastqQuality". This
parameter is passed on to readFastq; see
information there for details.
|
Value
A derep-class object or list of such objects.
Examples
# Test that chunk-size, `n`, does not affect the result.
testFastq = system.file("extdata", "sam1F.fastq.gz", package="dada2")
derep1 = derepFastq(testFastq, verbose = TRUE)
derep1.35 = derepFastq(testFastq, 35, TRUE)
all.equal(getUniques(derep1), getUniques(derep1.35)[names(getUniques(derep1))])
[Package
dada2 version 1.16.0
Index]