GC_content              Calculates GC content percentage for each read
                        in the dataset.
adapter_content         Creates an abundance table of adapters, sorted
                        from most to least frequent, for adapters found
                        in more than 0.1% of the reads. If output_file
                        is specified, the entire set of adapters and
                        counts is saved. Only available on macOS/Linux
                        due to a dependency on C++14.
calc_adapter_content    Compute adapter content in reads. This function
                        is only available for macOS/Linux.
calc_format_score       Calculate score based on Illumina format
calc_over_rep_seq       Calculate sequence counts for each unique
                        sequence and create a table with unique
                        sequences and corresponding counts
dimensions              Extract the number of columns and rows for a
                        FASTQ file using seqTools.
find_format             Gets quality score encoding format from the
                        FASTQ file. Return possibilities are
                        Sanger(/Illumina1.8), Solexa(/Illumina1.0),
                        Illumina1.3, and Illumina1.5. Detection is
                        heuristic and may not be 100% accurate because
                        the encodings overlap, so it is best if you
                        already know the format.
gc_per_read             Calculate GC nucleotide sequence content per
                        read of the FASTQ gzipped file
kmer_count              Return kmer count per sequence for the length
                        of kmer desired
overrep_kmer            Generate overrepresented kmers of length k
                        based on their observed to expected ratio at
                        each position across all sequences in the
                        dataset. The expected proportion of a length k
                        kmer assumes site independence and is computed
                        as the sum of the count of each base pair in
                        the kmer times the probability of observing
                        that base pair in the data set, i.e.
                        P(A)*count_in_kmer(A) + P(C)*count_in_kmer(C)
                        + ...
                        The observed to expected ratio is computed as
                        log2(obs/exp). Those with obsexp_ratio > 2 are
                        considered to be overrepresented and appear in
                        the returned data frame along with their
                        position in the sequence.
overrep_reads           Sort all sequences per read by count.
per_base_quality        Compute the mean, median, and percentiles of
                        quality score per base. This is returned as a
                        data frame.
per_read_quality        Compute the mean quality score per read.
plot_GC_content         Generate mean GC content histogram.
plot_adapter_content    Creates a bar plot of the 5 most frequent
                        adapter sequences.
plot_outliers           Determine how to plot outliers. The heuristic
                        is whether their obsexp_ratio values differ by
                        more than 1 and whether they fall into the same
                        bin. If the obsexp_ratio of two outliers differs
                        by less than 0.4 and they are in the same bin,
                        they are combined into a single plotting point.
                        NOT FULLY FUNCTIONAL
plot_overrep_kmer       Create a box plot of the
                        log2(observed/expected) ratio across the length
                        of the sequence as well as top overrepresented
                        kmers. Only ratios greater than 2 are included
                        in the box plot. Default is 20 bins across the
                        length of the sequence and the top 2
                        overrepresented kmers, but this can be changed
                        by the user.
plot_overrep_reads      Plot the top 5 sequences
plot_per_base_quality   Generate a boxplot of the per position quality
                        score.
plot_per_read_quality   Plot the mean quality score per sequence as a
                        histogram. High quality sequences are those
                        mostly distributed over 30. Low quality
                        sequences are those mostly under 30.
plot_read_content       Plot the per position nucleotide content.
plot_read_length        Plot a histogram of the number of reads with
                        each read length.
qual_score_per_read     Calculate the mean quality score per read of
                        the FASTQ gzipped file
read_base_content       Compute nucleotide content per position for a
                        single base pair. Wrapper function around
                        seqTools.
read_content            Compute nucleotide content per position.
                        Wrapper function around seqTools.
read_length             Creates a data frame of read lengths and the
                        number of reads with that read length.
run_all                 Runs all functions in the qckitfastq suite
                        and saves the data frames and plots to a
                        user-provided directory. Plot names are
                        supplied by default.
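
The find_format entry above notes that encoding detection is heuristic because the encodings overlap. A minimal sketch of one such heuristic follows, written in Python for illustration only. It is not the package's implementation: the function name guess_quality_encoding and the short return labels are hypothetical, and the cut-offs are the commonly cited lowest ASCII codes for each encoding (33 for Sanger, 59 for Solexa, 64 for Illumina 1.3, 66 for Illumina 1.5).

```python
def guess_quality_encoding(quality_strings):
    """Guess a FASTQ quality encoding from the lowest ASCII code observed.

    Hypothetical helper for illustration; thresholds are assumptions based
    on the commonly cited ASCII ranges of each encoding.
    """
    codes = [ord(ch) for qual in quality_strings for ch in qual]
    if not codes:
        raise ValueError("no quality characters supplied")

    lowest = min(codes)
    if lowest < 33:
        raise ValueError("character below '!' is not a valid quality symbol")
    if lowest < 59:
        # Only the Phred+33 (Sanger / Illumina 1.8) range reaches below 59.
        return "Sanger"
    if lowest < 64:
        # Solexa scores can be as low as -5, i.e. ASCII 59 with offset 64.
        return "Solexa"
    if lowest < 66:
        # Illumina 1.3 uses Phred+64 starting at score 0 (ASCII 64).
        return "Illumina1.3"
    # Illumina 1.5 typically starts at 'B' (ASCII 66). A Sanger file that
    # contains only high scores would also land here, which is the overlap
    # that makes the guess unreliable.
    return "Illumina1.5"


if __name__ == "__main__":
    print(guess_quality_encoding(["!!5I", "II5+"]))
```

Because the guess depends only on the lowest symbol seen, a small or unusually high-quality sample can be misclassified, which is why knowing the format in advance is preferable.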
