| generate_kmers {transite} | R Documentation |
Counts occurrences of k-mers of length k in the given set of
sequences. Corrects for homopolymeric stretches.
generate_kmers(sequences, k)
| sequences | character vector of DNA or RNA sequences |
| k | length of k-mer, either |
Returns a named numeric vector, where the elements are k-mer counts and the names are DNA k-mers.
generate_kmers always returns DNA k-mers, even if sequences contains RNA sequences.
RNA sequences are internally converted to DNA sequences.
DNA and RNA sequences must not be mixed in the same input.
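To illustrate the shape of the return value and the RNA-to-DNA conversion described above, here is a minimal base R sketch. It is not the transite implementation: count_kmers_naive is a hypothetical helper introduced only for illustration, it applies no homopolymer correction, and it reports only k-mers that actually occur in the input.

```r
# Hypothetical helper, not part of transite: naive k-mer counting
# without any homopolymer correction.
count_kmers_naive <- function(sequences, k) {
  # Convert RNA to DNA so that all names are DNA k-mers
  dna <- gsub("U", "T", toupper(sequences), fixed = TRUE)
  # Sequences shorter than k contain no k-mers
  dna <- dna[nchar(dna) >= k]
  # Extract every overlapping window of length k
  kmers <- unlist(lapply(dna, function(s) {
    starts <- seq_len(nchar(s) - k + 1)
    substring(s, starts, starts + k - 1)
  }))
  # Tabulate and return as a named numeric vector
  counts <- table(kmers)
  setNames(as.numeric(counts), names(counts))
}

# RNA input, DNA k-mer names in the result
counts <- count_kmers_naive(c("AUGGAU", "UGGA"), 3)
print(counts)
```

Because this sketch counts every overlapping window, its counts for sequences with homopolymeric stretches (for example "UUUUUUU") can differ from what generate_kmers returns.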
Other k-mer functions:
calculate_kmer_enrichment(),
check_kmers(),
compute_kmer_enrichment(),
count_homopolymer_corrected_kmers(),
draw_volcano_plot(),
estimate_significance_core(),
estimate_significance(),
generate_permuted_enrichments(),
run_kmer_spma(),
run_kmer_tsma()
# count hexamers in set of RNA sequences
rna_sequences <- c(
  "CAACAGCCUUAAUU", "CAGUCAAGACUCC", "CUUUGGGGAAU",
  "UCAUUUUAUUAAA", "AAUUGGUGUCUGGAUACUUCCCUGUACAU",
  "AUCAAAUUA", "AGAU", "GACACUUAAAGAUCCU",
  "UAGCAUUAACUUAAUG", "AUGGA", "GAAGAGUGCUCA",
  "AUAGAC", "AGUUC", "CCAGUAA",
  "UUAUUUA", "AUCCUUUACA", "UUUUUUU",
  "UUUCAUCAUU", "CCACACAC", "CUCAUUGGAG",
  "ACUUUGGGACA", "CAGGUCAGCA"
)
hexamer_counts <- generate_kmers(rna_sequences, 6)

# count heptamers in set of DNA sequences
dna_sequences <- c(
  "CAACAGCCTTAATT", "CAGTCAAGACTCC", "CTTTGGGGAAT",
  "TCATTTTATTAAA", "AATTGGTGTCTGGATACTTCCCTGTACAT",
  "ATCAAATTA", "AGAT", "GACACTTAAAGATCCT",
  "TAGCATTAACTTAATG", "ATGGA", "GAAGAGTGCTCA",
  "ATAGAC", "AGTTC", "CCAGTAA",
  "TTATTTA", "ATCCTTTACA", "TTTTTTT",
  "TTTCATCATT", "CCACACAC", "CTCATTGGAG",
  "ACTTTGGGACA", "CAGGTCAGCA"
)
heptamer_counts <- generate_kmers(dna_sequences, 7)