text_token {ttgsea}        R Documentation
Description

Text is tokenized into n-grams. This function can also be used to limit the total number of tokens.
Usage

text_token(text, ngram_min = 1, ngram_max = 1, num_tokens)
Arguments

text        text data
ngram_min   minimum size of an n-gram (default: 1)
ngram_max   maximum size of an n-gram (default: 1)
num_tokens  maximum number of tokens
Value

A list containing:

token       result of tokenizing text
ngram_min   minimum size of an n-gram
ngram_max   maximum size of an n-gram
Author(s)

Dongmin Jung
See Also

tm::removeWords, stopwords::stopwords, textstem::lemmatize_strings, text2vec::create_vocabulary, text2vec::prune_vocabulary
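These cross-referenced helpers cover the typical preprocessing steps around tokenization. The sketch below shows one way they might be composed by hand; the sample text and the exact composition are illustrative assumptions, not the internals of text_token:

# illustrative preprocessing sketch; not the internals of text_token
library(tm)
library(stopwords)
library(textstem)
library(text2vec)
txt <- c("Rora activates gene expression",
         "RNA Polymerase I Chain Elongation")
txt <- removeWords(tolower(txt), stopwords::stopwords("en"))  # drop stop words
txt <- lemmatize_strings(txt)                                 # lemmatize
it <- itoken(word_tokenizer(txt))                             # iterator over word tokens
vocab <- create_vocabulary(it, ngram = c(1L, 2L))             # uni- and bigrams
vocab <- prune_vocabulary(vocab, vocab_term_max = 1000)       # cap vocabulary size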
Examples

library(ttgsea)
library(fgsea)
data(examplePathways)
data(exampleRanks)
# drop the numeric ID prefix from pathway names and replace underscores
names(examplePathways) <- gsub("_", " ",
                               substr(names(examplePathways), 9, 1000))
set.seed(1)
fgseaRes <- fgsea(examplePathways, exampleRanks)
# tokenize the pathway names, keeping at most 1000 tokens
tokens <- text_token(data.frame(fgseaRes)[,"pathway"],
                     num_tokens = 1000)
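To see the effect of the n-gram arguments, the same pathway names can be tokenized into unigrams and bigrams; inspecting the result as a list below assumes the return structure described under Value:

# assuming the list return described under Value
bigrams <- text_token(data.frame(fgseaRes)[,"pathway"],
                      ngram_min = 1, ngram_max = 2,
                      num_tokens = 1000)
head(bigrams$token)   # tokenized text, now including bigrams
bigrams$ngram_max     # 2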