| agglomerate-methods {mia} | R Documentation |
agglomerateByRank sums up data based on their association with a
taxonomic rank given in rowData. Only the ranks returned by
taxonomyRanks can be used.
## S4 method for signature 'SummarizedExperiment'
agglomerateByRank(
  x,
  rank = taxonomyRanks(x)[1],
  onRankOnly = FALSE,
  na.rm = FALSE,
  empty.fields = c(NA, "", " ", "\t", "-", "_"),
  ...
)

## S4 method for signature 'SingleCellExperiment'
agglomerateByRank(x, ..., altexp = NULL, strip_altexp = TRUE)

## S4 method for signature 'TreeSummarizedExperiment'
agglomerateByRank(x, ..., agglomerateTree = FALSE)
| x | a |
| rank | a single character defining a taxonomic rank. Must be a value of |
| onRankOnly | |
| na.rm | |
| empty.fields | a |
| ... | arguments passed to |
| altexp | String or integer scalar specifying an alternative experiment containing the input data. |
| strip_altexp | |
| agglomerateTree | |
Depending on the available taxonomic data and its structure, setting
onRankOnly = TRUE has implications for the interpretability of
your results. If no loops exist (a loop meaning that two higher ranks contain
the same lower rank), the results should be comparable. You can check for
loops using detectLoop.
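The notion of a loop can be illustrated without mia. The following is a minimal base R sketch on a hypothetical toy taxonomy table. The data frame tax and the helper has_loop are assumptions made for illustration only; this is not the implementation of detectLoop. A lower-rank value is flagged when it occurs under more than one distinct higher-rank value.

```r
# Hypothetical toy taxonomy table: genus "G1" is listed under two families.
tax <- data.frame(
  Family = c("F1", "F2", "F3", "F3"),
  Genus  = c("G1", "G1", "G2", "G3"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of mia): for every value of the lower rank,
# count the distinct higher-rank values it appears under and flag those
# that appear under more than one.
has_loop <- function(tax, higher, lower) {
  parents <- split(tax[[higher]], tax[[lower]])
  n_parents <- vapply(parents, function(p) length(unique(p)), integer(1))
  n_parents > 1
}

loops <- has_loop(tax, higher = "Family", lower = "Genus")
names(loops)[loops]  # lower-rank values involved in a loop
```

In this toy table only G1 is flagged, because it is assigned to both F1 and F2.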
Agglomeration sums up the values of assays at the specified taxonomic level. For certain assays, e.g. those that include binary or negative values, summing can produce meaningless values. In those cases, consider doing the agglomeration first and the transformation afterwards.
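Why the order of agglomeration and transformation matters can be seen in a small base R sketch. It assumes a hypothetical toy count matrix and uses rowsum as a stand-in for summing features within a group; it does not reproduce what agglomerateByRank does internally.

```r
# Hypothetical toy count matrix: 4 features (rows) in 2 samples (columns).
counts <- matrix(
  c(10, 0,
     3, 5,
     2, 7,
     0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), c("S1", "S2"))
)
# Hypothetical family assignment of the four features.
family <- c("FamA", "FamA", "FamB", "FamB")

# Transform first, then sum: presence/absence values are added up,
# so the result is no longer binary.
pa_first <- rowsum((counts > 0) * 1, group = family)

# Sum first, then transform: the result is a valid presence/absence table.
agg <- rowsum(counts, group = family)
pa_last <- (agg > 0) * 1

pa_first
pa_last
```

Summing the already transformed table yields values above 1, whereas transforming the summed counts keeps every entry at 0 or 1.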
A taxonomically-agglomerated, optionally-pruned object of the same
class as x.
mergeRows,
sumCountsAcrossFeatures
data(GlobalPatterns)
# print the available taxonomic ranks
colnames(rowData(GlobalPatterns))
taxonomyRanks(GlobalPatterns)
# agglomerate at the Family taxonomic rank
x1 <- agglomerateByRank(GlobalPatterns, rank="Family")
## How many taxa before/after agglomeration?
nrow(GlobalPatterns)
nrow(x1)
# with agglomeration of the tree
x2 <- agglomerateByRank(GlobalPatterns, rank="Family",
agglomerateTree = TRUE)
nrow(x2) # same number of rows, but
rowTree(x1) # ... different
rowTree(x2) # ... tree
# If the assay contains binary or negative values, summing might lead to
# meaningless values, and you will get a warning. In these cases, you might
# want to repeat the transformation after agglomerating at the chosen
# taxonomic level.
tse <- transformSamples(GlobalPatterns, method = "pa")
tse <- agglomerateByRank(tse, rank = "Genus")
tse <- transformSamples(tse, method = "pa")
# removing empty labels by setting na.rm = TRUE
sum(is.na(rowData(GlobalPatterns)$Family))
x3 <- agglomerateByRank(GlobalPatterns, rank="Family", na.rm = TRUE)
nrow(x3) # different from x2
# Because all the rownames are from the same rank, rownames do not include
# prefixes, in this case "Family:".
print(rownames(x3[1:3,]))
# To add them, use getTaxonomyLabels function.
rownames(x3) <- getTaxonomyLabels(x3, with_rank = TRUE)
print(rownames(x3[1:3,]))
# use 'remove_empty_ranks' to remove columns that include only NAs
x4 <- agglomerateByRank(GlobalPatterns, rank="Phylum", remove_empty_ranks = TRUE)
head(rowData(x4))
## Look at enterotype dataset...
data(enterotype)
## print the available taxonomic ranks. Only one rank is available,
## so agglomerateByRank is not useful here
taxonomyRanks(enterotype)