| test_gene_overrepresentation {tidybulk} | R Documentation |
test_gene_overrepresentation() takes as input a 'tbl' formatted as | <SAMPLE> | <ENSEMBL_ID> | <COUNT> | <...> | and returns a 'tbl' with the GSEA statistics
test_gene_overrepresentation(.data, .sample = NULL, .entrez, .do_test, species)

## S4 method for signature 'spec_tbl_df'
test_gene_overrepresentation(.data, .sample = NULL, .entrez, .do_test, species)

## S4 method for signature 'tbl_df'
test_gene_overrepresentation(.data, .sample = NULL, .entrez, .do_test, species)

## S4 method for signature 'tidybulk'
test_gene_overrepresentation(.data, .sample = NULL, .entrez, .do_test, species)
.data |
A 'tbl' formatted as | <SAMPLE> | <TRANSCRIPT> | <COUNT> | <...> | |
.sample |
The name of the sample column |
.entrez |
The ENTREZ ID of the transcripts/genes |
.do_test |
The name of a boolean (logical) column, given as a symbol. It indicates which transcripts to check |
species |
A character. For example, human or mouse. MSigDB uses the Latin species names (e.g., "Mus musculus", "Homo sapiens") |
This wrapper runs a gene enrichment analysis on the dataset, using a list of transcripts and GSEA. It uses clusterProfiler as the backend.
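As background, over-representation analyses are commonly based on a hypergeometric test: given a background of genes, how surprising is the overlap between the flagged genes and a gene set? The sketch below illustrates that idea in base R only. It assumes this is the kind of test run behind the wrapper, which this page does not state; it is not the wrapper's own code, and all counts are hypothetical toy values.

```r
# Hypothetical toy numbers, chosen only for illustration
universe_size <- 1000  # genes in the background
set_size      <- 50    # background genes annotated to one gene set
selected      <- 40    # genes flagged for testing (do_test == TRUE)
overlap       <- 8     # flagged genes that fall in the gene set

# Probability of seeing at least `overlap` gene-set members among the
# flagged genes if they were drawn at random from the background
p_value <- phyper(
  overlap - 1,                 # P(X > overlap - 1) = P(X >= overlap)
  set_size,                    # "successes" in the background
  universe_size - set_size,    # "failures" in the background
  selected,                    # number of draws
  lower.tail = FALSE
)

# The same test written as a one-sided Fisher exact test on a 2x2 table
# (rows: in set / not in set; columns: flagged / not flagged)
contingency <- matrix(
  c(overlap, selected - overlap,
    set_size - overlap, universe_size - set_size - selected + overlap),
  nrow = 2
)
p_fisher <- fisher.test(contingency, alternative = "greater")$p.value

print(p_value)
print(p_fisher)
```

With 40 flagged genes and a gene set covering 5% of the background, about 2 overlapping genes would be expected by chance, so an overlap of 8 yields a small p-value.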
A 'tbl' object
# Map gene symbols to ENTREZ IDs
df_entrez = symbol_to_entrez(tidybulk::counts_mini, .transcript = transcript, .sample = sample)
# Sum the counts of transcripts that share an ENTREZ ID
df_entrez = aggregate_duplicates(df_entrez, aggregation_function = sum, .sample = sample, .transcript = entrez, .abundance = count)
# Flag the genes of interest to be tested
df_entrez = dplyr::mutate(df_entrez, do_test = transcript %in% c("TNFRSF4", "PLCH2", "PADI4", "PAX7"))

test_gene_overrepresentation(
  df_entrez,
  .sample = sample,
  .entrez = entrez,
  .do_test = do_test,
  species = "Homo sapiens"
)