| controlled_geneset_enrichment {EWCE} | R Documentation |
controlled_geneset_enrichment tests whether a functional gene set is
still enriched in a disease gene set after controlling for the
disease gene set's enrichment in a particular cell type (the 'controlledCT').
controlled_geneset_enrichment(
  disease_genes,
  functional_genes,
  bg_genes,
  sct_data,
  annotLevel,
  reps,
  controlledCT
)
| disease_genes | Array of gene symbols containing the disease gene list. Does not have to be disease genes. Must be from the same species as the single cell transcriptome dataset. |
| functional_genes | Array of gene symbols containing the functional gene list. The enrichment of this gene set within disease_genes is tested. Must be from the same species as the single cell transcriptome dataset. |
| bg_genes | Array of gene symbols containing the background gene list. |
| sct_data | List generated using |
| annotLevel | An integer indicating which level of the annotation to analyse. Default = 1. |
| reps | Number of random gene lists to generate. Default = 100, but should be over 10000 for publication-quality results. |
| controlledCT | (Optional) Name of a cell type. If not NULL, the bootstrapping controls for expression within that cell type. |
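The idea of controlling for expression within a cell type can be illustrated with a minimal base R sketch. It is not EWCE's implementation: the toy specificity values, the decile binning, and the helper name sample_matched_genes are all assumptions made for illustration. In this sketch, each random gene list is drawn so that its genes fall into the same specificity bins, for the controlled cell type, as the disease genes.

```r
set.seed(1)

# Hypothetical toy data: specificity of 200 background genes in the
# controlled cell type (values are invented for illustration).
bg_genes <- paste0("gene", 1:200)
specificity <- setNames(runif(200), bg_genes)

# Assign each gene to a specificity decile (bins 1..10).
breaks <- quantile(specificity, probs = seq(0, 1, 0.1))
decile <- cut(specificity, breaks = breaks, labels = FALSE,
              include.lowest = TRUE)
names(decile) <- bg_genes

# Hypothetical disease gene list.
disease_genes <- sample(bg_genes, 20)

# Draw one random list whose decile composition matches the disease list.
# sample_matched_genes is a hypothetical helper, not an EWCE function.
sample_matched_genes <- function(disease_genes, decile) {
  by_bin <- split(disease_genes, decile[disease_genes])
  unlist(lapply(by_bin, function(hits) {
    bin <- decile[hits[1]]
    pool <- names(decile)[decile == bin]
    sample(pool, length(hits))
  }), use.names = FALSE)
}

random_list <- sample_matched_genes(disease_genes, decile)
```

Because every random list shares the disease list's specificity profile in the controlled cell type, any remaining overlap with the functional gene set cannot be attributed to that cell type alone.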
A list containing the following elements:
| p_controlled | The probability that functional_genes are enriched in disease_genes while controlling for the level of specificity in controlledCT |
| z_controlled | The z-score that functional_genes are enriched in disease_genes while controlling for the level of specificity in controlledCT |
| p_uncontrolled | The probability that functional_genes are enriched in disease_genes WITHOUT controlling for the level of specificity in controlledCT |
| z_uncontrolled | The z-score that functional_genes are enriched in disease_genes WITHOUT controlling for the level of specificity in controlledCT |
| reps | The value of the reps argument |
| controlledCT | The value of the controlledCT argument |
| actualOverlap | The number of genes that overlap between the functional and disease gene sets |
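How a probability and z-score of this kind could be derived from bootstrap overlaps can be sketched in base R. This is an illustration under stated assumptions, not EWCE's code: the gene lists are invented, the random lists are drawn without any cell-type control, and the exact comparison EWCE uses (for example strict versus non-strict inequality) is not stated on this page.

```r
set.seed(42)

# Hypothetical toy gene lists (invented for illustration).
bg_genes <- paste0("gene", 1:500)
disease_genes <- bg_genes[1:40]
functional_genes <- bg_genes[21:80]

# Observed overlap between the functional and disease gene sets.
actual_overlap <- length(intersect(functional_genes, disease_genes))

# Overlap of the functional set with each random list of the same size.
reps <- 1000
boot_overlap <- replicate(reps, {
  random_list <- sample(bg_genes, length(disease_genes))
  length(intersect(functional_genes, random_list))
})

# Empirical probability: fraction of random lists overlapping at least as much.
p_value <- sum(boot_overlap >= actual_overlap) / reps

# z-score: distance of the observed overlap from the bootstrap mean, in SDs.
z_score <- (actual_overlap - mean(boot_overlap)) / sd(boot_overlap)
```

With a small number of replicates the smallest non-zero probability that can be resolved is 1/reps, which is one reason a large reps value is recommended for publishable analyses.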
library(EWCE)
library(ewceData)
# See the vignette for more detailed explanations
# Gene set enrichment analysis controlling for cell type expression

# Set seed for bootstrap reproducibility
set.seed(12345678)

# Load the example cell type data and the mouse-to-human homolog table
ctd <- ctd()
mouse_to_human_homologs <- mouse_to_human_homologs()
m2h <- unique(mouse_to_human_homologs[, c("HGNC.symbol", "MGI.symbol")])

# Load the example human gene lists
schiz_genes <- schiz_genes()
id_genes <- id_genes()
hpsd_genes <- hpsd_genes()

# Map the human gene lists to mouse symbols; use all mapped genes as background
mouse.hits.schiz <- unique(m2h[m2h$HGNC.symbol %in% schiz_genes, "MGI.symbol"])
mouse.hpsd <- unique(m2h[m2h$HGNC.symbol %in% hpsd_genes, "MGI.symbol"])
mouse.bg <- unique(m2h$MGI.symbol)

# Use 3 bootstrap lists for speed, for publishable analysis use >10000
reps <- 3

res_hpsd_schiz <- controlled_geneset_enrichment(
    disease_genes = mouse.hits.schiz,
    functional_genes = mouse.hpsd,
    bg_genes = mouse.bg,
    sct_data = ctd,
    annotLevel = 1,
    reps = reps,
    controlledCT = "pyramidal CA1"
)