| simData {muscat} | R Documentation |
Simulation of complex scRNA-seq data
simData( x, n_genes = 500, n_cells = 300, probs = NULL, p_dd = diag(6)[1, ], p_type = 0, lfc = 2, rel_lfc = NULL )
x |
a SingleCellExperiment (in the examples, one prepared with prepSim). |
n_genes |
# of genes to simulate. |
n_cells |
# of cells to simulate. Either a single numeric or a range to sample from. |
probs |
a list of length 3 containing probabilities of a cell belonging to each cluster, sample, and group, respectively. List elements must be NULL (equal probabilities) or numeric values in [0, 1] that sum to 1. |
p_dd |
numeric vector of length 6. Specifies the probability of a gene being EE, EP, DE, DP, DM, or DB, respectively. |
p_type |
numeric. Probability of an EE/EP gene being a type-gene. If a gene is of class "type" in a given cluster, a unique mean will be used for that gene in the respective cluster. |
lfc |
numeric value to use as the mean logFC for genes of type DE, DP, DM, and DB. |
rel_lfc |
numeric vector of relative logFCs for each cluster.
Should be of length |
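As a minimal sketch of how the p_dd and probs arguments might be assembled, the base-R snippet below builds a probability vector over the six gene categories and a three-element list for clusters, samples, and groups, then checks the constraints described above. The helper check_probs and the names attached to p_dd are illustrative assumptions, not part of muscat; the number of samples (2) is likewise assumed.

```r
# Category order as documented for p_dd: EE, EP, DE, DP, DM, DB.
cats <- c("EE", "EP", "DE", "DP", "DM", "DB")

# The default, diag(6)[1, ], puts all probability mass on EE.
p_dd_default <- setNames(diag(6)[1, ], cats)

# 90% EE and 10% DE genes (names are for readability only).
p_dd <- setNames(c(0.9, 0, 0.1, 0, 0, 0), cats)

# Hypothetical helper (not a muscat function): TRUE if p is NULL,
# or a numeric vector with values in [0, 1] that sum to 1.
check_probs <- function(p) {
    is.null(p) ||
        (is.numeric(p) && all(p >= 0 & p <= 1) &&
            isTRUE(all.equal(sum(p), 1)))
}

# probs: cluster, sample, and group probabilities, in that order.
# NULL means equal probabilities; two samples are assumed here.
probs <- list(NULL, c(0.25, 0.75), NULL)

stopifnot(
    length(p_dd) == 6,
    check_probs(p_dd),
    length(probs) == 3,
    all(vapply(probs, check_probs, logical(1)))
)
```

Checking the inputs up front like this keeps a mistyped probability vector from surfacing later as a harder-to-read error inside the simulation call.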
simData simulates multiple clusters and samples
across 2 experimental conditions from a real scRNA-seq data set.
a SingleCellExperiment
containing multiple clusters & samples across 2 groups.
Helena L Crowell
Crowell, HL, Soneson, C, Germain, P-L, Calini, D, Collin, L, Raposo, C, Malhotra, D & Robinson, MD: On the discovery of population-specific state transitions from multi-sample multi-condition single-cell RNA sequencing data. bioRxiv 713412 (2018). doi: https://doi.org/10.1101/713412
library(muscat)
data(sce)
library(SingleCellExperiment)
# prep. SCE for simulation
sce <- prepSim(sce)
# simulate data
(sim <- simData(sce,
    n_genes = 100, n_cells = 10,
    p_dd = c(0.9, 0, 0.1, 0, 0, 0)))
# simulation metadata
head(gi <- metadata(sim)$gene_info)
# should be ~10% DE
table(gi$category)
# unbalanced sample sizes
sim <- simData(sce,
    n_genes = 10, n_cells = 100,
    probs = list(NULL, c(0.25, 0.75), NULL))
table(sim$sample_id)
# one group only
sim <- simData(sce,
    n_genes = 10, n_cells = 100,
    probs = list(NULL, NULL, c(1, 0)))
levels(sim$group_id)
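
To make the "~10% DE" comment in the first example concrete, the following base-R sketch draws a category for each of 100 genes from p_dd and tabulates the result. It only illustrates why the observed share can fluctuate around 10% under independent sampling; this is an assumption made for illustration, not a description of how simData assigns categories internally.

```r
set.seed(1)  # for a reproducible draw

cats <- c("EE", "EP", "DE", "DP", "DM", "DB")
p_dd <- c(0.9, 0, 0.1, 0, 0, 0)
n_genes <- 100

# Draw one category per gene with the given probabilities.
draw <- sample(cats, n_genes, replace = TRUE, prob = p_dd)

# Count genes per category, keeping categories that were never drawn.
counts <- setNames(
    tabulate(match(draw, cats), nbins = length(cats)),
    cats)

# Expected counts under independent sampling: n_genes * p_dd.
expected <- setNames(n_genes * p_dd, cats)

print(rbind(observed = counts, expected = expected))
```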