| apeglm {apeglm} | R Documentation |
apeglm provides Bayesian shrinkage estimators for effect sizes in GLM models, using approximation of the posterior for individual coefficients.
apeglm(Y, x, log.lik, param = NULL, coef = NULL, mle = NULL,
no.shrink = FALSE, interval.type = c("laplace", "HPD", "credible"),
interval.level = 0.95, threshold = NULL, contrasts, weights = NULL,
offset = NULL, flip.sign = TRUE, prior.control, multiplier = 1,
ngrid = 50, nse = 5, ngrid.nuis = 5, nse.nuis = 2, log.link = TRUE,
param.sd = NULL, optim.method = "BFGS", bounds = c(-Inf, Inf))
Y |
the observations, which can be a matrix or SummarizedExperiment,
with columns for samples and rows for "features" (e.g. genes in a genomic context).
If Y is a SummarizedExperiment, |
x |
design matrix, with intercept in the first column |
log.lik |
the log of likelihood function, specified by the user.
For Negative Binomial distribution, user can use |
param |
the other parameter(s) to be used in the likelihood function, e.g. the dispersion parameter for a negative binomial distribution. this can be a vector or a matrix (with columns as parameters) |
coef |
(optional) the index of the coefficient for which to generate posterior estimates (FSR and intervals) |
mle |
(optional) a 2 column matrix giving the MLE and its standard error
of |
no.shrink |
logical, if TRUE, apeglm won't perform shrinkage (default is FALSE) |
interval.type |
(optional) can be "laplace", "HPD", or "credible", which specifies the type of Bayesian interval that the user wants to output; "laplace" represents the Laplace approximation of the posterior mode |
interval.level |
(optional) default is 0.95 |
threshold |
(optional) a threshold for integrating posterior probabilities, see details under 'Value'. Note that this should be on the natural log scale for a log link GLM. |
contrasts |
(optional) contrast matrix, same number of rows as |
weights |
(optional) weights matrix, same shape as |
offset |
(optional) offsets matrix, same shape as |
flip.sign |
whether to flip the sign of threshold value when MAP is negative, default is TRUE (threshold must then be positive) |
prior.control |
see Details |
multiplier |
a positive number, when the prior is adapted to the |
ngrid |
the number of grid points for grid integration of intervals |
nse |
the number of Laplace standard errors to set the left and right edges of the grid around the MAP |
ngrid.nuis |
the number of grid points for nuisance parameters |
nse.nuis |
the number of Laplace standard errors to set the left and right edges of the grid around the MAP of the nuisance parameters |
log.link |
whether the GLM has a log link (default = TRUE) |
param.sd |
(optional) potential uncertainty measure on the parameter |
optim.method |
the method passed to |
bounds |
the bounds for the numeric optimization |
prior.control is a list of parameters that will be passed to determine
the prior distribution. Users are allowed to have a Normal prior on the
intercept, and a t prior on the non-intercept coefficients (similar
to bayesglm in the arm package. The following are defaults:
no.shrink = 1: index of the coefficient(s) not to shrink
prior.mean = 0: mean of t prior
prior.scale = 1: scale of t prior
prior.df = 1: df of t prior
prior.no.shrink.mean = 0: mean of Normal
prior.no.shrink.scale = 15: scale of Normal
So without specifying prior.control, the following is set inside apeglm:
prior.control <- list(no.shrink=1,prior.mean=0,prior.scale=1,
prior.df=1,prior.no.shrink.mean=0,prior.no.shrink.scale=15)
Note that the prior should be defined on the natural log scale for a log link GLM.
a list of matrices containing the following components:
map: a matrix of MAP estimates, columns for coefficients and rows for features
se: a matrix of SE estimates, same shape as map
(note: in future versions of apeglm, this slot is renamed sd).
prior.control: a list with details on the prior
fsr: a vector of the false sign rate for coef
interval: a matrix of either HPD or credible interval for coef
thresh: a vector of the posterior probability that the estimated parameter
is smaller than the threshold value specified in threshold
when MAP is positive (or greater than
-1 * threshold value when MAP is negative and flip.sign is TRUE)
diag: a matrix of diagnostics
contrast.map: a vector of MAP corresponding to the contrast
when contrast is given
contrast.se: a vector of SE corresponding to the contrast
when contrast is given
ranges: a GRanges or GRangesList with the estimated coefficients,
if Y was a SummarizedExperiment.
Note that all parameters associated with coefficients,
e.g. map, se, etc., are returned on the natural log scale for a log link GLM.
# Simulate RNA-Seq read counts data
# 5 samples for each of the two groups
# a total of 100 genes
n.per.group <- 5
n <- n.per.group * 2
m <- 100
# The design matrix includes one column of intercept
# and one column indicating samples that belong to the second group
condition <- factor(rep(letters[1:2], each = n.per.group))
x <- model.matrix(~condition)
# Specify the standard deviation of beta (LFC between groups)
beta.sd <- 2
beta.cond <- rnorm(m, 0, beta.sd)
beta.intercept <- runif(m, 2, 6)
beta.mat <- cbind(beta.intercept, beta.cond)
# Generate the read counts
mu <- exp(t(x %*% t(beta.mat)))
Y <- matrix(rnbinom(m*n, mu=mu, size=1/.1), ncol = n)
# Here we will use the negative binomial log likelihood
# which is an exported function. See 'logLikNB' for details.
# For the NB:
# 'param' is the dispersion estimate (1/size)
# 'offset' can be used to adjust for size factors (log of size factors)
param <- matrix(0.1, nrow = m, ncol = 1)
offset <- matrix(0, nrow = m, ncol = n)
# Shrinkage estimator of betas:
# (for adaptive shrinkage, 'apeglm' requires 'mle' coefficients
# estimated with another software, or by first running 'apeglm'
# setting 'no.shrink=TRUE'.)
res <- apeglm(Y = Y, x = x,
log.lik = logLikNB,
param = param,
offset = offset,
coef = 2)
head(res$map)
plot(beta.mat[,2], res$map[,2])
abline(0,1)