| cpm {edgeR} | R Documentation |
Compute counts per million (CPM) or reads per kilobase per million (RPKM).
## S3 method for class 'DGEList'
cpm(y, normalized.lib.sizes = TRUE,
log = FALSE, prior.count = 0.25, ...)
## Default S3 method:
cpm(y, lib.size = NULL,
log = FALSE, prior.count = 0.25, ...)
## S3 method for class 'DGEList'
rpkm(y, gene.length = NULL, normalized.lib.sizes = TRUE,
log = FALSE, prior.count = 0.25, ...)
## Default S3 method:
rpkm(y, gene.length, lib.size = NULL,
log = FALSE, prior.count = 0.25, ...)
## S3 method for class 'DGEList'
cpmByGroup(y, group = NULL, dispersion = NULL, ...)
## Default S3 method:
cpmByGroup(y, group = NULL,
dispersion = 0.05, offset = NULL, weights = NULL, ...)
## S3 method for class 'DGEList'
rpkmByGroup(y, group = NULL, gene.length = NULL, dispersion = NULL, ...)
## Default S3 method:
rpkmByGroup(y, group = NULL, gene.length,
dispersion = 0.05, offset = NULL, weights = NULL, ...)
y |
matrix of counts or a |
normalized.lib.sizes |
logical, use normalized library sizes? |
lib.size |
library size, defaults to |
log |
logical, if |
prior.count |
average count to be added to each observation to avoid taking log of zero. Used only if |
gene.length |
vector of length |
group |
factor giving group membership for columns of |
dispersion |
numeric vector of negative binomial dispersions. |
offset |
numeric matrix of same size as |
weights |
numeric vector or matrix of non-negative quantitative weights.
Can be a vector of length equal to the number of libraries, or a matrix of the same size as |
... |
other arguments are not used. |
CPM or RPKM values are useful descriptive measures for the expression level of a gene.
By default, normalized library sizes are used in the computation for DGEList objects, whereas simple column sums are used for matrices.
If log-values are computed, then a small count, given by prior.count but scaled to be proportional to the library size, is added to y to avoid taking the log of zero.
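The arithmetic described above can be sketched in base R without calling edgeR. The unlogged part follows directly from the definition of counts per million. The log part is an assumption about how the prior count is scaled: each library's prior is taken as prior.count * lib.size / mean(lib.size), and twice that amount is added to the library size. Names such as cpm_manual and scaled_prior are illustrative only, and the result may differ in detail from what cpm() returns.

```r
# Toy count matrix: 3 genes (rows) x 2 libraries (columns).
counts <- matrix(c(10,  0,  90,
                   40, 20, 140), nrow = 3,
                 dimnames = list(paste0("gene", 1:3), c("libA", "libB")))

# For a plain matrix, library sizes default to the column sums.
lib.size <- colSums(counts)

# Unlogged CPM: divide each column by its library size, scale to one million.
cpm_manual <- t(t(counts) / lib.size) * 1e6

# Log-CPM sketch. ASSUMPTION: the prior count is scaled in proportion to the
# library size, and the library size is increased by twice the scaled prior.
prior.count  <- 0.25
scaled_prior <- prior.count * lib.size / mean(lib.size)
adj_lib      <- lib.size + 2 * scaled_prior
logcpm_manual <- log2(t((t(counts) + scaled_prior) / adj_lib) * 1e6)

print(cpm_manual)
print(logcpm_manual)
```

Because the prior is added before taking logs, the zero count for gene2 in libA yields a finite log2 value rather than -Inf.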
The rpkm method for DGEList objects will try to find the gene lengths in a column of y$genes called Length or length.
Failing that, it will look for any column name containing "length" in any capitalization.
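The lookup order described above, followed by the RPKM scaling, might look roughly like the base R sketch below. The helper find_length_column is hypothetical and is not part of edgeR; taking the first match when several columns qualify is also an assumption. RPKM is computed here as CPM divided by gene length in kilobases.

```r
# Hypothetical helper: mimic the described search for a gene-length column.
find_length_column <- function(genes) {
  nm <- colnames(genes)
  # First preference: a column called exactly "Length" or "length".
  exact <- match(c("Length", "length"), nm)
  exact <- exact[!is.na(exact)]
  if (length(exact) > 0) return(nm[exact[1]])
  # Fallback: any column name containing "length", ignoring case.
  fuzzy <- grep("length", nm, ignore.case = TRUE)
  if (length(fuzzy) > 0) return(nm[fuzzy[1]])
  NULL
}

genes <- data.frame(Symbol = c("a", "b", "c"),
                    GeneLength = c(1000, 2000, 500))
len_col <- find_length_column(genes)

# CPM from simple column sums, then divide each gene by its length in kb.
counts <- matrix(c(10, 0, 90, 40, 20, 140), nrow = 3)
cpm_manual  <- t(t(counts) / colSums(counts)) * 1e6
rpkm_manual <- cpm_manual / (genes[[len_col]] / 1000)

print(len_col)
print(rpkm_manual)
```

Dividing the matrix by a vector with one entry per gene recycles down the rows, so every library in a given row is scaled by the same gene length.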
cpmByGroup and rpkmByGroup compute group average values on the unlogged scale.
A numeric matrix of CPM or RPKM values.
cpm and rpkm produce matrices of the same size as y.
cpmByGroup and rpkmByGroup produce matrices with a column for each level of group.
If log = TRUE, then the values are on the log2 scale.
aveLogCPM(y), rowMeans(cpm(y,log=TRUE)) and log2(rowMeans(cpm(y))) all give slightly different results.
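One reason the last two quantities differ is that the mean of logs is not the log of the mean. The base R sketch below illustrates this on a toy matrix. It deliberately omits the prior count and does not reproduce aveLogCPM, so the numbers only show the direction of the discrepancy, not the values edgeR would return.

```r
# Toy counts without zeros, so log2 is finite even with no prior count.
counts <- matrix(c(10,  5,  90,
                   40, 20, 140,
                   15, 60, 225), nrow = 3)

# Simple CPM from column sums (no normalization factors, no prior count).
cpm_manual <- t(t(counts) / colSums(counts)) * 1e6

mean_of_logs <- rowMeans(log2(cpm_manual))   # average on the log2 scale
log_of_mean  <- log2(rowMeans(cpm_manual))   # log2 of the unlogged average

print(cbind(mean_of_logs, log_of_mean))
```

Since log2 is concave, the mean of logs can never exceed the log of the mean, and the gap grows as a gene's CPM values vary more across libraries.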
Davis McCarthy, Gordon Smyth
y <- matrix(rnbinom(20,size=1,mu=10),5,4)
cpm(y)
d <- DGEList(counts=y, lib.size=1001:1004)
cpm(d)
cpm(d,log=TRUE)
d$genes <- data.frame(Length=c(1000,2000,500,1500,3000))
rpkm(d)