| svaseq {sva} | R Documentation |
This function implements the iteratively re-weighted least squares
approach for estimating surrogate variables. As a by-product, it
produces estimates of the probability that each gene is an empirical control. Before
calculating the surrogate variables, the function applies a moderated log transform as
described in Leek 2014. See the function empirical.controls for a direct estimate of the empirical controls.
svaseq(
  dat,
  mod,
  mod0 = NULL,
  n.sv = NULL,
  controls = NULL,
  method = c("irw", "two-step", "supervised"),
  vfilter = NULL,
  B = 5,
  numSVmethod = "be",
  constant = 1
)
dat |
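The data matrix (for example, read counts) with the variables in rows and samples in columns. The log transform is applied inside the function (see constant), so dat should not already be log-transformed |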
mod |
The model matrix being used to fit the data |
mod0 |
The null model being compared when fitting the data |
n.sv |
The number of surrogate variables to estimate |
controls |
A vector of probabilities (between 0 and 1, inclusive) that each gene is a control. A value of 1 means the gene is certainly a control and a value of 0 means the gene is certainly not a control. |
method |
For empirical estimation of control probes, use "irw". If control probes are known, use "supervised". |
vfilter |
You may choose to filter to the vfilter most variable rows before performing the analysis. vfilter must be NULL if method is "supervised" |
B |
The number of iterations of the irwsva algorithm to perform |
numSVmethod |
The method used to estimate the number of surrogate variables. If n.sv is NULL, sva will attempt to estimate the number of needed surrogate variables. This setting should not be changed by the user unless they are an expert. |
constant |
The function takes log(dat + constant) before performing sva. By default constant = 1. All values of dat + constant must be positive. |
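As a rough illustration of the constant and vfilter arguments, the base R sketch below applies log(dat + constant) to a small made-up count matrix and then keeps the most variable rows. The matrix `toy`, the helper name `prep_counts`, and the order of the two steps are assumptions made for illustration. This is not the code that svaseq runs internally.

```r
# Hypothetical helper (not part of sva): log-transform counts, then
# optionally keep the `vfilter` most variable rows.
prep_counts <- function(dat, constant = 1, vfilter = NULL) {
  # log() needs strictly positive input, so dat + constant must be > 0
  stopifnot(all(dat + constant > 0))
  tdat <- log(dat + constant)
  if (!is.null(vfilter)) {
    row_var <- apply(tdat, 1, var)
    keep <- order(row_var, decreasing = TRUE)[seq_len(vfilter)]
    # sort() keeps the retained rows in their original order
    tdat <- tdat[sort(keep), , drop = FALSE]
  }
  tdat
}

# Made-up counts: 4 genes (rows) by 4 samples (columns)
toy <- matrix(c(  0, 10,  0, 12,
                  5,  5,  5,  5,
                  1,  2,  3,  4,
                100,  0, 90,  1),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("g", 1:4), paste0("s", 1:4)))

prepped <- prep_counts(toy, constant = 1, vfilter = 2)
```

With constant = 1, zero counts map to log(1) = 0 rather than to -Inf, which is why the positivity condition on dat + constant matters.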
sv The estimated surrogate variables, one in each column
pprob.gam A vector of the posterior probabilities that each gene is affected by heterogeneity
pprob.b A vector of the posterior probabilities that each gene is affected by mod
n.sv The number of significant surrogate variables
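One common way to use the returned sv matrix is to append it to mod as extra covariates in a downstream model. The sketch below shows only that bookkeeping step in base R, with a random placeholder standing in for a real sv. The names `sv_fake` and `mod_sv` and all simulated values are assumptions, and no sva function is called.

```r
set.seed(1)
group <- factor(rep(c("Ctl", "Trt"), each = 3))
mod <- model.matrix(~group)            # 6 samples x 2 columns

# Placeholder standing in for svaseq(...)$sv: one row per sample,
# one column per surrogate variable. Real values would come from svaseq.
sv_fake <- matrix(rnorm(6), ncol = 1, dimnames = list(NULL, "sv1"))

# Append the surrogate variable to the model matrix as an extra covariate
mod_sv <- cbind(mod, sv_fake)

# Fit one made-up gene against the adjusted design
y <- rnorm(6)
fit <- lm.fit(mod_sv, y)
```

The adjusted design has one more column than mod for each surrogate variable, so the fit returns one additional coefficient per column of sv.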
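library(sva)
library(zebrafishRNASeq)
data(zfGenes)
# Keep genes with more than 5 reads in at least 2 samples
filter = apply(zfGenes, 1, function(x) length(x[x>5])>=2)
filtered = zfGenes[filter,]
# Endogenous genes (Ensembl IDs) and ERCC spike-in controls
genes = rownames(filtered)[grep("^ENS", rownames(filtered))]
controls = grepl("^ERCC", rownames(filtered))
group = as.factor(rep(c("Ctl", "Trt"), each=3))
dat0 = as.matrix(filtered)
# Full model with the group effect; null model with the intercept only
mod1 = model.matrix(~group)
mod0 = cbind(mod1[,1])
# Estimate one surrogate variable and plot it across samples
svseq = svaseq(dat0,mod1,mod0,n.sv=1)$sv
plot(svseq,pch=19,col="blue")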
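The controls argument is described as a vector of probabilities, while the example builds a logical vector with grepl. The base R sketch below, using made-up row names, shows how such a logical flag corresponds to probabilities of 0 and 1. The identifiers and the variable names `ids`, `is_control`, and `control_prob` are invented for illustration.

```r
# Made-up identifiers: two Ensembl-style genes and two ERCC-style spike-ins
ids <- c("ENSDARG00000000001", "ERCC-00002",
         "ENSDARG00000000018", "ERCC-00004")

# TRUE for rows whose name starts with "ERCC"
is_control <- grepl("^ERCC", ids)

# As probabilities: 1 = certainly a control, 0 = certainly not a control
control_prob <- as.numeric(is_control)
```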