| edgeRselection {ClassifyR} | R Documentation |
Performs a differential expression analysis between classes and chooses the features which have the best resubstitution performance. The data may be overdispersed, and this is modelled.
## S4 method for signature 'matrix'
edgeRselection(counts, classes, ...)
## S4 method for signature 'DataFrame'
edgeRselection(counts, classes, datasetName,
normFactorsOptions = NULL, dispOptions = NULL, fitOptions = NULL,
trainParams, predictParams, resubstituteParams,
selectionName = "edgeR LRT", verbose = 3)
## S4 method for signature 'MultiAssayExperiment'
edgeRselection(counts, targets = NULL, ...)
counts |
Either a |
classes |
A vector of class labels of class |
targets |
If |
... |
Variables not used by the |
datasetName |
A name for the data set used. Stored in the result. |
normFactorsOptions |
A named |
dispOptions |
A named |
fitOptions |
A named |
trainParams |
A container of class |
predictParams |
A container of class |
resubstituteParams |
An object of class |
selectionName |
A name to identify this selection method by. Stored in the result. |
verbose |
Default: 3. A number between 0 and 3 for the amount of progress messages to give. This function only prints progress messages if the value is 3. |
The differential expression analysis follows the standard edgeR
steps of estimating library size normalisation factors, calculating dispersion,
in this case robustly, and then fitting a generalised linear model followed by
a likelihood ratio test.
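The steps named above can be sketched directly with edgeR. This is a minimal illustration on simulated counts, assuming edgeR is installed; the data, the object names and the choice of coef = 2 are assumptions made for the example, and the calls made inside edgeRselection may differ in detail.

```r
library(edgeR)

set.seed(1)
nGenes <- 200
nSamples <- 12
classes <- factor(rep(c("Control", "Treated"), each = nSamples / 2))

# Simulated negative binomial counts (illustrative data only), features in rows.
counts <- matrix(rnbinom(nGenes * nSamples, mu = 100, size = 5),
                 nrow = nGenes,
                 dimnames = list(paste0("Gene", seq_len(nGenes)),
                                 paste0("Sample", seq_len(nSamples))))
# Raise the counts of the first 10 genes in the second class.
counts[1:10, classes == "Treated"] <- counts[1:10, classes == "Treated"] * 4L

dge <- DGEList(counts = counts, group = classes)
dge <- calcNormFactors(dge)                      # Library size normalisation factors.
design <- model.matrix(~ classes)
dge <- estimateDisp(dge, design, robust = TRUE)  # Dispersion, estimated robustly.
fit <- glmFit(dge, design)                       # Negative binomial generalised linear model.
lrt <- glmLRT(fit, coef = 2)                     # Likelihood ratio test of the class coefficient.
ranked <- topTags(lrt, n = Inf)$table            # All features, ordered by p-value.
head(rownames(ranked))
```

The row names of ranked give a feature ranking of the kind that a resubstitution step could then truncate to the best-performing number of top features.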
Data tables which consist entirely of non-numeric data cannot be analysed. If counts
is an object of class MultiAssayExperiment, the factor of sample classes must be stored
in the DataFrame accessible by the colData function with column name "class".
An object of class SelectResult, or a list of such objects if the classifier
used to determine the specified performance metric made a number of prediction varieties.
Dario Strbenac
edgeR: a Bioconductor package for differential expression analysis of digital gene expression data, Mark D. Robinson, Davis McCarthy, and Gordon Smyth, 2010, Bioinformatics, Volume 26 Issue 1, https://academic.oup.com/bioinformatics/article/26/1/139/182458.
if(require(parathyroidSE) && require(PoiClaClu))
{
  data(parathyroidGenesSE)
  expression <- assays(parathyroidGenesSE)[[1]]
  sampleNames <- paste("Sample", 1:ncol(parathyroidGenesSE))
  colnames(expression) <- sampleNames
  DPN <- which(colData(parathyroidGenesSE)[, "treatment"] == "DPN")
  control <- which(colData(parathyroidGenesSE)[, "treatment"] == "Control")
  expression <- expression[, c(control, DPN)]
  classes <- factor(rep(c("Control", "DPN"), c(length(control), length(DPN))))
  expression <- expression[rowSums(expression > 1000) > 8, ] # Make small data set.
  getClasses <- function(result) result[["ytehat"]]
  selected <- edgeRselection(expression, classes, "DPN Treatment",
                             trainParams = TrainParams(classifyInterface),
                             predictParams = PredictParams(NULL, getClasses = getClasses),
                             resubstituteParams = ResubstituteParams(nFeatures = seq(10, 100, 10),
                                                  performanceType = "balanced error", better = "lower"))
  head(selected@rankedFeatures[[1]])
  plotFeatureClasses(expression, classes, "ENSG00000044574",
                     dotBinWidth = 500, xAxisLabel = "Unnormalised Counts")
}