| profileAccuracyEstimate {ChIPanalyser} | R Documentation |
profileAccuracyEstimate will compare the predicted ChIP-seq-like
profile to real ChIP-seq data and return a set of metrics describing how
accurate the predicted model is compared to real data.
profileAccuracyEstimate(LocusProfile, predictedProfile,
occupancyProfileParameters = NULL)
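LocusProfile | Real ChIP-seq profile(s) for the loci of interest, normalised to base-pair level (see the details below). |
predictedProfile | Predicted ChIP-seq-like profile(s), such as the output of computeChipProfile. |
occupancyProfileParameters | An occupancyProfileParameters object. If NULL (the default), one is created internally. |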
The accuracy of the predicted profile is estimated by measuring
correlation, Mean Squared Error (MSE) and theta (an in-house metric based
on a modified ratio of correlation over MSE) between the predicted
profiles and real ChIP-seq data. Actual ChIP-seq profiles should be
normalised to base-pair level: the enrichment score of each range is
divided by the width of that range, so that the end result is a numeric
vector whose length equals the length of the locus in base pairs.
If an occupancyProfileParameters object is not supplied, one will be
created internally. However, we strongly advise using the same
occupancyProfileParameters object that was used in the previous steps.
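As a rough illustration of the quantities described above, the sketch below normalises range-level enrichment scores to base-pair resolution and then computes the Pearson correlation and MSE between a predicted and an observed profile. All object names (rangeWidths, rangeEnrichment, observed, predicted) and all values are made up for illustration. This is not the package's internal code, and theta is left out because its exact definition is not given here.

```r
# Hypothetical ChIP-seq ranges: widths in base pairs and one enrichment score per range
rangeWidths     <- c(4, 2, 4)
rangeEnrichment <- c(8, 6, 2)

# Base-pair level normalisation: enrichment divided by range width,
# repeated once for every base pair covered by the range
observed <- rep(rangeEnrichment / rangeWidths, times = rangeWidths)

# Hypothetical predicted profile with one value per base pair of the locus
predicted <- c(1.8, 2.1, 2.0, 1.9, 3.2, 2.9, 0.4, 0.6, 0.5, 0.5)

# Both profiles must cover the same number of base pairs
stopifnot(length(observed) == length(predicted))

# Pearson correlation and mean squared error between the two profiles
profileCorrelation <- cor(predicted, observed)
profileMSE         <- mean((predicted - observed)^2)
```

A high correlation together with a low MSE indicates that the predicted profile follows both the shape and the magnitude of the observed signal.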
Returns a list of lists. Each element of the outer list represents a
combination of lambda (see ScalingFactorPWM) and bound molecules
(see boundMolecules), and the list within each element is
the list of loci of interest. At the core of these lists is a
named vector containing the correlation and MSE for the given locus, as
well as meanCorr, meanMSE and meanTheta across all loci for a given
combination of lambda and bound molecules.
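The nesting described above can be navigated with ordinary list tools. The sketch below builds a small mock object with the same general shape and extracts one metric per parameter combination. The element names of the outer list, the locus name "eve" and every numeric value are hypothetical. The names actually produced by profileAccuracyEstimate may differ, so inspect the real object with str() before adapting this.

```r
# Mock object mimicking the described nesting; all names and values are hypothetical
mockAccuracy <- list(
    "lambda = 1 & boundMolecules = 1000" = list(
        eve = c(correlation = 0.62, MSE = 0.015,
                meanCorr = 0.62, meanMSE = 0.015, meanTheta = 0.80)
    ),
    "lambda = 2 & boundMolecules = 1000" = list(
        eve = c(correlation = 0.71, MSE = 0.011,
                meanCorr = 0.71, meanMSE = 0.011, meanTheta = 0.95)
    )
)

# Extract meanCorr for the chosen locus from every parameter combination
meanCorrByCombination <- sapply(mockAccuracy, function(combination) {
    combination[["eve"]][["meanCorr"]]
})

# Name of the combination with the highest mean correlation
bestCombination <- names(which.max(meanCorrByCombination))
```

The same pattern works for meanMSE (with which.min) or meanTheta when selecting a combination of lambda and bound molecules.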
Patrick C. N. Martin <pm16057@essex.ac.uk>
Zabet NR, Adryan B (2015) Estimating binding properties of transcription factors from genome-wide binding profiles. Nucleic Acids Res., 43, 84–94.
#Data extraction
data(ChIPanalyserData)
# path to Position Frequency Matrix
PFM <- file.path(system.file("extdata",package="ChIPanalyser"),"BCDSlx.pfm")
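# The Drosophila melanogaster genome (dm3) serves as the example genome
if (!requireNamespace("BSgenome.Dmelanogaster.UCSC.dm3", quietly = TRUE)) {
    # Install the genome package through BiocManager if it is missing
    if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
    BiocManager::install("BSgenome.Dmelanogaster.UCSC.dm3")
}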
library(BSgenome.Dmelanogaster.UCSC.dm3)
DNASequenceSet <- getSeq(BSgenome.Dmelanogaster.UCSC.dm3)
# Building data objects
GPP <- genomicProfileParameters(PFM = PFM, BPFrequency = DNASequenceSet)
OPP <- occupancyProfileParameters()
# Computing genome-wide PWM scores
GenomeWide <- computeGenomeWidePWMScore(DNASequenceSet = DNASequenceSet,
    genomicProfileParameters = GPP)
# Computing PWM scores at the locus of interest
PWMScores <- computePWMScore(DNASequenceSet = DNASequenceSet,
    genomicProfileParameters = GenomeWide,
    setSequence = eveLocus, DNAAccessibility = Access)
# Computing occupancy
Occupancy <- computeOccupancy(AllSitesPWMScore = PWMScores,
    occupancyProfileParameters = OPP)
# Computing ChIP-seq-like profiles
chipProfile <- computeChipProfile(setSequence = eveLocus,
    occupancy = Occupancy,
    occupancyProfileParameters = OPP)
# Estimating the accuracy of the predicted profile
AccuracyEstimate <- profileAccuracyEstimate(LocusProfile = eveLocusChip,
    predictedProfile = chipProfile,
    occupancyProfileParameters = OPP)