| computeOptimal {ChIPanalyser} | R Documentation |
ChIPanalyser contains a set of functions, some of which require two
parameters: ScalingFactorPWM and boundMolecules.
These two parameters are not always known.
computeOptimal estimates these values by maximising the
correlation and minimising the Mean Squared Error between a predicted
ChIP-seq-like profile and a real ChIP-seq profile at the given loci.
computeOptimal(DNASequenceSet, genomicProfileParameters, LocusProfile,
setSequence, DNAAccessibility = NULL,
occupancyProfileParameters = NULL, parameter = "all",
peakMethod="moving_kernel",cores=1)
DNASequenceSet |
|
genomicProfileParameters |
|
LocusProfile |
|
setSequence |
|
DNAAccessibility |
|
occupancyProfileParameters |
|
parameter |
|
peakMethod |
|
cores |
|
When the values of ScalingFactorPWM and boundMolecules
are unknown, computeOptimal can be used to infer them
from the data.
Note that this function requires ChIP-seq data as input.
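The two accuracy measures named above can be illustrated with a minimal base R sketch. This is not the package's internal code: the profiles are made-up numbers, the variable names are hypothetical, and Pearson correlation is assumed.

```r
# Toy profiles with one value per base pair (made-up numbers, not real ChIP-seq data)
set.seed(1)
observed  <- abs(sin(seq(0, 6 * pi, length.out = 500)))
predicted <- observed + rnorm(500, sd = 0.1)

# Pearson correlation between the two profiles: higher is better
profileCorrelation <- cor(predicted, observed, method = "pearson")

# Mean Squared Error between the two profiles: lower is better
profileMSE <- mean((predicted - observed)^2)

print(c(correlation = profileCorrelation, MSE = profileMSE))
```

A parameter search would repeat this comparison for each candidate pair of ScalingFactorPWM and boundMolecules and keep the pair with the best score.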
LocusProfile (the ChIP-seq data) should be a named list of ChIP-seq
scores normalised to single base pair resolution. The names should stay
consistent with all other names and should be the names of the loci of interest.
setSequence should follow the same naming:
each range within the GRanges should
be named (not to be confused with seqnames).
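As a rough illustration of this naming convention, the sketch below builds a named GRanges and a matching named list. The coordinates, the locus name "eve", the object names, and the random scores are placeholders and are not taken from ChIPanalyserData; it also assumes one score per base pair of the locus.

```r
library(GenomicRanges)

# Hypothetical locus; the coordinates are placeholders
exampleLocus <- GRanges(seqnames = "chr2R",
                        ranges = IRanges(start = 5860001, end = 5861000))
names(exampleLocus) <- "eve"   # range name, distinct from seqnames ("chr2R")

# Named list holding one made-up score per base pair of the locus
exampleProfile <- list(eve = runif(width(exampleLocus)))

# The list names and the range names should match
stopifnot(identical(names(exampleProfile), names(exampleLocus)))
stopifnot(length(exampleProfile[["eve"]]) == width(exampleLocus))
```

Running a check like this before calling computeOptimal can catch mismatched or missing names early.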
computeOptimal returns a list containing, in order, the optimal
set of parameters (lambda, i.e. ScalingFactorPWM, and
boundMolecules), the optimal matrix (a matrix containing
accuracy estimates dependent on the parameter chosen), and finally the
chosen parameter. If the parameter that was chosen was "all",
then each element of this list will contain the optimal sets of
parameters and the optimal matrices for
"correlation", "Mean Squared Error" and "theta".
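To make the relation between an optimal matrix and an optimal set of parameters concrete, here is a small base R sketch. The layout (rows for lambda, columns for boundMolecules), the candidate values, and the scores are all assumptions for illustration and may differ from what computeOptimal actually returns.

```r
# Hypothetical grid of candidate values (illustrative only)
lambdaValues <- c(0.5, 1, 2, 5)
boundMoleculeValues <- c(10, 100, 1000)

# Made-up correlation scores: rows = lambda, columns = boundMolecules
correlationMatrix <- matrix(
  c(0.10, 0.25, 0.20,
    0.30, 0.62, 0.41,
    0.28, 0.55, 0.47,
    0.15, 0.33, 0.29),
  nrow = length(lambdaValues), byrow = TRUE,
  dimnames = list(lambdaValues, boundMoleculeValues)
)

# Locate the highest-scoring cell (first one if there are ties)
best <- which(correlationMatrix == max(correlationMatrix), arr.ind = TRUE)[1, ]
optimalSet <- c(lambda = lambdaValues[best["row"]],
                boundMolecules = boundMoleculeValues[best["col"]])
print(optimalSet)
```

For an error measure such as the Mean Squared Error, the same lookup would use min() instead of max().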
Patrick C. N. Martin <pm16057@essex.ac.uk>
Zabet NR, Adryan B (2015) Estimating binding properties of transcription factors from genome-wide binding profiles. Nucleic Acids Res., 43, 84–94.
#Data extraction
data(ChIPanalyserData)
# path to Position Frequency Matrix
PFM <- file.path(system.file("extdata",package="ChIPanalyser"),"BCDSlx.pfm")
#This example uses the Drosophila melanogaster genome (dm3)
if(!requireNamespace("BSgenome.Dmelanogaster.UCSC.dm3", quietly = TRUE)){
    #biocLite is no longer available; install through BiocManager instead
    if(!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
    BiocManager::install("BSgenome.Dmelanogaster.UCSC.dm3")
}
library(BSgenome.Dmelanogaster.UCSC.dm3)
DNASequenceSet <- getSeq(BSgenome.Dmelanogaster.UCSC.dm3)
#Building data objects
GPP <- genomicProfileParameters(PFM=PFM,BPFrequency=DNASequenceSet)
OPP <- occupancyProfileParameters()
#Computing Optimal set of Parameters
optimalParam <- computeOptimal(DNASequenceSet = DNASequenceSet,
    genomicProfileParameters = GPP,
    LocusProfile = eveLocusChip,
    setSequence = eveLocus,
    DNAAccessibility = Access,
    occupancyProfileParameters = OPP,
    parameter = "all",
    peakMethod = "moving_kernel")