| setup_CPsSearch {InPAS} | R Documentation |
prepare data for predicting cleavage and polyadenylation (CP) sites
setup_CPsSearch(
sqlite_db,
genome,
utr3,
background = c("same_as_long_coverage_threshold", "1K", "5K", "10K", "50K"),
TxDb = NA,
removeScaffolds = FALSE,
hugeData = TRUE,
outdir,
BPPARAM = NULL,
silence = FALSE
)
sqlite_db |
A path to the SQLite database for InPAS, i.e. the output of
|
genome |
An object of BSgenome::BSgenome |
utr3 |
An object of GenomicRanges::GRangesList, output of
|
background |
A character(1) vector, the range for calculating cutoff threshold of local background. It can be "same_as_long_coverage_threshold", "1K", "5K","10K", or "50K". |
TxDb |
an object of GenomicFeatures::TxDb |
removeScaffolds |
A logical(1) vector, whether the scaffolds should be removed from the genome If you use a TxDb containing alternative scaffolds, you'd better to remove the scaffolds. |
hugeData |
A logical(1) vector, indicating whether it is huge data |
outdir |
A character(1) vector, a path with write permission for storing the coverage data. If it doesn't exist, it will be created. |
BPPARAM |
an optional BiocParallel::BiocParallelParam instance determining the parallel back-end to be used during evaluation, or a list of BiocParallelParam instances, to be applied in sequence for nested calls to bplapply. It can be set to NULL or bpparam() |
silence |
report progress or not. By default it doesn't report progress. |
A list as described below:
chromosome-wise 3' UTR coverage in summarized View format
A filename for chr1 3' UTR coverage in summarized View format
A filename for chr2 3' UTR coverage in summarized View format
A filename for chrN 3' UTR coverage in summarized View format
The type of methods for bckground coverage calculation
Z-score cutoff thresholds for each 3' UTRs
A named vector containing depth weight
Jianhong Ou, Haibo Liu
@examples if (interactive()) library(BSgenome.Mmusculus.UCSC.mm10) library("TxDb.Mmusculus.UCSC.mm10.knownGene")
genome <- BSgenome.Mmusculus.UCSC.mm10
TxDb <- TxDb.Mmusculus.UCSC.mm10.knownGene
## load UTR3 annotation and convert it into a GRangesList
data(utr3.mm10)
utr3 <- split(utr3.mm10, seqnames(utr3.mm10))
bedgraphs <- system.file("extdata",c("Baf3.extract.bedgraph",
"UM15.extract.bedgraph"),
package = "InPAS")
tags <- c("Baf3", "UM15")
metadata <- data.frame(tag = tags,
condition = c("Baf3", "UM15"),
bedgraph_file = bedgraphs)
outdir = tempdir()
write.table(metadata, file =file.path(outdir, "metadata.txt"),
sep = "\t", quote = FALSE, row.names = FALSE)
sqlite_db <- setup_sqlitedb(metadata = file.path(outdir,
"metadata.txt"),
outdir)
coverage <- list()
for (i in seq_along(bedgraphs)){
coverage[[tags[i]]] <- get_ssRleCov(bedgraph = bedgraphs[i],
tag = tags[i],
genome = genome,
sqlite_db = sqlite_db,
outdir = outdir,
removeScaffolds = TRUE)
}
coverage_files <- assemble_allCov(sqlite_db,
outdir,
genome,
removeScaffolds = TRUE)
data4CPsitesSearch <- setup_CPsSearch(sqlite_db,
genome,
utr3,
background = "10K",
TxDb = TxDb,
removeScaffolds = TRUE,
hugeData = TRUE,
outdir = outdir)