| PairSummaries {SynExtend} | R Documentation |
Takes in a LinkedPairs object and gene calls, and returns a pairs list.
PairSummaries(SyntenyLinks,
GeneCalls,
DBPATH,
PIDs = TRUE,
IgnoreDefaultStringSet = FALSE,
Verbose = TRUE,
GapPenalty = TRUE,
TerminalPenalty = TRUE,
Model = "Global",
Correction = "none")
SyntenyLinks |
A |
GeneCalls |
A named list of objects of class “DFrame” built from |
DBPATH |
A SQLite connection object or a character string specifying the path to the database file. Constructed from DECIPHER's |
PIDs |
Logical indicating whether to perform pairwise alignments. If |
IgnoreDefaultStringSet |
Logical indicating alignment type preferences. If |
Verbose |
Logical indicating whether or not to display a progress bar and print the time difference upon completion. |
GapPenalty |
Argument passed to |
TerminalPenalty |
Argument passed to |
Model |
A character string specifying a model to use to identify pairs that are unlikely to be good orthologs. By default this is "Global", but two other models are included, "Local" and "Exact", which have minor differences in performance. Alternatively, a user-generated model can be used. |
Correction |
Argument to be passed to |
The LinkedPairs object generated by NucleotideOverlap is a container for raw data that describes possible orthologous relationships; however, the ultimate assignment of orthology is left to the user's discretion. PairSummaries generates a clear table of relevant statistics for the user to work with as they choose. Aligning all pairs, though onerous, allows users to apply a hard threshold to predictions by PID, while the built-in models allow more succinct and expedient thresholding.
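As a minimal sketch of the hard-threshold idea, the snippet below filters a toy data.frame by percent identity. It does not call PairSummaries: the row names, the column name `PID`, and the cutoff `min_pid` are assumptions made for illustration, so check the columns of an actual result before adapting it.

```r
# Toy stand-in for a PairSummaries result. The row names and the `PID`
# column are assumed for illustration; they are not taken from the package.
pairs <- data.frame(
  PID = c(0.92, 0.41, 0.78, 0.15),
  row.names = c("pairA", "pairB", "pairC", "pairD")
)

# Hypothetical cutoff; pick a value suited to the genomes being compared.
min_pid <- 0.5

# Keep only the candidate pairs whose percent identity meets the cutoff.
# drop = FALSE keeps the result a data.frame even with a single column.
kept <- pairs[pairs$PID >= min_pid, , drop = FALSE]
print(kept)
```

The same subsetting pattern should apply to any numeric column of the returned table, provided the column actually exists in the result.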
A data.frame with rownames indicating orthologous pairs.
Nicholas Cooley npc19@pitt.edu
DBPATH <- system.file("extdata",
"VignetteSeqs.sqlite",
package = "SynExtend")
# Alternatively, to build a database using DECIPHER:
# DBPATH <- tempfile()
# FNAs <- c("ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/006/740/685/GCA_006740685.1_ASM674068v1/GCA_006740685.1_ASM674068v1_genomic.fna.gz",
# "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/956/175/GCA_000956175.1_ASM95617v1/GCA_000956175.1_ASM95617v1_genomic.fna.gz",
# "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/875/775/GCA_000875775.1_ASM87577v1/GCA_000875775.1_ASM87577v1_genomic.fna.gz")
# for (m1 in seq_along(FNAs)) {
# X <- readDNAStringSet(filepath = FNAs[m1])
# X <- X[order(width(X),
# decreasing = TRUE)]
#
# Seqs2DB(seqs = X,
# type = "XStringSet",
# dbFile = DBPATH,
# identifier = as.character(m1),
# verbose = TRUE)
#}
Syn <- FindSynteny(dbFile = DBPATH)
GeneCalls <- vector(mode = "list",
length = ncol(Syn))
GeneCalls[[1L]] <- gffToDataFrame(GFF = system.file("extdata",
"GCA_006740685.1_ASM674068v1_genomic.gff.gz",
package = "SynExtend"),
Verbose = TRUE)
GeneCalls[[2L]] <- gffToDataFrame(GFF = system.file("extdata",
"GCA_000956175.1_ASM95617v1_genomic.gff.gz",
package = "SynExtend"),
Verbose = TRUE)
GeneCalls[[3L]] <- gffToDataFrame(GFF = system.file("extdata",
"GCA_000875775.1_ASM87577v1_genomic.gff.gz",
package = "SynExtend"),
Verbose = TRUE)
# Alternatively:
# GeneCalls <- vector(mode = "list",
# length = ncol(Syn))
# GeneCalls[[1L]] <- rtracklayer::import(system.file("extdata",
# "GCA_006740685.1_ASM674068v1_genomic.gff.gz",
# package = "SynExtend"))
# GeneCalls[[2L]] <- rtracklayer::import(system.file("extdata",
# "GCA_000956175.1_ASM95617v1_genomic.gff.gz",
# package = "SynExtend"))
# GeneCalls[[3L]] <- rtracklayer::import(system.file("extdata",
# "GCA_000875775.1_ASM87577v1_genomic.gff.gz",
# package = "SynExtend"))
names(GeneCalls) <- seq_along(GeneCalls)
Links <- NucleotideOverlap(SyntenyObject = Syn,
GeneCalls = GeneCalls,
LimitIndex = FALSE,
Verbose = TRUE)
PredictedPairs <- PairSummaries(SyntenyLinks = Links,
GeneCalls = GeneCalls,
DBPATH = DBPATH,
PIDs = FALSE,
Verbose = TRUE,
Model = "Global",
Correction = "none")