| mapToGenome-methods {Pbase} | R Documentation |
Map range coordinates between peptide features along proteins and genome (reference) space.
## S4 method for signature 'Proteins,GRangesList'
mapToGenome(x, genome, pcol, drop.empty.ranges = TRUE, ...)
## S4 method for signature 'Proteins,GRangesList'
pmapToGenome(x, genome, pcol, drop.empty.ranges = TRUE, ...)
## S4 method for signature 'Proteins,EnsDb'
mapToGenome(x, genome, pcol, id = "name", idType = "protein_id", drop.empty.ranges = TRUE, ...)
x |
|
genome |
A |
pcol |
character(1) specifying the name of the column in
|
drop.empty.ranges |
|
id |
character(1) indicating which metadata columns in |
idType |
character(1) specifying the type of the IDs found in
|
... |
Additional parameters passed to inner functions. Currently ignored. |
mapToGenome maps the pranges(x) to the ranges of
genome. Unless x and genome are of length 1,
both must be named and items of x are matched to items of
genome using their respective names. Names that do not
co-occur in x and genome are ignored. For example, given
seqnames(x): "A", "B" and "C"
and
names(genome): "C", "A", "a",
"z", "A" and "A",
the names of the output will be
"A", "A", "A" and "C".
The output is ordered by (1) seqnames(x) and (2) the order of
the elements in genome.
If fewer than length(x) elements are mapped, as for x["B"]
above, a message informs the user.
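The name-matching rule described above can be sketched in base R. This is only an illustration of the documented behaviour using plain character vectors, not the code used by mapToGenome; the variable names x_names, genome_names, idx, out_names and unmapped are hypothetical.

```r
## Names taken from the example in the text above.
x_names <- c("A", "B", "C")
genome_names <- c("C", "A", "a", "z", "A", "A")

## For each name of x (in order), collect the positions of the matching
## elements of genome, keeping their original order within genome.
idx <- unlist(lapply(x_names, function(nm) which(genome_names == nm)))
out_names <- genome_names[idx]

## Names of x without any match in genome are reported, not mapped.
unmapped <- setdiff(x_names, genome_names)
if (length(unmapped) > 0)
    message("Not mapped: ", paste(unmapped, collapse = ", "))

print(out_names)
```

Matching is case-sensitive in this sketch, so "a" is not treated as a match for "A", in line with the example in the text.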
mapToGenome,Proteins,EnsDb maps each of the
pranges(x) ranges within the protein sequence to the
corresponding genomic coordinates using annotations provided by the
EnsDb object. To enable the mapping the
Proteins object has to provide IDs that can be used to
identify the encoding transcript. Such IDs can be the Ensembl
protein ID, the Uniprot ID or the Ensembl transcript ID. If a
protein is annotated to multiple transcripts, the function selects
the transcript whose CDS length best matches the protein sequence
length.
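As a rough illustration of the transcript selection just described, the following base R sketch picks, among several candidate transcripts, the one whose CDS length is closest to the length expected from the protein sequence. The transcript IDs and lengths are made up, and the exact criterion used by the method (for example, whether the stop codon is counted) is an assumption here.

```r
## Hypothetical candidate transcripts for one protein (made-up values):
## CDS lengths in nucleotides, named by transcript ID.
cds_len <- c(TX1 = 906, TX2 = 1203, TX3 = 1350)

## Length of the protein sequence in amino acids (made-up value).
prot_len <- 400

## Assumption: expected CDS length is 3 nucleotides per residue,
## ignoring the stop codon.
expected <- 3 * prot_len

## Select the transcript with the smallest absolute difference.
best <- names(cds_len)[which.min(abs(cds_len - expected))]
print(best)
```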
The mapToGenome,Proteins,EnsDb method maps pranges of
all proteins in the Proteins object to the genome. See
examples below for more details.
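The coordinate arithmetic behind mapping a range within a protein to the genome can be sketched as follows. This is a simplified base R illustration, not the implementation used by Pbase: the helper pep_to_genome is hypothetical, it assumes a plus-strand transcript, and it takes the genomic start and end positions of the CDS blocks as plain vectors.

```r
## Hypothetical helper (not part of Pbase): map a peptide range, given in
## amino acid positions, to genomic ranges on a plus-strand transcript.
## cds_start and cds_end are the genomic coordinates of the CDS blocks,
## ordered 5' to 3'.
pep_to_genome <- function(pep_start, pep_end, cds_start, cds_end) {
    stopifnot(pep_start <= pep_end, length(cds_start) == length(cds_end))
    ## Amino acid positions -> 1-based nucleotide positions in the spliced CDS.
    nt_start <- 3 * (pep_start - 1) + 1
    nt_end <- 3 * pep_end
    ## Position of each CDS block within the spliced CDS.
    widths <- cds_end - cds_start + 1
    blk_end <- cumsum(widths)
    blk_start <- blk_end - widths + 1
    stopifnot(nt_end <= sum(widths))
    ## CDS blocks overlapped by the peptide.
    hit <- which(blk_end >= nt_start & blk_start <= nt_end)
    data.frame(
        start = cds_start[hit] + pmax(nt_start, blk_start[hit]) - blk_start[hit],
        end = cds_start[hit] + pmin(nt_end, blk_end[hit]) - blk_start[hit]
    )
}

## Residues 3 to 5 of a protein encoded by two CDS blocks (made-up values).
res <- pep_to_genome(3, 5, cds_start = c(101, 201), cds_end = c(109, 212))
print(res)
```

A peptide spanning an exon junction yields one genomic range per overlapped block, which is why a list of ranges per protein is a natural return type.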
pmapToGenome is the element-wise (aka 'parallel')
version of mapToGenome. The i-th pranges(x) is mapped
to the i-th range in genome. x and genome must
have the same length and do not need to be named (names are
ignored).
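The difference between name-based and element-wise pairing can be illustrated with plain vectors. This base R sketch only shows which element of genome each element of x would be paired with; the variable names are hypothetical and no actual coordinate mapping is performed.

```r
## Made-up names for three proteins and three genomic range sets.
x_names <- c("A", "B", "C")
genome_names <- c("C", "A", "B")

## Element-wise ('parallel') pairing: both inputs must have the same
## length; the i-th element of x is paired with the i-th element of
## genome and names are ignored.
stopifnot(length(x_names) == length(genome_names))
parallel_idx <- seq_along(x_names)

## Name-based pairing, for comparison: each element of x is paired with
## the element of genome that carries the same name.
named_idx <- match(x_names, genome_names)

print(data.frame(x = x_names, parallel = parallel_idx, named = named_idx))
```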
A named GRangesList object, with names matching
names(genome). For pmapToGenome, the return value will
have the same length as the inputs.
Laurent Gatto, Johannes Rainer
See ?mapToAlignments in the
GenomicAlignments package for mapping coordinates between
reads (local) and genome (reference) space using a CIGAR
alignment.
See ?mapToTranscripts in the
GenomicFeatures package for mapping coordinates between features
in the transcriptome and genome space.
See the proteinCoding function to remove non-protein-coding
ranges before mapping peptides to their genomic coordinates.
The mapping vignette for examples and visualisations.
See plotAsGeneRegionTrack and
plotAsAnnotationTrack for more details about the two
plotting functions.
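## load the example Proteins object
data(p)
## build a GRangesList from the Ensembl transcript IDs in acols(p)$ENST
grl <- etrid2grl(acols(p)$ENST)
## keep only the protein-coding ranges
pcgrl <- proteinCoding(grl)
plotAsGeneRegionTrack(grl[[1]], pcgrl[[1]])
## map the pranges of the 4th protein to its coding ranges
mp <- mapToGenome(p[4], pcgrl[4])
plotAsAnnotationTrack(mp[[1]], pcgrl[[4]])
## element-wise mapping of all proteins
pmapToGenome(p, pcgrl)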
#######
## mapToGenome,Proteins,EnsDb
## load an EnsDb object providing the required annotations
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86
## Map the pranges of all proteins in p to the genome providing the proteins'
## Uniprot IDs (being the 'names' of the Proteins object) for the mapping.
mp <- mapToGenome(p, edb, id = "name", idType = "uniprot_id")