| mut_matrix_stranded {MutationalPatterns} | R Documentation |
Make a mutation count matrix with 192 features: 96 trinucleotide contexts times 2 strands. The strands can be either transcription strands or replication strands.
mut_matrix_stranded(vcf_list, ref_genome, ranges, mode = "transcription")
vcf_list |
List of collapsed vcf objects |
ref_genome |
BSgenome reference genome object |
ranges |
GRanges object with the genomic ranges of: 1. (transcription mode) the gene bodies with strand (+/-) information, or 2. (replication mode) the replication strand with 'strand_info' metadata |
mode |
"transcription" or "replication", default = "transcription" |
Mutation count matrix with 192 features (96 trinucleotide contexts x 2 strands)
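The 192-feature layout can be illustrated with a minimal base R sketch. It only enumerates the 96 trinucleotide contexts and crosses them with two strand classes; it does not call the package. The label format, the strand names, the row order, and the sample names below are assumptions made for illustration and need not match the row names that mut_matrix_stranded() returns.

```r
# Six pyrimidine-based substitution types and the four possible flanking bases.
substitutions <- c("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")
bases <- c("A", "C", "G", "T")

# 96 trinucleotide contexts: 6 substitutions x 4 upstream x 4 downstream bases.
grid <- expand.grid(down = bases, up = bases, sub = substitutions,
                    stringsAsFactors = FALSE)
contexts <- paste0(grid$up, "[", grid$sub, "]", grid$down)

# Two strand classes (illustrative names for transcription mode;
# in replication mode these would be the levels of 'strand_info').
strands <- c("transcribed", "untranscribed")

# Cross every context with every strand class: 96 x 2 = 192 features.
features <- paste(rep(contexts, each = length(strands)), strands, sep = "-")

# An empty count matrix of that shape for two hypothetical samples.
toy_mat <- matrix(0L, nrow = length(features), ncol = 2,
                  dimnames = list(features, c("sample1", "sample2")))
dim(toy_mat)
```

Each row of such a matrix corresponds to one context on one strand, and each column to one sample.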
read_vcfs_as_granges,
mut_matrix,
mut_strand
## See the 'read_vcfs_as_granges()' example for how we obtained the
## following data:
vcfs <- readRDS(system.file("states/read_vcfs_as_granges_output.rds",
package="MutationalPatterns"))
## Load the corresponding reference genome.
ref_genome = "BSgenome.Hsapiens.UCSC.hg19"
library(ref_genome, character.only = TRUE)
## Transcription strand analysis:
## You can obtain the known genes from the UCSC hg19 dataset using
## Bioconductor:
# if (!requireNamespace("BiocManager", quietly = TRUE))
#     install.packages("BiocManager")
# BiocManager::install("TxDb.Hsapiens.UCSC.hg19.knownGene")
# library("TxDb.Hsapiens.UCSC.hg19.knownGene")
## For this example, we preloaded the data for you:
genes_hg19 <- readRDS(system.file("states/genes_hg19.rds",
package="MutationalPatterns"))
mut_mat_s = mut_matrix_stranded(vcfs, ref_genome, genes_hg19,
mode = "transcription")
## Replication strand analysis:
## Read example bed file with replication direction annotation
repli_file = system.file("extdata/ReplicationDirectionRegions.bed",
package = "MutationalPatterns")
repli_strand = read.table(repli_file, header = TRUE)
repli_strand_granges = GRanges(seqnames = repli_strand$Chr,
ranges = IRanges(start = repli_strand$Start + 1,
end = repli_strand$Stop),
strand_info = repli_strand$Class)
## Use the UCSC seqlevels style (e.g. "chr1"), matching the reference genome
seqlevelsStyle(repli_strand_granges) = "UCSC"
# The levels determine the order in which the features
# will be counted and plotted in the downstream analyses.
# You can specify your preferred order of the levels:
repli_strand_granges$strand_info = factor(repli_strand_granges$strand_info, levels = c("left", "right"))
mut_mat_s_rep = mut_matrix_stranded(vcfs, ref_genome, repli_strand_granges,
mode = "replication")