| cluster_signatures {MutationalPatterns} | R Documentation |
Hierarchical clustering of signatures based on cosine similarity
cluster_signatures(signatures, method = "complete")
| signatures | Matrix with 96 trinucleotides (rows) and any number of signatures (columns) |
| method | The agglomeration method to be used for hierarchical clustering. This should be one of "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). Default = "complete". |
hclust object
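To illustrate what clustering "based on cosine similarity" can look like, here is a minimal base-R sketch. It assumes the distance between two signatures is taken as 1 minus their cosine similarity; the package's internal implementation may differ. The helper name `cosine_hclust_sketch` and the toy matrix are hypothetical and are not part of MutationalPatterns.

```r
# Hypothetical helper for illustration only; not a MutationalPatterns function.
cosine_hclust_sketch <- function(signatures, method = "complete") {
  # Scale every column (signature) to unit Euclidean length
  norms <- sqrt(colSums(signatures^2))
  unit <- sweep(signatures, 2, norms, "/")
  # Pairwise cosine similarity between all columns
  sim <- crossprod(unit)
  # Assumption: use 1 - cosine similarity as the distance
  d <- as.dist(1 - sim)
  # Agglomerative clustering with the requested linkage method
  hclust(d, method = method)
}

# Toy input: 96 rows, 5 made-up signatures whose columns each sum to 1
set.seed(1)
toy <- matrix(runif(96 * 5), nrow = 96,
              dimnames = list(NULL, paste0("S", 1:5)))
toy <- sweep(toy, 2, colSums(toy), "/")

hc <- cosine_hclust_sketch(toy, method = "average")
```

Because cosine similarity ignores vector length, two signatures with the same shape but different scale end up at distance zero under this assumption.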
## You can download mutational signatures from the COSMIC database:
# sp_url = "http://cancer.sanger.ac.uk/cancergenome/assets/signatures_probabilities.txt"
# cancer_signatures = read.table(sp_url, sep = "\t", header = TRUE)
## We copied the file into our package for your convenience.
filename <- system.file("extdata/signatures_probabilities.txt",
                        package="MutationalPatterns")
cancer_signatures <- read.table(filename, sep = "\t", header = TRUE)
## See the 'mut_matrix()' example for how we obtained the mutation matrix:
mut_mat <- readRDS(system.file("states/mut_mat_data.rds",
                               package="MutationalPatterns"))
## Match the row order of the signatures to the MutationalPatterns standard used in the mutation matrix
order = match(row.names(mut_mat), cancer_signatures$Somatic.Mutation.Type)
## Reorder cancer signatures dataframe
cancer_signatures = cancer_signatures[order,]
## Use trinucleotide change names as row.names
## row.names(cancer_signatures) = cancer_signatures$Somatic.Mutation.Type
## Keep only the 96 contributions of each signature, as a matrix
cancer_signatures = as.matrix(cancer_signatures[,4:33])
## Rename signatures to number only
colnames(cancer_signatures) = as.character(1:30)
## Hierarchically cluster the cancer signatures based on cosine similarity
hclust_cancer_signatures = cluster_signatures(cancer_signatures)
## Plot dendrogram
plot(hclust_cancer_signatures)
## Save the signature names in the order of the clustering
sig_order = colnames(cancer_signatures)[hclust_cancer_signatures$order]
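The stored order is typically used to rearrange signature columns so that similar signatures sit next to each other, and the tree itself can be cut into groups. The sketch below shows both steps on toy data. The matrix `toy` is made up, and the `hclust()` call on Euclidean distances is only a stand-in for the output of `cluster_signatures()`.

```r
# Toy stand-in for a signature matrix: 96 rows, 6 made-up signatures
set.seed(42)
toy <- matrix(runif(96 * 6), nrow = 96,
              dimnames = list(NULL, as.character(1:6)))

# Placeholder clustering (Euclidean distance); in practice this would be
# the hclust object returned by cluster_signatures()
hc <- hclust(dist(t(toy)), method = "complete")

# Signature names in dendrogram order, as in the example above
sig_order <- colnames(toy)[hc$order]

# Reorder the columns so neighbouring columns are neighbouring leaves
toy_ordered <- toy[, sig_order]

# Cut the tree into two groups; returns a named vector of group labels
groups <- cutree(hc, k = 2)
```

`cutree()` returns the group labels in the original column order, not in dendrogram order, so index it by name when combining it with `sig_order`.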