| write10xCounts {DropletUtils} | R Documentation |
Create a directory containing the count matrix and cell/gene annotation from a sparse matrix of UMI counts, in the format produced by the CellRanger software suite.
write10xCounts(path, x, barcodes=colnames(x), gene.id=rownames(x),
gene.symbol=gene.id, gene.type="Gene Expression", overwrite=FALSE,
type=c("auto", "sparse", "HDF5"), genome="unknown", version=c("2", "3"))
x |
A sparse numeric matrix of UMI counts. |
path |
A string containing the path to the output directory. |
barcodes |
A character vector of cell barcodes, one per column of |
gene.id |
A character vector of gene identifiers, one per row of |
gene.symbol |
A character vector of gene symbols, one per row of |
gene.type |
A character vector of gene types, expanded to one per row of |
overwrite |
A logical scalar specifying whether |
type |
String specifying the type of 10X format to save |
genome |
String specifying the genome for storage when |
version |
String specifying the version of the CellRanger format to produce. |
This function will try to automatically detect the desired format based on whether path ends with ".h5".
If so, it assumes that path specifies a HDF5 file path and sets type="HDF5".
Otherwise it will set type="sparse" under the assumption that path specifies a path to a directory.
Note that there were major changes in the output format for CellRanger version 3.0, to account for non-gene features such as antibody or CRISPR tags.
Users can switch to this new format using version="3".
See the documentation for “latest” for this new format, otherwise see “2.2” or earlier.
For type="sparse", a directory is produced at path.
If version="2", this will contain the files "matrix.mtx", "barcodes.tsv" and "genes.tsv".
If version="3", it will instead contain "matrix.mtx.gz", "barcodes.tsv.gz" and "features.tsv.gz".
For type="HDF5", a HDF5 file is produced at path containing data in column-sparse format.
If version="2", data are stored in the HDF5 group named genome.
If version="3", data are stored in the group "matrix".
A TRUE value is invisibly returned.
Aaron Lun
10X Genomics (2017). Gene-Barcode Matrices. https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/2.2/output/matrices
10X Genomics (2018). Feature-Barcode Matrices. https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/output/matrices
10X Genomics (2018). HDF5 Gene-Barcode Matrix Format. https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/2.2/advanced/h5_matrices
10X Genomics (2018). HDF5 Feature Barcode Matrix Format. https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/advanced/h5_matrices
# Mocking up some count data.
library(Matrix)
my.counts <- matrix(rpois(1000, lambda=5), ncol=10, nrow=100)
my.counts <- as(my.counts, "dgCMatrix")
cell.ids <- paste0("BARCODE-", seq_len(ncol(my.counts)))
ngenes <- nrow(my.counts)
gene.ids <- paste0("ENSG0000", seq_len(ngenes))
gene.symb <- paste0("GENE", seq_len(ngenes))
# Writing this to file:
tmpdir <- tempfile()
write10xCounts(tmpdir, my.counts, gene.id=gene.ids,
gene.symbol=gene.symb, barcodes=cell.ids)
list.files(tmpdir)