| HDF5Array-class {HDF5Array} | R Documentation |
We provide 2 classes for representing an (on-disk) HDF5 dataset as an array-like object in R:
HDF5Array: A high-level class HDF5Array that extends DelayedArray. All the operations available on DelayedArray objects work on HDF5Array objects.
HDF5ArraySeed: A low-level class for pointing to an HDF5 dataset. No operation can be performed directly on an HDF5ArraySeed object. It needs to be wrapped in a DelayedArray or HDF5Array object first. An HDF5Array object is just an HDF5ArraySeed object wrapped in a DelayedArray object.
## Constructor functions: HDF5Array(filepath, name, type=NA) HDF5ArraySeed(filepath, name, type=NA)
filepath |
The path (as a single character string) to the HDF5 file where the dataset is located. |
name |
The name of the dataset in the HDF5 file. |
type |
|
An HDF5Array object for HDF5Array().
An HDF5ArraySeed object for HDF5ArraySeed().
DelayedArray objects.
DelayedArray-utils for common operations on DelayedArray objects.
writeHDF5Array for writting an array-like object
to an HDF5 file.
HDF5-dump-management for controlling the location of automatically created HDF5 datasets.
saveHDF5SummarizedExperiment and
loadHDF5SummarizedExperiment in this
package (the HDF5Array package) for saving/loading
an HDF5-based SummarizedExperiment
object to/from disk.
h5ls in the rhdf5 package.
The rhdf5 package on top of which HDF5Array objects are implemented.
array objects in base R.
## ---------------------------------------------------------------------
## CONSTRUCTION
## ---------------------------------------------------------------------
library(rhdf5)
library(h5vcData)
tally_file <- system.file("extdata", "example.tally.hfs5",
package="h5vcData")
h5ls(tally_file)
## Pick up "Coverages" dataset for Human chromosome 16:
cov0 <- HDF5Array(tally_file, "/ExampleStudy/16/Coverages")
cov0
## ---------------------------------------------------------------------
## dim/dimnames
## ---------------------------------------------------------------------
dim(cov0)
dimnames(cov0)
dimnames(cov0) <- list(paste0("s", 1:6), c("+", "-"))
dimnames(cov0)
## ---------------------------------------------------------------------
## SLICING (A.K.A. SUBSETTING)
## ---------------------------------------------------------------------
cov1 <- cov0[ , , 29000001:29000007]
cov1
dim(cov1)
as.array(cov1)
stopifnot(identical(dim(as.array(cov1)), dim(cov1)))
stopifnot(identical(dimnames(as.array(cov1)), dimnames(cov1)))
cov2 <- cov0[ , "+", 29000001:29000007]
cov2
as.matrix(cov2)
## ---------------------------------------------------------------------
## SummarizedExperiment OBJECTS WITH DELAYED ASSAYS
## ---------------------------------------------------------------------
## DelayedArray objects can be used inside a SummarizedExperiment object
## to hold the assay data and to delay operations on them.
library(SummarizedExperiment)
pcov <- cov0[ , 1, ] # coverage on plus strand
mcov <- cov0[ , 2, ] # coverage on minus strand
nrow(pcov) # nb of samples
ncol(pcov) # length of Human chromosome 16
## The convention for a SummarizedExperiment object is to have 1 column
## per sample so first we need to transpose 'pcov' and 'mcov':
pcov <- t(pcov)
mcov <- t(mcov)
se <- SummarizedExperiment(list(pcov=pcov, mcov=mcov))
se
stopifnot(validObject(se, complete=TRUE))
## A GPos object can be used to represent the genomic positions along
## the dataset:
gpos <- GPos(GRanges("16", IRanges(1, nrow(se))))
gpos
rowRanges(se) <- gpos
se
stopifnot(validObject(se))
assays(se)$pcov
assays(se)$mcov