| DataFrame-class {S4Vectors} | R Documentation |
The DataFrame class extends the RectangularData virtual
class supports the storage of any type of object (with length
and [ methods) as columns.
On the whole, the DataFrame behaves very similarly to
data.frame, in terms of construction, subsetting, splitting,
combining, etc. The most notable exceptions have to do with handling
of the row names:
The row names are optional. This means calling rownames(x)
will return NULL if there are no row names. Of course, it
could return seq_len(nrow(x)), but returning NULL
informs, for example, combination functions that no row names are
desired (they are often a luxury when dealing with large data).
The row names are not required to be unique.
Subsetting by row names does not use partial matching.
As DataFrame derives from Vector, it is
possible to set an annotation string. Also, another
DataFrame can hold metadata on the columns.
For a class to be supported as a column, it must have length
and [ methods, where [ supports subsetting only by
i and respects drop=FALSE. Optionally, a method may be
defined for the showAsCell generic, which should return a
vector of the same length as the subset of the column passed to
it. This vector is then placed into a data.frame and converted
to text with format. Thus, each element of the vector should be
some simple, usually character, representation of the corresponding
element in the column.
DataFrame(..., row.names = NULL, check.names = TRUE,
stringsAsFactors)
Constructs a DataFrame in similar fashion to
data.frame. Each argument in ... is coerced to
a DataFrame and combined column-wise.
The row names should be given in
row.names; otherwise, they are inherited from the
arguments, as in data.frame. Explicitly passing
NULL to row.names ensures that there are no rownames.
If check.names is TRUE, the column names will
be checked for syntactic validity and made unique, if necessary.
To store an object of a class that does not support coercion to
DataFrame, wrap it in I(). The class must still have
methods for length and [.
The stringsAsFactors argument is ignored. The coercion of
column arguments to DataFrame determines whether strings
become factors.
make_zero_col_DFrame(nrow)
Constructs a zero-column DFrame object with nrow rows.
Intended for developers to use in other packages and typically
not needed by the end user.
In the following code snippets, x is a DataFrame.
dim(x):
Get the length two integer vector indicating in the first and
second element the number of rows and columns, respectively.
dimnames(x), dimnames(x) <- value:
Get and set the two element list containing the row names
(character vector of length nrow(x) or NULL)
and the column names (character vector of length ncol(x)).
as(from, "DataFrame"):
By default, constructs a new DataFrame with from as
its only column. If from is a matrix or
data.frame, all of its columns become columns in the new
DataFrame. If from is a list, each element becomes a
column, recycling as necessary. Note that for the DataFrame
to behave correctly, each column object must support element-wise
subsetting via the [ method and return the number of elements with
length. It is recommended to use the DataFrame
constructor, rather than this interface.
as.list(x): Coerces x, a DataFrame,
to a list.
as.data.frame(x, row.names=NULL, optional=FALSE):
Coerces x, a DataFrame, to a data.frame.
Each column is coerced to a data.frame and then column
bound together. If row.names is NULL, they are
retrieved from x, if it has any. Otherwise, they are
inferred by the data.frame constructor.
NOTE: conversion of x to a data.frame is not
supported if x contains any list, SimpleList,
or CompressedList columns.
as(from, "data.frame"): Coerces a DataFrame
to a data.frame by calling as.data.frame(from).
as.matrix(x): Coerces the DataFrame to a
matrix, if possible.
as.env(x, enclos = parent.frame()):
Creates an environment from x with a symbol for each
colnames(x). The values are not actually copied into the
environment. Rather, they are dynamically bound using
makeActiveBinding. This prevents unnecessary copying
of the data from the external vectors into R vectors. The values
are cached, so that the data is not copied every time the symbol
is accessed.
In the following code snippets, x is a DataFrame.
x[i,j,drop]: Behaves very similarly to the
[.data.frame method, except i can be a
logical Rle object and subsetting by matrix indices
is not supported. Indices containing NA's are also not
supported.
x[i,j] <- value: Behaves very similarly to the
[<-.data.frame method.
x[[i]]: Behaves very similarly to the
[[.data.frame method, except arguments j
and exact are not supported. Column name matching is
always exact. Subsetting by matrices is not supported.
x[[i]] <- value: Behaves very similarly to the
[[<-.data.frame method, except argument j
is not supported.
Michael Lawrence
DataFrame-combine for combining DataFrame objects.
DataFrame-utils for other common operations on DataFrame objects.
TransposedDataFrame objects.
RectangularData and SimpleList which DataFrame extends directly.
score <- c(1L, 3L, NA)
counts <- c(10L, 2L, NA)
row.names <- c("one", "two", "three")
df <- DataFrame(score) # single column
df[["score"]]
df <- DataFrame(score, row.names = row.names) #with row names
rownames(df)
df <- DataFrame(vals = score) # explicit naming
df[["vals"]]
# arrays
ary <- array(1:4, c(2,1,2))
sw <- DataFrame(I(ary))
# a data.frame
sw <- DataFrame(swiss)
as.data.frame(sw) # swiss, without row names
# now with row names
sw <- DataFrame(swiss, row.names = rownames(swiss))
as.data.frame(sw) # swiss
# subsetting
sw[] # identity subset
sw[,] # same
sw[NULL] # no columns
sw[,NULL] # no columns
sw[NULL,] # no rows
## select columns
sw[1:3]
sw[,1:3] # same as above
sw[,"Fertility"]
sw[,c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)]
## select rows and columns
sw[4:5, 1:3]
sw[1] # one-column DataFrame
## the same
sw[, 1, drop = FALSE]
sw[, 1] # a (unnamed) vector
sw[[1]] # the same
sw[["Fertility"]]
sw[["Fert"]] # should return 'NULL'
sw[1,] # a one-row DataFrame
sw[1,, drop=TRUE] # a list
## duplicate row, unique row names are created
sw[c(1, 1:2),]
## indexing by row names
sw["Courtelary",]
subsw <- sw[1:5,1:4]
subsw["C",] # no partial match (unlike with data.frame)
## row and column names
cn <- paste("X", seq_len(ncol(swiss)), sep = ".")
colnames(sw) <- cn
colnames(sw)
rn <- seq(nrow(sw))
rownames(sw) <- rn
rownames(sw)
## column replacement
df[["counts"]] <- counts
df[["counts"]]
df[[3]] <- score
df[["X"]]
df[[3]] <- NULL # deletion