| cv.pengls {pengls} | R Documentation |
Perform cross-validation for pengls
cv.pengls( data, glsSt, xNames, outVar, corMat, nfolds, foldid, cvType = "blocked", lambdas, transFun = "identity", transFunArgs = list(), ... )
data |
A data matrix or data frame |
glsSt |
a covariance structure, as supplied to nlme::gls as "correlation" |
xNames |
names of the regressors in data |
outVar |
name of the outcome variable in data |
corMat |
a starting value for the correlation matrix. Taken to be a diagonal matrix if missing |
nfolds |
an integer, the number of folds used in cv.glmnet to find lambda |
foldid |
An optional vector defining the folds |
cvType |
A character vector defining the type of cross-validation. Either "random" or "blocked", ignored if foldid is provided |
lambdas |
an optional lambda sequence |
transFun |
a transformation function to apply to predictions and outcome in the cross-validation |
transFunArgs |
Additional arguments passed onto transFun |
... |
passed onto glmnet::glmnet |
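The difference between the two cvType options, and the shape of a vector that could be supplied as foldid, can be illustrated in base R. This is a minimal sketch under stated assumptions, not the package's internal fold construction: the names coords, foldRandom and foldBlocked are hypothetical, and the "blocked" folds are simply strips along the x coordinate, standing in for any scheme that keeps neighbouring observations in the same fold.

```r
set.seed(1)
n <- 50       # number of observations
nfolds <- 5   # number of folds

# Hypothetical spatial coordinates, for illustration only
coords <- data.frame(x = runif(n), y = runif(n))

# "random": every observation gets a fold label irrespective of its location
foldRandom <- sample(rep(seq_len(nfolds), length.out = n))

# "blocked" (assumed scheme): observations that are close along x share a fold
xRank <- rank(coords$x, ties.method = "first")
foldBlocked <- ceiling(xRank * nfolds / n)

table(foldRandom)   # fold sizes under random assignment
table(foldBlocked)  # fold sizes under blocked assignment
```

Either vector has one integer label per row of the data, which is the form a user-defined fold vector would take.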
A list with components
lambda |
The series of lambdas |
cvm |
The vector of mean R2's |
cvsd |
The standard error of R2 at the maximum |
cvOpt |
The R2 according to the 1 standard error rule |
coefs |
The matrix of coefficients for every lambda value |
lambda.min |
Lambda value with maximal R2 |
lambda.1se |
Smallest lambda value within 1 standard error from the maximum |
foldid |
The folds |
glsSt |
The nlme correlation object |
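How lambda.min, lambda.1se and cvOpt relate to cvm and cvsd can be sketched with made-up numbers. The values below are invented for illustration and are not output of cv.pengls; the selection follows the wording of the component descriptions above (the smallest lambda whose mean R2 lies within one standard error of the maximum), which is an assumption about the rule rather than a copy of the package code.

```r
# Toy cross-validation summary (invented values)
lambda <- c(0.01, 0.05, 0.1, 0.5, 1)       # candidate penalties
cvm    <- c(0.30, 0.42, 0.45, 0.41, 0.20)  # mean R2 per lambda
cvsd   <- 0.05                             # standard error of R2 at the maximum

# Lambda with maximal mean R2
idMax <- which.max(cvm)
lambda.min <- lambda[idMax]

# Lambdas whose mean R2 is within one standard error of the maximum
within1se <- which(cvm >= cvm[idMax] - cvsd)

# Smallest such lambda, and its mean R2
id1se <- within1se[which.min(lambda[within1se])]
lambda.1se <- lambda[id1se]
cvOpt <- cvm[id1se]

c(lambda.min = lambda.min, lambda.1se = lambda.1se, cvOpt = cvOpt)
```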
library(nlme)
library(BiocParallel)
n <- 50 #Sample size
p <- 100 #Number of features
g <- 10 #Size of the grid
#Generate grid
Grid <- expand.grid("x" = seq_len(g), "y" = seq_len(g))
# Sample points from grid without replacement
GridSample <- Grid[sample(nrow(Grid), n, replace = FALSE),]
#Generate outcome and regressors
b <- matrix(rnorm(p*n), n , p)
a <- rnorm(n, mean = b %*% rbinom(p, size = 1, prob = 0.2)) #20% signal
#Compile to a data frame
df <- data.frame("a" = a, "b" = b, GridSample)
# Define the correlation structure (see ?nlme::gls), with initial nugget 0.5 and range 5
corStruct = corGaus(form = ~ x + y, nugget = TRUE, value = c("range" = 5, "nugget" = 0.5))
#Run the cross-validation for the pengls model
register(MulticoreParam(3)) #Prepare multithreading
penglsFitCV = cv.pengls(data = df, outVar = "a",
xNames = grep(names(df), pattern = "b", value = TRUE),
glsSt = corStruct, nfolds = 5)
penglsFitCV$lambda.1se #Lambda for 1 standard error rule
penglsFitCV$cvOpt #Corresponding R2
coef(penglsFitCV)
penglsFitCV$foldid #The folds used