computeError.Rd
Compute CV statistics from a matrix of predictions.
computeError( predmat, y, lambda, foldid, type.measure, family, weights = rep(1, dim(predmat)[1]), grouped = TRUE )
predmat | Array of predictions. If `y` is univariate, this has dimensions `c(nobs, nlambda)`. If `y` is multivariate with `nc` levels/columns (e.g. for `family = "multinomial"` or `family = "mgaussian"`), this has dimensions `c(nobs, nc, nlambda)`. Note that these should be on the same scale as `y` (unlike in the glmnet package, where they are on the scale of the linear predictor). |
---|---|
y | Response variable. Either a vector or a matrix, depending on the type of model. |
lambda | Lambda values associated with the errors in `predmat`. |
foldid | Vector of values identifying which fold each observation is in. |
type.measure | Loss function to use for cross-validation. See `availableTypeMeasures()` for possible values for `type.measure`. Note that the package does not check if the user-specified measure is appropriate for the family. |
family | Model family; used to determine the correct loss function. |
weights | Observation weights. |
grouped | This is an experimental argument, with default `TRUE`, and can be ignored by most users. For all models except `family = "cox"`, `grouped = TRUE` means computing `nfolds` separate statistics and then using their mean and estimated standard error to describe the CV curve. If `FALSE`, an error matrix is built up at the observation level from the predictions from the `nfolds` fits, and then summarized (does not apply to `type.measure = "auc"`). For the "cox" family, `grouped = TRUE` obtains the CV partial likelihood for the Kth fold by subtraction: the log partial likelihood evaluated on the full dataset is subtracted from that evaluated on the (K-1)/K dataset. This makes more efficient use of risk sets. With `grouped = FALSE` the log partial likelihood is computed only on the Kth fold. |
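
The difference between the two settings of `grouped` for non-Cox families can be illustrated with a small base R sketch. It is not the package's implementation: it assumes squared-error loss, unit observation weights, and a standard error estimate of the form `sqrt(mean((x - mean(x))^2) / (n - 1))`, and the helper `cv_summary()` is a hypothetical name introduced only for this illustration.

```r
set.seed(1)
nobs <- 12
nlambda <- 3
y <- rnorm(nobs)
# Toy predictions on the same scale as y: one column per lambda value.
predmat <- matrix(rnorm(nobs * nlambda), nrow = nobs, ncol = nlambda)
foldid <- rep(1:3, length.out = nobs)

# Observation-level squared errors, dimensions c(nobs, nlambda).
err <- (y - predmat)^2

# Hypothetical helper: column means and an assumed standard error estimate,
# treating each row of `errmat` as one unit (a fold or an observation).
cv_summary <- function(errmat) {
  n <- nrow(errmat)
  cvm <- colMeans(errmat)
  cvsd <- sqrt(colMeans(sweep(errmat, 2, cvm)^2) / (n - 1))
  list(cvm = cvm, cvsd = cvsd)
}

# Grouped idea: one statistic per fold, then summarize across the folds.
fold_err <- apply(err, 2, function(col) tapply(col, foldid, mean))
grouped_stats <- cv_summary(fold_err)

# Ungrouped idea: summarize the observation-level error matrix directly.
ungrouped_stats <- cv_summary(err)
```

With equal-sized folds and unit weights, both approaches give the same mean error; only the standard error estimate differs, because the number of units being averaged changes from `nfolds` to `nobs`.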
An object of class "cvobj".
lambda | The values of lambda used in the fits. |
---|---|
cvm | The mean cross-validated error: a vector of length `length(lambda)`. |
cvsd | Estimate of standard error of `cvm`. |
cvup | Upper curve = `cvm + cvsd`. |
cvlo | Lower curve = `cvm - cvsd`. |
lambda.min | Value of `lambda` that gives minimum `cvm`. |
lambda.1se | Largest value of `lambda` such that the error is within 1 standard error of the minimum. |
index | A one-column matrix with the indices of `lambda.min` and `lambda.1se` in the sequence of coefficients, fits etc. |
name | A text string indicating the loss function used (for plotting purposes). |
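
How `lambda.min` and `lambda.1se` relate to `cvm` and `cvsd` can be sketched in base R as follows. The numbers are made up for illustration, and the tie-breaking rule (taking the largest qualifying `lambda`) is an assumption that follows the descriptions above rather than a statement about the package's internals.

```r
# Made-up CV curve for five lambda values (illustrative numbers only).
lambda <- c(1, 0.5, 0.25, 0.125, 0.0625)
cvm    <- c(2.0, 1.15, 1.0, 1.05, 1.4)
cvsd   <- c(0.3, 0.25, 0.2, 0.2, 0.2)

# Value of lambda that gives the minimum mean CV error.
idx_min <- which.min(cvm)
lambda.min <- lambda[idx_min]

# Largest lambda whose error is within one standard error of the minimum.
threshold <- cvm[idx_min] + cvsd[idx_min]
lambda.1se <- max(lambda[cvm <= threshold])
idx_1se <- match(lambda.1se, lambda)

# One-column matrix of positions in the lambda sequence.
index <- matrix(c(idx_min, idx_1se), ncol = 1,
                dimnames = list(c("min", "1se"), "Lambda"))
```

Here the minimum is at `lambda = 0.25`, and the one-standard-error rule selects the larger, more heavily regularized value `lambda = 0.5`.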
Note that for the setting where `family = "cox"` and `type.measure = "deviance"` and `grouped = TRUE`, `predmat` needs to have a `cvraw` attribute as computed by `buildPredMat()`. This is because the usual matrix of pre-validated fits does not contain all the information needed to compute the model deviance for this setting.