[R] docs update - callbacks and parameter style
@@ -7,7 +7,7 @@
xgb.cv(params = list(), data, nrounds, nfold, label = NULL, missing = NA,
prediction = FALSE, showsd = TRUE, metrics = list(), obj = NULL,
feval = NULL, stratified = TRUE, folds = NULL, verbose = TRUE,
print.every.n = 1L, early.stop.round = NULL, maximize = NULL,
print_every_n = 1L, early_stopping_rounds = NULL, maximize = NULL,
callbacks = list(), ...)
}
\arguments{
@@ -19,11 +19,11 @@ xgb.cv(params = list(), data, nrounds, nfold, label = NULL, missing = NA,
\item \code{binary:logistic} logistic regression for classification
}
\item \code{eta} step size of each boosting step
\item \code{max.depth} maximum depth of the tree
\item \code{max_depth} maximum depth of the tree
\item \code{nthread} number of threads used in training; if not set, all threads are used
}

See \link{xgb.train} for further details.
See \code{\link{xgb.train}} for further details.
See also demo/ for walkthrough example in R.}

\item{data}{takes an \code{xgb.DMatrix} or \code{Matrix} as the input.}
@@ -32,14 +32,16 @@ xgb.cv(params = list(), data, nrounds, nfold, label = NULL, missing = NA,

\item{nfold}{the original dataset is randomly partitioned into \code{nfold} equal size subsamples.}

\item{label}{option field, when data is \code{Matrix}}
\item{label}{vector of response values. Should be provided only when data is a \code{Matrix}, not an \code{xgb.DMatrix}.}

\item{missing}{Missing is only used when input is dense matrix, pick a float
value that represents missing value. Sometime a data use 0 or other extreme value to represents missing values.}
\item{missing}{is only used when input is a dense matrix. By default it is set to NA, which means
that NA values should be considered as 'missing' by the algorithm.
Sometimes, 0 or another extreme value might be used to represent missing values.}

\item{prediction}{A logical value indicating whether to return the prediction vector.}
\item{prediction}{A logical value indicating whether to return the test fold predictions
from each CV model. This parameter engages the \code{\link{cb.cv.predict}} callback.}

\item{showsd}{\code{boolean}, whether show standard deviation of cross validation}
\item{showsd}{\code{boolean}, whether to show standard deviation of cross validation}

\item{metrics, }{list of evaluation metrics to be used in cross validation,
when it is not specified, the evaluation metric is chosen according to objective function.
@@ -59,34 +61,61 @@ gradient with given prediction and dtrain.}
\code{list(metric='metric-name', value='metric-value')} with given
prediction and dtrain.}

\item{stratified}{\code{boolean} whether sampling of folds should be stratified by the values of labels in \code{data}}
\item{stratified}{a \code{boolean} indicating whether sampling of folds should be stratified
by the values of outcome labels.}

\item{folds}{\code{list} provides a possibility of using a list of pre-defined CV folds (each element must be a vector of fold's indices).
If folds are supplied, the nfold and stratified parameters would be ignored.}
\item{folds}{\code{list} provides a possibility to use a list of pre-defined CV folds
(each element must be a vector of test fold's indices). When folds are supplied,
the \code{nfold} and \code{stratified} parameters are ignored.}

\item{verbose}{\code{boolean}, print the statistics during the process}

\item{print.every.n}{Print every N progress messages when \code{verbose>0}. Default is 1 which means all messages are printed.}
\item{print_every_n}{Print each n-th iteration evaluation messages when \code{verbose>0}.
Default is 1 which means all messages are printed. This parameter is passed to the
\code{\link{cb.print.evaluation}} callback.}

\item{early.stop.round}{If \code{NULL}, the early stopping function is not triggered.
\item{early_stopping_rounds}{If \code{NULL}, the early stopping function is not triggered.
If set to an integer \code{k}, training with a validation set will stop if the performance
doesn't improve for \code{k} rounds.}
doesn't improve for \code{k} rounds.
Setting this parameter engages the \code{\link{cb.early.stop}} callback.}

\item{maximize}{If \code{feval} and \code{early.stop.round} are set, then \code{maximize} must be set as well.
\code{maximize=TRUE} means the larger the evaluation score the better.}
\item{maximize}{If \code{feval} and \code{early_stopping_rounds} are set,
then this parameter must be set as well.
When it is \code{TRUE}, it means the larger the evaluation score the better.
This parameter is passed to the \code{\link{cb.early.stop}} callback.}

\item{callbacks}{a list of callback functions to perform various tasks during boosting.
See \code{\link{callbacks}}. Some of the callbacks are automatically created depending on the
parameters' values. Users can provide either existing or their own callback methods in order
to customize the training process.}

\item{...}{other parameters to pass to \code{params}.}
}
\value{
TODO: update this...

If \code{prediction = TRUE}, a list with the following elements is returned:
An object of class \code{xgb.cv.synchronous} with the following elements:
\itemize{
\item \code{dt} a \code{data.table} with each mean and standard deviation stat for training set and test set
\item \code{pred} an array or matrix (for multiclass classification) with predictions for each CV-fold for the model having been trained on the data in all other folds.
\item \code{call} a function call.
\item \code{params} parameters that were passed to the xgboost library. Note that it does not
capture parameters changed by the \code{\link{cb.reset.parameters}} callback.
\item \code{callbacks} callback functions that were either automatically assigned or
explicitly passed.
\item \code{evaluation_log} evaluation history stored as a \code{data.table} with the
first column corresponding to iteration number and the rest corresponding to the
CV-based evaluation means and standard deviations for the training and test CV-sets.
It is created by the \code{\link{cb.evaluation.log}} callback.
\item \code{niter} number of boosting iterations.
\item \code{folds} the list of CV folds' indices - either those passed through the \code{folds}
parameter or randomly generated.
\item \code{best_iteration} iteration number with the best evaluation metric value
(only available with early stopping).
\item \code{best_ntreelimit} the \code{ntreelimit} value corresponding to the best iteration,
which could further be used in the \code{predict} method
(only available with early stopping).
\item \code{pred} CV prediction values available when \code{prediction} is set.
It is either a vector or a matrix (see \code{\link{cb.cv.predict}}).
\item \code{models} a list of the CV folds' models. It is only available with the explicit
setting of the \code{cb.cv.predict(save_models = TRUE)} callback.
}

If \code{prediction = FALSE}, just a \code{data.table} with each mean and standard deviation stat for training set and test set is returned.
}
\description{
The cross validation function of xgboost
@@ -105,9 +134,10 @@ Adapted from \url{http://en.wikipedia.org/wiki/Cross-validation_\%28statistics\%
\examples{
data(agaricus.train, package='xgboost')
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
history <- xgb.cv(data = dtrain, nround=3, nthread = 2, nfold = 5, metrics=list("rmse","auc"),
max.depth =3, eta = 1, objective = "binary:logistic")
print(history)
cv <- xgb.cv(data = dtrain, nrounds = 3, nthread = 2, nfold = 5, metrics = list("rmse","auc"),
max_depth = 3, eta = 1, objective = "binary:logistic")
print(cv)
print(cv, verbose=TRUE)

}
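
For readers updating existing scripts, the renamed arguments and the callback hooks documented in this diff might be combined roughly as follows. This is a minimal sketch and not part of the commit. It assumes an xgboost R package version that matches these docs, where `cb.cv.predict` is exported under that name. The values chosen for `nrounds`, `print_every_n` and `early_stopping_rounds` are illustrative only.

```r
library(xgboost)

data(agaricus.train, package = "xgboost")
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)

# New-style names: max_depth, print_every_n, early_stopping_rounds
# (formerly max.depth, print.every.n, early.stop.round).
cv <- xgb.cv(
  params = list(objective = "binary:logistic", max_depth = 3, eta = 1, nthread = 2),
  data = dtrain,
  nrounds = 20,                 # illustrative value
  nfold = 5,
  metrics = list("auc"),
  print_every_n = 5,            # passed to the cb.print.evaluation callback
  early_stopping_rounds = 3,    # engages the cb.early.stop callback
  maximize = TRUE,              # larger AUC is better
  # Explicit callback: keeps test fold predictions and the per-fold models.
  callbacks = list(cb.cv.predict(save_models = TRUE))
)

# Elements described in the \value section above.
print(cv$evaluation_log)   # per-iteration CV means and standard deviations
print(cv$best_iteration)   # only available with early stopping
print(length(cv$models))   # one model per fold, because save_models = TRUE
```

Passing `cb.cv.predict(save_models = TRUE)` explicitly, rather than setting `prediction = TRUE`, is the only way the text above describes to obtain the `models` element.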