[R] various R code maintenance (#1964)
* [R] xgb.save must work when handle is nil but raw exists
* [R] print.xgb.Booster should still print other info when handle is nil
* [R] rename internal function xgb.Booster to xgb.Booster.handle to make its intent clear
* [R] rename xgb.Booster.check to xgb.Booster.complete and make it visible; more docs
* [R] storing evaluation_log should depend only on watchlist, not on verbose
* [R] reduce the excessive chattiness of unit tests
* [R] only disable some tests in windows when it's not 64-bit
* [R] clean-up xgb.DMatrix
* [R] test xgb.DMatrix loading from libsvm text file
* [R] store feature_names in xgb.Booster, use them from utility functions
* [R] remove non-functional co-occurrence computation from xgb.importance
* [R] verbose=0 is enough without a callback
* [R] added forgotten xgb.Booster.complete.Rd; cran check fixes
* [R] update installation instructions
committed by Tianqi Chen
parent a073a2c3d4
commit 2b5b96d760
@@ -23,8 +23,7 @@ xgboost(data = NULL, label = NULL, missing = NA, weight = NULL,
 1. General Parameters

 \itemize{
-\item \code{booster} which booster to use, can be \code{gbtree} or \code{gblinear}. Default: \code{gbtree}
-\item \code{silent} 0 means printing running messages, 1 means silent mode. Default: 0
+\item \code{booster} which booster to use, can be \code{gbtree} or \code{gblinear}. Default: \code{gbtree}.
 }

 2. Booster Parameters
@@ -68,16 +67,19 @@ xgboost(data = NULL, label = NULL, missing = NA, weight = NULL,
 \item \code{eval_metric} evaluation metrics for validation data. Users can pass a self-defined function to it. Default: metric will be assigned according to objective(rmse for regression, and error for classification, mean average precision for ranking). List is provided in detail section.
 }}

-\item{data}{input dataset. \code{xgb.train} takes only an \code{xgb.DMatrix} as the input.
-\code{xgboost}, in addition, also accepts \code{matrix}, \code{dgCMatrix}, or local data file.}
+\item{data}{training dataset. \code{xgb.train} accepts only an \code{xgb.DMatrix} as the input.
+\code{xgboost}, in addition, also accepts \code{matrix}, \code{dgCMatrix}, or name of a local data file.}

-\item{nrounds}{the max number of iterations}
+\item{nrounds}{max number of boosting iterations.}

-\item{watchlist}{what information should be printed when \code{verbose=1} or
-\code{verbose=2}. Watchlist is used to specify validation set monitoring
-during training. For example user can specify
-watchlist=list(validation1=mat1, validation2=mat2) to watch
-the performance of each round's model on mat1 and mat2}
+\item{watchlist}{named list of xgb.DMatrix datasets to use for evaluating model performance.
+Metrics specified in either \code{eval_metric} or \code{feval} will be computed for each
+of these datasets during each boosting iteration, and stored in the end as a field named
+\code{evaluation_log} in the resulting object. When either \code{verbose>=1} or
+\code{\link{cb.print.evaluation}} callback is engaged, the performance results are continuously
+printed out during the training.
+E.g., specifying \code{watchlist=list(validation1=mat1, validation2=mat2)} allows to track
+the performance of each round's model on mat1 and mat2.}

 \item{obj}{customized objective function. Returns gradient and second order
 gradient with given prediction and dtrain.}
@@ -86,10 +88,10 @@ gradient with given prediction and dtrain.}
 \code{list(metric='metric-name', value='metric-value')} with given
 prediction and dtrain.}

-\item{verbose}{If 0, xgboost will stay silent. If 1, xgboost will print
-information of performance. If 2, xgboost will print some additional information.
-Setting \code{verbose > 0} automatically engages the \code{\link{cb.evaluation.log}} and
-\code{\link{cb.print.evaluation}} callback functions.}
+\item{verbose}{If 0, xgboost will stay silent. If 1, it will print information about performance.
+If 2, some additional information will be printed out.
+Note that setting \code{verbose > 0} automatically engages the
+\code{cb.print.evaluation(period=1)} callback function.}

 \item{print_every_n}{Print each n-th iteration evaluation messages when \code{verbose>0}.
 Default is 1 which means all messages are printed. This parameter is passed to the
@@ -151,17 +153,20 @@ An object of class \code{xgb.Booster} with the following elements:
 (only available with early stopping).
 \item \code{best_score} the best evaluation metric value during early stopping.
 (only available with early stopping).
+\item \code{feature_names} names of the training dataset features
+(only when column names were defined in training data).
 }
 }
 \description{
-\code{xgb.train} is an advanced interface for training an xgboost model. The \code{xgboost} function provides a simpler interface.
+\code{xgb.train} is an advanced interface for training an xgboost model.
+The \code{xgboost} function is a simpler wrapper for \code{xgb.train}.
 }
 \details{
 These are the training functions for \code{xgboost}.

 The \code{xgb.train} interface supports advanced features such as \code{watchlist},
 customized objective and evaluation metric functions, therefore it is more flexible
-than the \code{\link{xgboost}} interface.
+than the \code{xgboost} interface.

 Parallelization is automatically enabled if \code{OpenMP} is present.
 Number of threads can also be manually specified via \code{nthread} parameter.
@@ -187,7 +192,7 @@ The following callbacks are automatically created when certain parameters are se
 \itemize{
 \item \code{cb.print.evaluation} is turned on when \code{verbose > 0};
 and the \code{print_every_n} parameter is passed to it.
-\item \code{cb.evaluation.log} is on when \code{verbose > 0} and \code{watchlist} is present.
+\item \code{cb.evaluation.log} is on when \code{watchlist} is present.
 \item \code{cb.early.stop}: when \code{early_stopping_rounds} is set.
 \item \code{cb.save.model}: when \code{save_period > 0} is set.
 }
@@ -198,7 +203,7 @@ data(agaricus.test, package='xgboost')

 dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
 dtest <- xgb.DMatrix(agaricus.test$data, label = agaricus.test$label)
-watchlist <- list(eval = dtest, train = dtrain)
+watchlist <- list(train = dtrain, eval = dtest)

 ## A simple xgb.train example:
 param <- list(max_depth = 2, eta = 1, silent = 1, nthread = 2,
@@ -237,17 +242,15 @@ bst <- xgb.train(param, dtrain, nrounds = 2, watchlist,


 ## An xgb.train example of using variable learning rates at each iteration:
-param <- list(max_depth = 2, eta = 1, silent = 1, nthread = 2)
+param <- list(max_depth = 2, eta = 1, silent = 1, nthread = 2,
+objective = "binary:logistic", eval_metric = "auc")
 my_etas <- list(eta = c(0.5, 0.1))
 bst <- xgb.train(param, dtrain, nrounds = 2, watchlist,
 callbacks = list(cb.reset.parameters(my_etas)))

-
-## Explicit use of the cb.evaluation.log callback allows to run
-## xgb.train silently but still store the evaluation results:
-bst <- xgb.train(param, dtrain, nrounds = 2, watchlist,
-verbose = 0, callbacks = list(cb.evaluation.log()))
-print(bst$evaluation_log)
+## Early stopping:
+bst <- xgb.train(param, dtrain, nrounds = 25, watchlist,
+early_stopping_rounds = 3)

 ## An 'xgboost' interface example:
 bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label,
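The documentation hunk that lists the automatically created callbacks can be restated as a small sketch in plain R. `auto_callbacks` is a hypothetical helper written only for this illustration; it is not an xgboost function and does not call xgboost. It returns the names of the callbacks that the documentation says are engaged for a given combination of arguments, assuming the post-commit rule that `cb.evaluation.log` depends only on `watchlist`.

```r
# Hypothetical helper (not part of xgboost): restates the documented rules
# for which callbacks xgb.train creates automatically.
auto_callbacks <- function(verbose = 1, watchlist = list(),
                           early_stopping_rounds = NULL, save_period = NULL) {
  engaged <- c(
    # printing is tied to verbosity only
    if (verbose > 0) "cb.print.evaluation",
    # per the changed doc line, logging depends only on the watchlist
    if (length(watchlist) > 0) "cb.evaluation.log",
    if (!is.null(early_stopping_rounds)) "cb.early.stop",
    if (!is.null(save_period) && save_period > 0) "cb.save.model"
  )
  as.character(engaged)  # character(0) when nothing is engaged
}

# Silent run with a watchlist: only the evaluation log callback is engaged.
# The list values are placeholder strings standing in for xgb.DMatrix objects.
auto_callbacks(verbose = 0, watchlist = list(train = "dtrain", eval = "dtest"))
```

Under these rules, a silent run with a watchlist still records the evaluation results, which appears to be why the example with an explicit `cb.evaluation.log()` callback was dropped from the man page.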