R-callbacks docs

This commit is contained in:
Vadim Khotilovich 2016-06-09 02:52:09 -05:00
parent 422b0000a8
commit 2e0ffcc303
12 changed files with 396 additions and 107 deletions

View File

@ -12,8 +12,15 @@ S3method(slice,xgb.DMatrix)
export("xgb.attr<-")
export("xgb.attributes<-")
export("xgb.parameters<-")
export(cb.early_stop)
export(cb.log_evaluation)
export(cb.print_evaluation)
export(cb.reset_parameters)
export(cb.save_model)
export(getinfo)
export(print.xgb.Booster)
export(print.xgb.DMatrix)
export(print.xgb.cv.synchronous)
export(setinfo)
export(slice)
export(xgb.DMatrix)
@ -50,7 +57,7 @@ importFrom(data.table,setnames)
importFrom(magrittr,"%>%")
importFrom(stringr,str_detect)
importFrom(stringr,str_extract)
importFrom(stringr,str_extract_all)
importFrom(stringr,str_match)
importFrom(stringr,str_replace)
importFrom(stringr,str_split)
useDynLib(xgboost)

View File

@ -0,0 +1,37 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/callbacks.R
\name{callbacks}
\alias{callbacks}
\title{Callback closures for booster training.}
\description{
These are used to perform various service tasks either during boosting iterations or at the end.
This approach helps to modularize many such tasks without bloating the main training methods,
and it offers a way for users to extend the training process with custom logic.
}
\details{
By default, a callback function is run after each boosting iteration.
An R attribute \code{is_pre_iteration} can be set for a callback to define a pre-iteration function.
When a callback function has a \code{finalize} parameter, its finalizer part will also be run after
the boosting is completed.
WARNING: side effects! Be aware that these callback functions access and modify things in
the environment from which they are called, which is a fairly uncommon thing to do in R.
To write a custom callback closure, make sure you first understand the main concepts of R environments.
Check either the R docs on \code{\link[base]{environment}} or the
\href{http://adv-r.had.co.nz/Environments.html}{Environments chapter} from Hadley Wickham's "Advanced R" book.
Then take a look at the code of \code{cb.reset_parameters} for a simple example,
and see the \code{cb.log_evaluation} code for something more involved.
You would also need to get familiar with the objects available inside of the \code{xgb.train} internal environment.
}
\seealso{
\code{\link{cb.print_evaluation}},
\code{\link{cb.log_evaluation}},
\code{\link{cb.reset_parameters}},
\code{\link{cb.early_stop}},
\code{\link{cb.save_model}},
\code{\link{xgb.train}},
\code{\link{xgb.cv}}
}
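
To make the mechanics concrete, below is a minimal sketch of a custom callback closure, modeled on the built-in ones. The callback name and printed message are illustrative; \code{env$iteration} is assumed to be among the \code{xgb.train} internal objects described above.

# A minimal custom callback sketch (hypothetical): report progress each iteration.
cb.print_iteration <- function() {
  callback <- function(env = parent.frame()) {
    # read values from the calling frame of xgb.train
    cat("finished iteration", env$iteration, "\n")
  }
  attr(callback, 'name') <- 'cb.print_iteration'
  callback
}
# assumed usage:
# bst <- xgb.train(param, dtrain, nrounds = 2, watchlist,
#                  callbacks = list(cb.print_iteration()))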

View File

@ -0,0 +1,64 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/callbacks.R
\name{cb.early_stop}
\alias{cb.early_stop}
\title{Callback closure to activate the early stopping.}
\usage{
cb.early_stop(stopping_rounds, maximize = FALSE, metric_name = NULL,
verbose = TRUE)
}
\arguments{
\item{stopping_rounds}{the number of rounds with no improvement in
the evaluation metric after which training will be stopped.}
\item{maximize}{whether to maximize the evaluation metric}
\item{metric_name}{the name of an evaluation column to use as a criterion for early
stopping. If not set, the last column would be used.
Let's say the test data in \code{watchlist} was labelled as \code{dtest},
and one wants to use the AUC in test data for early stopping regardless of where
it is in the \code{watchlist}, then one of the following would need to be set:
\code{metric_name='dtest-auc'} or \code{metric_name='dtest_auc'}.
All dash '-' characters in metric names are considered equivalent to '_'.}
\item{verbose}{whether to print the early stopping information.}
}
\description{
Callback closure to activate the early stopping.
}
\details{
This callback function determines the condition for early stopping
by setting the \code{stop_condition = TRUE} flag in its calling frame.
The following additional fields are assigned to the model R object:
\itemize{
\item \code{best_score} the evaluation score at the best iteration
\item \code{best_iteration} at which boosting iteration the best score has occurred (1-based index)
\item \code{best_ntreelimit} to use with the \code{ntreelimit} parameter in \code{predict}.
It differs from \code{best_iteration} in multiclass or random forest settings.
}
The same values are also stored as xgb-attributes, however:
\itemize{
\item \code{best_iteration} is stored as a 0-based iteration index (for interoperability of binary models)
\item \code{best_msg} message string is also stored.
}
At least one data element is required in the evaluation watchlist for early stopping to work.
Callback function expects the following values to be set in its calling frame:
\code{stop_condition},
\code{bst_evaluation},
\code{rank},
\code{bst} or \code{bst_folds},
\code{iteration},
\code{begin_iteration},
\code{end_iteration},
\code{num_parallel_tree},
\code{num_class}.
}
\seealso{
\code{\link{callbacks}},
\code{\link{xgb.attr}}
}
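
Below is a hedged sketch of passing this callback explicitly to \code{xgb.train}; normally it is created automatically when \code{early.stop.round} is set. The \code{dtrain}/\code{dtest} objects and the \code{test_error} column name are assumptions based on the naming conventions described above.

param <- list(max.depth = 2, eta = 1, objective = "binary:logistic")
bst <- xgb.train(param, dtrain, nrounds = 50,
                 watchlist = list(train = dtrain, test = dtest),
                 callbacks = list(cb.early_stop(stopping_rounds = 3,
                                                metric_name = 'test_error')))
bst$best_iteration  # 1-based index of the best boosting round
bst$best_score      # evaluation score at that iteration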

View File

@ -0,0 +1,32 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/callbacks.R
\name{cb.log_evaluation}
\alias{cb.log_evaluation}
\title{Callback closure for logging the evaluation history}
\usage{
cb.log_evaluation()
}
\description{
Callback closure for logging the evaluation history
}
\details{
This callback function appends the current iteration's evaluation results \code{bst_evaluation},
available in the calling parent frame, to the \code{evaluation_log} list in the calling frame.
The finalizer callback (called with \code{finalize = TRUE} at the end) converts
the \code{evaluation_log} list into a final data.table.
The iteration evaluation result \code{bst_evaluation} must be a named numeric vector.
Note: in the column names of the final data.table, the dash '-' character is replaced with
the underscore '_' in order to make the column names more like regular R identifiers.
Callback function expects the following values to be set in its calling frame:
\code{evaluation_log},
\code{bst_evaluation},
\code{iteration}.
}
\seealso{
\code{\link{callbacks}}
}
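
As a usage sketch (mirroring the explicit-callback example in the \code{xgb.train} docs below, with \code{param}, \code{dtrain}, and \code{watchlist} assumed from there): training runs silently while the per-iteration results are still collected.

bst <- xgb.train(param, dtrain, nrounds = 2, watchlist,
                 verbose = 0, callbacks = list(cb.log_evaluation()))
print(bst$evaluation_log)  # a data.table; '-' in metric names replaced by '_'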

View File

@ -0,0 +1,28 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/callbacks.R
\name{cb.print_evaluation}
\alias{cb.print_evaluation}
\title{Callback closure for printing the result of evaluation}
\usage{
cb.print_evaluation(period = 1)
}
\arguments{
\item{period}{results would be printed every \code{period} iterations}
}
\description{
Callback closure for printing the result of evaluation
}
\details{
The callback function prints the result of evaluation at every \code{period} iterations.
The initial and the last iteration's evaluations are always printed.
Callback function expects the following values to be set in its calling frame:
\code{bst_evaluation} (also \code{bst_evaluation_err} when available),
\code{iteration},
\code{begin_iteration},
\code{end_iteration}.
}
\seealso{
\code{\link{callbacks}}
}
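
A short assumed sketch (with \code{param}, \code{dtrain}, and \code{watchlist} as in the \code{xgb.train} examples); setting \code{print.every.n = 5} in \code{xgb.train} should create an equivalent callback automatically.

bst <- xgb.train(param, dtrain, nrounds = 20, watchlist,
                 callbacks = list(cb.print_evaluation(period = 5)))
# evaluation is printed every 5th iteration; the first and last are always printed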

View File

@ -0,0 +1,37 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/callbacks.R
\name{cb.reset_parameters}
\alias{cb.reset_parameters}
\title{Callback closure for resetting the booster's parameters at each iteration.}
\usage{
cb.reset_parameters(new_params)
}
\arguments{
\item{new_params}{a list where each element corresponds to a parameter that needs to be reset.
Each element's value must be either a vector of values of length \code{nrounds}
to be set at each iteration,
or a function of two parameters \code{learning_rates(iteration, nrounds)}
which returns a new parameter value by using the current iteration number
and the total number of boosting rounds.}
}
\description{
Callback closure for resetting the booster's parameters at each iteration.
}
\details{
This is a "pre-iteration" callback function used to reset the booster's parameters
at the beginning of each iteration.
Note that when training is resumed from some previous model, and a function is used to
reset a parameter value, the \code{nrounds} argument in this function would be the
number of boosting rounds in the current training.
Callback function expects the following values to be set in its calling frame:
\code{bst} or \code{bst_folds},
\code{iteration},
\code{begin_iteration},
\code{end_iteration}.
}
\seealso{
\code{\link{callbacks}}
}
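
Both forms of \code{new_params} in one hedged sketch, assuming \code{param}, \code{dtrain}, and \code{watchlist} as in the \code{xgb.train} examples:

# vector form: one eta value per boosting round
bst <- xgb.train(param, dtrain, nrounds = 2, watchlist,
                 callbacks = list(cb.reset_parameters(list(eta = c(0.5, 0.1)))))
# function form: eta decaying linearly over the training run
my_eta <- function(iteration, nrounds) 0.5 * (1 - (iteration - 1) / nrounds)
bst <- xgb.train(param, dtrain, nrounds = 10, watchlist,
                 callbacks = list(cb.reset_parameters(list(eta = my_eta))))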

View File

@ -0,0 +1,34 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/callbacks.R
\name{cb.save_model}
\alias{cb.save_model}
\title{Callback closure for saving a model file.}
\usage{
cb.save_model(save_period = 0, save_name = "xgboost.model")
}
\arguments{
\item{save_period}{save the model to disk after every
\code{save_period} iterations; 0 means save the model at the end.}
\item{save_name}{the name or path for the saved model file.
It can contain a \code{\link[base]{sprintf}} formatting specifier
to include the integer iteration number in the file name.
E.g., with \code{save_name} = 'xgboost_%04d.model',
the file saved at iteration 50 would be named "xgboost_0050.model".}
}
\description{
Callback closure for saving a model file.
}
\details{
This callback function allows saving an xgb-model file, either periodically
after every \code{save_period} iterations, or at the end of training.
Callback function expects the following values to be set in its calling frame:
\code{bst},
\code{iteration},
\code{begin_iteration},
\code{end_iteration}.
}
\seealso{
\code{\link{callbacks}}
}
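
A hedged sketch combining \code{save_period} with the \code{sprintf} file-name pattern described above (assuming \code{param}, \code{dtrain}, and \code{watchlist} as in the \code{xgb.train} examples; file names are illustrative):

# writes xgboost_0005.model and xgboost_0010.model during a 10-round run
bst <- xgb.train(param, dtrain, nrounds = 10, watchlist,
                 callbacks = list(cb.save_model(save_period = 5,
                                                save_name = "xgboost_%04d.model")))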

View File

@ -0,0 +1,30 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/xgb.Booster.R
\name{print.xgb.Booster}
\alias{print.xgb.Booster}
\title{Print xgb.Booster}
\usage{
print.xgb.Booster(x, verbose = FALSE, ...)
}
\arguments{
\item{x}{an xgb.Booster object}
\item{verbose}{whether to print detailed data (e.g., attribute values)}
\item{...}{not currently used}
}
\description{
Print information about xgb.Booster.
}
\examples{
data(agaricus.train, package='xgboost')
train <- agaricus.train
bst <- xgboost(data = train$data, label = train$label, max.depth = 2,
eta = 1, nthread = 2, nround = 2, objective = "binary:logistic")
attr(bst, 'myattr') <- 'memo'
print(bst)
print(bst, verbose=TRUE)
}

View File

@ -0,0 +1,32 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/xgb.cv.R
\name{print.xgb.cv.synchronous}
\alias{print.xgb.cv.synchronous}
\title{Print xgb.cv result}
\usage{
print.xgb.cv.synchronous(x, verbose = FALSE, ...)
}
\arguments{
\item{x}{an \code{xgb.cv.synchronous} object}
\item{verbose}{whether to print detailed data}
\item{...}{passed to \code{print.data.table}}
}
\description{
Prints formatted results of \code{xgb.cv}.
}
\details{
When not verbose, only the evaluation results are printed,
including the best iteration (when available).
}
\examples{
data(agaricus.train, package='xgboost')
train <- agaricus.train
cv <- xgb.cv(data = train$data, label = train$label, nfold = 5, max.depth = 2,
             eta = 1, nthread = 2, nround = 2, objective = "binary:logistic")
print(cv)
print(cv, verbose=TRUE)
}

View File

@ -6,8 +6,9 @@
\usage{
xgb.cv(params = list(), data, nrounds, nfold, label = NULL, missing = NA,
prediction = FALSE, showsd = TRUE, metrics = list(), obj = NULL,
feval = NULL, stratified = TRUE, folds = NULL, verbose = TRUE,
print.every.n = 1L, early.stop.round = NULL, maximize = NULL,
callbacks = list(), ...)
}
\arguments{
\item{params}{the list of parameters. Commonly used ones are:
@ -40,7 +41,7 @@ value that represents missing value. Sometime a data use 0 or other extreme valu
\item{showsd}{\code{boolean}, whether show standard deviation of cross validation}
\item{metrics, }{list of evaluation metrics to be used in cross validation,
when it is not specified, the evaluation metric is chosen according to objective function.
Possible options are:
\itemize{
@ -69,7 +70,7 @@ If folds are supplied, the nfold and stratified parameters would be ignored.}
\item{early.stop.round}{If \code{NULL}, the early stopping function is not triggered.
If set to an integer \code{k}, training with a validation set will stop if the performance
doesn't improve for \code{k} rounds.}
\item{maximize}{If \code{feval} and \code{early.stop.round} are set, then \code{maximize} must be set as well.
\code{maximize=TRUE} means the larger the evaluation score the better.}
@ -77,6 +78,8 @@ keeps getting worse consecutively for \code{k} rounds.}
\item{...}{other parameters to pass to \code{params}.}
}
\value{
TODO: update this...
If \code{prediction = TRUE}, a list with the following elements is returned:
\itemize{
\item \code{dt} a \code{data.table} with each mean and standard deviation stat for training set and test set
@ -105,5 +108,6 @@ dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
history <- xgb.cv(data = dtrain, nround=3, nthread = 2, nfold = 5, metrics=list("rmse","auc"),
max.depth =3, eta = 1, objective = "binary:logistic")
print(history)
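## A hedged sketch of the new 'callbacks' parameter: explicitly passing
## cb.print_evaluation here is assumed equivalent to the callback that
## verbose printing would create automatically.
history <- xgb.cv(data = dtrain, nround = 3, nthread = 2, nfold = 5,
                  max.depth = 3, eta = 1, objective = "binary:logistic",
                  callbacks = list(cb.print_evaluation(period = 1)))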
}

View File

@ -1,16 +1,24 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/xgb.train.R, R/xgboost.R
\name{xgb.train}
\alias{xgb.train}
\alias{xgboost}
\title{eXtreme Gradient Boosting Training}
\usage{
xgb.train(params = list(), data, nrounds, watchlist = list(), obj = NULL,
feval = NULL, verbose = 1, print.every.n = 1L,
early.stop.round = NULL, maximize = NULL, save_period = NULL,
save_name = "xgboost.model", xgb_model = NULL, callbacks = list(), ...)
xgboost(data = NULL, label = NULL, missing = NA, weight = NULL,
params = list(), nrounds, verbose = 1, print.every.n = 1L,
early.stop.round = NULL, maximize = NULL, save_period = 0,
save_name = "xgboost.model", xgb_model = NULL, callbacks = list(), ...)
}
\arguments{
\item{params}{the list of parameters.
The complete list of parameters is available at \url{http://xgboost.readthedocs.io/en/latest/parameter.html}.
Below is a shorter summary:
1. General Parameters
@ -59,15 +67,16 @@ xgb.train(params = list(), data, nrounds, watchlist = list(), obj = NULL,
\item \code{eval_metric} evaluation metrics for validation data. Users can pass a self-defined function to it. Default: metric will be assigned according to objective(rmse for regression, and error for classification, mean average precision for ranking). List is provided in detail section.
}}
\item{data}{input dataset. \code{xgb.train} takes only an \code{xgb.DMatrix} as the input.
\code{xgboost}, in addition, also accepts \code{matrix}, \code{dgCMatrix}, or local data file.}
\item{nrounds}{the max number of iterations}
\item{watchlist}{what information should be printed when \code{verbose=1} or
\code{verbose=2}. Watchlist is used to specify validation set monitoring
during training. For example user can specify
watchlist=list(validation1=mat1, validation2=mat2) to watch
the performance of each round's model on mat1 and mat2}
\item{obj}{customized objective function. Returns gradient and second order
gradient with given prediction and dtrain,}
@ -79,53 +88,95 @@ prediction and dtrain,}
\item{verbose}{If 0, xgboost will stay silent. If 1, xgboost will print
information of performance. If 2, xgboost will print information of both}
\item{print.every.n}{Print every N progress messages when \code{verbose>0}.
Default is 1 which means all messages are printed.}
\item{early.stop.round}{If \code{NULL}, the early stopping function is not triggered.
If set to an integer \code{k}, training with a validation set will stop if the performance
keeps getting worse consecutively for \code{k} rounds.}
\item{maximize}{If \code{feval} and \code{early.stop.round} are set,
then \code{maximize} must be set as well.
\code{maximize=TRUE} means the larger the evaluation score the better.}
\item{save_period}{save the model to the disk after every \code{save_period} rounds, 0 means save at the end.}
\item{save_name}{the name or path for periodically saved model file.}
\item{xgb_model}{the previously built model to continue the training from.
Could be either an object of class \code{xgb.Booster}, or its raw data, or the name of a
file with a previously saved model.}
\item{callbacks}{a list of callback functions to perform various tasks during boosting.
See \code{\link{callbacks}}. Some of the callbacks are currently automatically
created when specific parameters are set.}
\item{...}{other parameters to pass to \code{params}.}
\item{label}{the response variable. The user should not set this field
if data is a local data file or an \code{xgb.DMatrix}.}
\item{missing}{by default is set to NA, which means that NA values should be considered as 'missing'
by the algorithm. Sometimes, 0 or other extreme value might be used to represent missing values.
This parameter is only used when input is a dense matrix.}
\item{weight}{a vector indicating the weight for each row of the input.}
}
\value{
TODO
}
\description{
\code{xgb.train} is an advanced interface for training an xgboost model. The \code{xgboost} function provides a simpler interface.
}
\details{
These are the training functions for \code{xgboost}.
The \code{xgb.train} interface supports advanced features such as \code{watchlist},
customized objective and evaluation metric functions, therefore it is more flexible
than the \code{\link{xgboost}} interface.
Parallelization is automatically enabled if \code{OpenMP} is present.
Number of threads can also be manually specified via the \code{nthread} parameter.
The evaluation metric is chosen automatically by Xgboost (according to the objective)
when the \code{eval_metric} parameter is not provided.
The user may set one or several \code{eval_metric} parameters.
Note that when using a customized metric, only this single metric can be used.
The following is the list of built-in metrics for which Xgboost provides an optimized implementation:
\itemize{
\item \code{rmse} root mean square error. \url{http://en.wikipedia.org/wiki/Root_mean_square_error}
\item \code{logloss} negative log-likelihood. \url{http://en.wikipedia.org/wiki/Log-likelihood}
\item \code{mlogloss} multiclass logloss. \url{https://www.kaggle.com/wiki/MultiClassLogLoss}
\item \code{error} Binary classification error rate. It is calculated as \code{(wrong cases) / (all cases)}.
By default, it uses the 0.5 threshold for predicted values to define negative and positive instances.
A different threshold (e.g., 0.) could be specified as "error@0."
\item \code{merror} Multiclass classification error rate. It is calculated as \code{(wrong cases) / (all cases)}.
\item \code{auc} Area under the curve. \url{http://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_curve} for ranking evaluation.
\item \code{ndcg} Normalized Discounted Cumulative Gain (for ranking task). \url{http://en.wikipedia.org/wiki/NDCG}
}
The following callbacks are automatically created when certain parameters are set:
\itemize{
\item \code{cb.print_evaluation} is turned on when \code{verbose > 0};
and the \code{print.every.n} parameter is passed to it.
\item \code{cb.log_evaluation} is on when \code{verbose > 0} and \code{watchlist} is present.
\item \code{cb.early_stop}: when \code{early.stop.round} is set.
\item \code{cb.save_model}: when \code{save_period > 0} is set.
}
}
\examples{
data(agaricus.train, package='xgboost')
data(agaricus.test, package='xgboost')
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
dtest <- xgb.DMatrix(agaricus.test$data, label = agaricus.test$label)
watchlist <- list(eval = dtest, train = dtrain)
## A simple xgb.train example:
param <- list(max.depth = 2, eta = 1, silent = 1, objective="binary:logistic", eval_metric="auc")
bst <- xgb.train(param, dtrain, nthread = 2, nround = 2, watchlist)
## An xgb.train example where custom objective and evaluation metric are used:
logregobj <- function(preds, dtrain) {
labels <- getinfo(dtrain, "label")
preds <- 1/(1 + exp(-preds))
@ -138,7 +189,23 @@ evalerror <- function(preds, dtrain) {
err <- as.numeric(sum(labels != (preds > 0)))/length(labels)
return(list(metric = "error", value = err))
}
param <- list(max.depth = 2, eta = 1, silent = 1, objective = logregobj, eval_metric = evalerror)
bst <- xgb.train(param, dtrain, nthread = 2, nround = 2, watchlist)
## An xgb.train example of using variable learning rates at each iteration:
my_etas <- list(eta = c(0.5, 0.1))
bst <- xgb.train(param, dtrain, nthread = 2, nround = 2, watchlist,
callbacks = list(cb.reset_parameters(my_etas)))
## Explicit use of the cb.log_evaluation callback allows running
## xgb.train silently but still storing the evaluation results:
bst <- xgb.train(param, dtrain, nthread = 2, nround = 2, watchlist,
verbose = 0, callbacks = list(cb.log_evaluation()))
print(bst$evaluation_log)
## An 'xgboost' interface example:
bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label, max.depth = 2,
eta = 1, nthread = 2, nround = 2, objective = "binary:logistic")
pred <- predict(bst, agaricus.test$data)
}

View File

@ -1,83 +0,0 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/xgboost.R
\name{xgboost}
\alias{xgboost}
\title{eXtreme Gradient Boosting (Tree) library}
\usage{
xgboost(data = NULL, label = NULL, missing = NA, weight = NULL,
params = list(), nrounds, verbose = 1, print.every.n = 1L,
early.stop.round = NULL, maximize = NULL, save_period = 0,
save_name = "xgboost.model", ...)
}
\arguments{
\item{data}{takes \code{matrix}, \code{dgCMatrix}, local data file or
\code{xgb.DMatrix}.}
\item{label}{the response variable. User should not set this field,
if data is local data file or \code{xgb.DMatrix}.}
\item{missing}{Missing is only used when input is a dense matrix; pick a float
value to represent missing values. Sometimes a dataset uses 0 or another extreme value to represent missing values.}
\item{weight}{a vector indicating the weight for each row of the input.}
\item{params}{the list of parameters.
Commonly used ones are:
\itemize{
\item \code{objective} objective function, common ones are
\itemize{
\item \code{reg:linear} linear regression
\item \code{binary:logistic} logistic regression for classification
}
\item \code{eta} step size of each boosting step
\item \code{max.depth} maximum depth of the tree
\item \code{nthread} number of thread used in training, if not set, all threads are used
}
Look at \code{\link{xgb.train}} for a more complete list of parameters or \url{https://github.com/dmlc/xgboost/wiki/Parameters} for the full list.
See also \code{demo/} for walkthrough example in R.}
\item{nrounds}{the max number of iterations}
\item{verbose}{If 0, xgboost will stay silent. If 1, xgboost will print
information of performance. If 2, xgboost will print information of both
performance and construction progress information}
\item{print.every.n}{Print every N progress messages when \code{verbose>0}. Default is 1 which means all messages are printed.}
\item{early.stop.round}{If \code{NULL}, the early stopping function is not triggered.
If set to an integer \code{k}, training with a validation set will stop if the performance
keeps getting worse consecutively for \code{k} rounds.}
\item{maximize}{If \code{feval} and \code{early.stop.round} are set, then \code{maximize} must be set as well.
\code{maximize=TRUE} means the larger the evaluation score the better.}
\item{save_period}{save the model to the disk in every \code{save_period} rounds, 0 means no such action.}
\item{save_name}{the name or path for periodically saved model file.}
\item{...}{other parameters to pass to \code{params}.}
}
\description{
A simple interface for training xgboost model. Look at \code{\link{xgb.train}} function for a more advanced interface.
}
\details{
This is the modeling function for Xgboost.
Parallelization is automatically enabled if \code{OpenMP} is present.
Number of threads can also be manually specified via \code{nthread} parameter.
}
\examples{
data(agaricus.train, package='xgboost')
data(agaricus.test, package='xgboost')
train <- agaricus.train
test <- agaricus.test
bst <- xgboost(data = train$data, label = train$label, max.depth = 2,
eta = 1, nthread = 2, nround = 2, objective = "binary:logistic")
pred <- predict(bst, test$data)
}