[R] Provide better guidance for persisting XGBoost model (#5964)
* [R] Provide better guidance for persisting XGBoost model
* Update saving_model.rst
* Add a paragraph about xgb.serialize()
Committed by: GitHub
Parent: bf2990e773
Commit: 5a2dcd1c33
@@ -20,7 +20,7 @@ Non-null \code{feature_names} could be provided to override those in the model.}

 \item{model}{object of class \code{xgb.Booster}}

-\item{text}{\code{character} vector previously generated by the \code{xgb.dump}
+\item{text}{\code{character} vector previously generated by the \code{xgb.dump}
 function (where parameter \code{with_stats = TRUE} should have been set).
 \code{text} takes precedence over \code{model}.}

@@ -53,10 +53,10 @@ The columns of the \code{data.table} are:
 \item \code{Quality}: either the split gain (change in loss) or the leaf value
 \item \code{Cover}: metric related to the number of observation either seen by a split
 or collected by a leaf during training.
 }
 }

 When \code{use_int_id=FALSE}, columns "Yes", "No", and "Missing" point to model-wide node identifiers
-in the "ID" column. When \code{use_int_id=TRUE}, those columns point to node identifiers from
+in the "ID" column. When \code{use_int_id=TRUE}, those columns point to node identifiers from
 the corresponding trees in the "Node" column.
 }
 \description{
@@ -67,17 +67,17 @@ Parse a boosted tree model text dump into a \code{data.table} structure.

 data(agaricus.train, package='xgboost')

-bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label, max_depth = 2,
+bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label, max_depth = 2,
 eta = 1, nthread = 2, nrounds = 2,objective = "binary:logistic")

 (dt <- xgb.model.dt.tree(colnames(agaricus.train$data), bst))

-# This bst model already has feature_names stored with it, so those would be used when
+# This bst model already has feature_names stored with it, so those would be used when
 # feature_names is not set:
 (dt <- xgb.model.dt.tree(model = bst))

 # How to match feature names of splits that are following a current 'Yes' branch:

 merge(dt, dt[, .(ID, Y.Feature=Feature)], by.x='Yes', by.y='ID', all.x=TRUE)[order(Tree,Node)]

 }