Update documents. (#6856)

* Add early stopping section to prediction doc.
* Remove best_ntree_limit.
* Better doxygen output.

parent d31a57cf5f
commit a5d7094a45
@@ -67,6 +67,18 @@ the 3-class classification dataset, and want to use the first 2 iterations of tr
 prediction, you need to provide ``iteration_range=(0, 2)``. Then the first :math:`2
 \times 3 \times 4` trees will be used in this prediction.
 
+**************
+Early Stopping
+**************
+
+When a model is trained with early stopping, behavior differs between the native Python
+interface and the sklearn/R interfaces. By default, the R and sklearn interfaces use
+``best_iteration`` automatically, so the prediction comes from the best model. But the
+native Python methods :py:meth:`xgboost.Booster.predict` and
+:py:meth:`xgboost.Booster.inplace_predict` use the full model. Users can pass the
+``best_iteration`` attribute to the ``iteration_range`` parameter to achieve the same
+behavior. The ``save_best`` parameter of :py:obj:`xgboost.callback.EarlyStopping` might
+also be useful.
 
 *********
 Predictor
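The tree-count arithmetic in the hunk above (the first 2 iterations of a 3-class model trained with ``num_parallel_tree=4`` use :math:`2 \times 3 \times 4` trees) can be sketched in plain Python. The helper name below is illustrative only, not an XGBoost API:

```python
# Sketch of how many individual trees a half-open ``iteration_range``
# selects. Each boosting iteration grows num_parallel_tree trees for
# each of the n_classes classes.

def n_trees_used(iteration_range, n_classes, num_parallel_tree):
    """Number of trees consulted for a prediction (illustrative helper)."""
    begin, end = iteration_range
    return (end - begin) * n_classes * num_parallel_tree

# First 2 iterations of a 3-class model with 4 parallel trees per round:
print(n_trees_used((0, 2), n_classes=3, num_parallel_tree=4))  # -> 24
```

The range is half-open, so ``(0, 2)`` selects boosting rounds 0 and 1, matching the :math:`2 \times 3 \times 4` count in the text.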
@@ -183,7 +183,7 @@ Early stopping requires at least one set in ``evals``. If there's more than one,
 
 The model will train until the validation score stops improving. Validation error needs to decrease at least every ``early_stopping_rounds`` to continue training.
 
-If early stopping occurs, the model will have three additional fields: ``bst.best_score``, ``bst.best_iteration`` and ``bst.best_ntree_limit``. Note that :py:meth:`xgboost.train` will return a model from the last iteration, not the best one.
+If early stopping occurs, the model will have two additional fields: ``bst.best_score`` and ``bst.best_iteration``. Note that :py:meth:`xgboost.train` will return a model from the last iteration, not the best one.
 
 This works with both metrics to minimize (RMSE, log loss, etc.) and to maximize (MAP, NDCG, AUC). Note that if you specify more than one evaluation metric the last one in ``param['eval_metric']`` is used for early stopping.
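The stopping rule described in this hunk can be illustrated with a minimal, library-free sketch: training continues only while the validation score improves at least once every ``early_stopping_rounds`` rounds. The function and its inputs are illustrative, not XGBoost internals:

```python
# Minimal sketch of early-stopping bookkeeping for a minimizing metric
# (e.g. validation RMSE). Tracks the best score and the round it occurred,
# and stops once no improvement is seen for early_stopping_rounds rounds.

def run_with_early_stopping(val_scores, early_stopping_rounds):
    """Return (best_score, best_iteration) under the stopping rule above."""
    best_score = float("inf")
    best_iteration = 0
    for i, score in enumerate(val_scores):
        if score < best_score:
            best_score, best_iteration = score, i
        elif i - best_iteration >= early_stopping_rounds:
            break  # no improvement for early_stopping_rounds rounds: stop
    return best_score, best_iteration

# Validation error stops improving after round 2, so training halts early.
print(run_with_early_stopping([0.9, 0.7, 0.6, 0.65, 0.66, 0.64], 2))  # -> (0.6, 2)
```

This mirrors why the returned model is from the last iteration, not the best one: rounds after ``best_iteration`` have already been trained when the stop triggers.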
@@ -198,11 +198,11 @@ A model that has been trained or loaded can perform predictions on data sets.
 
   dtest = xgb.DMatrix(data)
   ypred = bst.predict(dtest)
 
-If early stopping is enabled during training, you can get predictions from the best iteration with ``bst.best_ntree_limit``:
+If early stopping is enabled during training, you can get predictions from the best iteration with ``bst.best_iteration``:
 
 .. code-block:: python
 
-  ypred = bst.predict(dtest, ntree_limit=bst.best_ntree_limit)
+  ypred = bst.predict(dtest, iteration_range=(0, bst.best_iteration))
 
 Plotting
 --------
@@ -744,13 +744,13 @@ XGB_DLL int XGBoosterPredict(BoosterHandle handle,
  * following available fields in the JSON object:
  *
  *   "type": [0, 6]
- *     0: normal prediction
- *     1: output margin
- *     2: predict contribution
- *     3: predict approximated contribution
- *     4: predict feature interaction
- *     5: predict approximated feature interaction
- *     6: predict leaf
+ *     - 0: normal prediction
+ *     - 1: output margin
+ *     - 2: predict contribution
+ *     - 3: predict approximated contribution
+ *     - 4: predict feature interaction
+ *     - 5: predict approximated feature interaction
+ *     - 6: predict leaf
  *   "training": bool
  *     Whether the prediction function is used as part of a training loop. **Not used
  *     for inplace prediction**.
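The JSON configuration documented in this hunk can be assembled from any language that can serialize JSON. A hedged Python sketch, using only the two fields the comment above documents ("type" and "training"); how the string is then passed to the C function is out of scope here:

```python
import json

# Build the JSON configuration string for the C prediction API described
# above. "type" 0 requests a normal prediction; "training" marks whether
# the call is part of a training loop (not used for inplace prediction).
config = {
    "type": 0,
    "training": False,
}
config_str = json.dumps(config)
print(config_str)
```

Keeping the configuration as a plain dict until the last moment makes it easy to validate field values (e.g. that "type" is in the documented 0..6 range) before serializing.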
@@ -773,7 +773,8 @@ XGB_DLL int XGBoosterPredict(BoosterHandle handle,
  * disregarding the use of multi-class model, and leaf prediction will output 4-dim
  * array representing: (n_samples, n_iterations, n_classes, n_trees_in_forest)
  *
- * Run a normal prediction with strict output shape, 2 dim for softprob , 1 dim for others.
+ * Example JSON input for running a normal prediction with strict output shape, 2 dim
+ * for softprob, 1 dim for others.
  * \code
  * {
  *   "type": 0,
|
|||||||
@ -1683,7 +1683,9 @@ class Booster(object):
|
|||||||
iteration_range: Tuple[int, int] = (0, 0),
|
iteration_range: Tuple[int, int] = (0, 0),
|
||||||
strict_shape: bool = False,
|
strict_shape: bool = False,
|
||||||
) -> np.ndarray:
|
) -> np.ndarray:
|
||||||
"""Predict with data.
|
"""Predict with data. The full model will be used unless `iteration_range` is specified,
|
||||||
|
meaning user have to either slice the model or use the ``best_iteration``
|
||||||
|
attribute to get prediction from best model returned from early stopping.
|
||||||
|
|
||||||
.. note::
|
.. note::
|
||||||
|
|
||||||
|
|||||||
@ -794,8 +794,8 @@ class XGBModel(XGBModelBase):
|
|||||||
base_margin: Optional[array_like] = None,
|
base_margin: Optional[array_like] = None,
|
||||||
iteration_range: Optional[Tuple[int, int]] = None,
|
iteration_range: Optional[Tuple[int, int]] = None,
|
||||||
) -> np.ndarray:
|
) -> np.ndarray:
|
||||||
"""
|
"""Predict with `X`. If the model is trained with early stopping, then `best_iteration`
|
||||||
Predict with `X`.
|
is used automatically.
|
||||||
|
|
||||||
.. note:: This function is only thread safe for `gbtree` and `dart`.
|
.. note:: This function is only thread safe for `gbtree` and `dart`.
|
||||||
|
|
||||||
@@ -819,6 +819,7 @@ class XGBModel(XGBModelBase):
             used in this prediction.
 
             .. versionadded:: 1.4.0
 
         Returns
         -------
         prediction
@@ -860,7 +861,8 @@ class XGBModel(XGBModelBase):
         ntree_limit: int = 0,
         iteration_range: Optional[Tuple[int, int]] = None
     ) -> np.ndarray:
-        """Return the predicted leaf every tree for each sample.
+        """Return the predicted leaf of every tree for each sample. If the model is
+        trained with early stopping, then `best_iteration` is used automatically.
 
         Parameters
         ----------
@@ -879,6 +881,7 @@ class XGBModel(XGBModelBase):
             For each datapoint x in X and for each tree, return the index of the
             leaf x ends up in. Leaves are numbered within
             ``[0; 2**(self.max_depth+1))``, possibly with gaps in the numbering.
 
         """
         iteration_range = _convert_ntree_limit(
             self.get_booster(), ntree_limit, iteration_range
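The `_convert_ntree_limit` call in this hunk translates the deprecated tree-based ``ntree_limit`` into an iteration-based ``iteration_range``. A simplified, standalone sketch of that conversion; the real helper inspects the booster for its layout, whereas this version takes ``num_parallel_tree`` directly, which is an assumption for illustration:

```python
# Illustrative sketch: map a legacy tree count onto a half-open iteration
# range. Each boosting iteration adds a layer of num_parallel_tree trees,
# so an iteration count is recovered by integer division.

def convert_ntree_limit(ntree_limit, num_parallel_tree, iteration_range=None):
    if ntree_limit is not None and ntree_limit != 0:
        if iteration_range is not None and iteration_range[1] != 0:
            raise ValueError(
                "Only one of ntree_limit and iteration_range can be used."
            )
        iteration_range = (0, ntree_limit // max(num_parallel_tree, 1))
    return iteration_range

# 8 trees with 4 parallel trees per round correspond to 2 boosting rounds.
print(convert_ntree_limit(8, num_parallel_tree=4))  # -> (0, 2)
```

An explicit ``iteration_range`` wins when ``ntree_limit`` is 0 or unset, which is why callers migrating off the deprecated parameter can pass both during a transition.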