Update documents. (#6856)
* Add early stopping section to prediction doc.
* Remove best_ntree_limit.
* Better doxygen output.
@@ -67,6 +67,18 @@ the 3-class classification dataset, and want to use the first 2 iterations of tr
prediction, you need to provide ``iteration_range=(0, 2)``. Then the first :math:`2
\times 3 \times 4` trees will be used in this prediction.

**************
Early Stopping
**************

When a model is trained with early stopping, there is an inconsistency between the
native Python interface and the sklearn/R interfaces. By default, the R and sklearn
interfaces use ``best_iteration`` automatically, so predictions come from the best
model. The native Python interface :py:meth:`xgboost.Booster.predict` and
:py:meth:`xgboost.Booster.inplace_predict`, however, use the full model. Users can
pass the ``best_iteration`` attribute to the ``iteration_range`` parameter to achieve
the same behavior. The ``save_best`` parameter of
:py:obj:`xgboost.callback.EarlyStopping` may also be useful.

*********
Predictor

@@ -183,7 +183,7 @@ Early stopping requires at least one set in ``evals``. If there's more than one,

The model will train until the validation score stops improving. Validation error needs to decrease at least once every ``early_stopping_rounds`` rounds to continue training.

-If early stopping occurs, the model will have three additional fields: ``bst.best_score``, ``bst.best_iteration`` and ``bst.best_ntree_limit``. Note that :py:meth:`xgboost.train` will return a model from the last iteration, not the best one.
+If early stopping occurs, the model will have two additional fields: ``bst.best_score`` and ``bst.best_iteration``. Note that :py:meth:`xgboost.train` will return a model from the last iteration, not the best one.

This works with both metrics to minimize (RMSE, log loss, etc.) and to maximize (MAP, NDCG, AUC). Note that if you specify more than one evaluation metric, the last one in ``param['eval_metric']`` is used for early stopping.

@@ -198,11 +198,11 @@ A model that has been trained or loaded can perform predictions on data sets.

  dtest = xgb.DMatrix(data)
  ypred = bst.predict(dtest)

-If early stopping is enabled during training, you can get predictions from the best iteration with ``bst.best_ntree_limit``:
+If early stopping is enabled during training, you can get predictions from the best iteration with ``bst.best_iteration``:

.. code-block:: python

-  ypred = bst.predict(dtest, ntree_limit=bst.best_ntree_limit)
+  ypred = bst.predict(dtest, iteration_range=(0, bst.best_iteration + 1))

Plotting
--------

@@ -176,7 +176,7 @@ One simple optimization for running consecutive predictions is using

    shap_f = xgb.dask.predict(client, booster_f, X, pred_contribs=True)
    futures.append(shap_f)

-results = client.gather(futures)
+results = client.gather(futures)

This is only available on the functional interface, as the Scikit-Learn wrapper doesn't know