Move skl `eval_metric` and `early_stopping_rounds` to model params. (#6751)

A new parameter `custom_metric` is added to `train` and `cv` to distinguish its behaviour from the old `feval`, which is now deprecated.  The new `custom_metric` receives transformed predictions when a built-in objective is used.  This enables XGBoost to use cost functions from other libraries like scikit-learn directly, without going through the definition of the link function.

`eval_metric` and `early_stopping_rounds` in the sklearn interface are moved from `fit` to `__init__` and are now saved as part of the scikit-learn model.  The old parameters in the `fit` function are now deprecated.  The new `eval_metric` in `__init__` has the same new behaviour as `custom_metric`.

Added more detailed documentation for the behaviour of custom objectives and metrics.
Jiaming Yuan
2021-10-28 17:20:20 +08:00
committed by GitHub
parent 6b074add66
commit 45aef75cca
13 changed files with 685 additions and 190 deletions


@@ -2,6 +2,16 @@
Custom Objective and Evaluation Metric
######################################
**Contents**
.. contents::
  :backlinks: none
  :local:
********
Overview
********
XGBoost is designed to be an extensible library. One way to extend it is by providing our
own objective function for training and corresponding metric for performance monitoring.
This document introduces implementing a customized elementwise evaluation metric and
@@ -11,12 +21,8 @@ concepts should be readily applicable to other language bindings.
.. note::
* The ranking task does not support customized functions.
* The customized functions defined here are only applicable to single node training.
Distributed environment requires syncing with ``xgboost.rabit``, the interface is
subject to change hence beyond the scope of this tutorial.
* We also plan to improve the interface for multi-classes objective in the future.
In the following two sections, we will provide a step by step walk through of implementing
``Squared Log Error (SLE)`` objective function:

.. math::
   \frac{1}{2}[\log(pred + 1) - \log(label + 1)]^2
@@ -30,7 +36,10 @@ and its default metric ``Root Mean Squared Log Error(RMSLE)``:
Although XGBoost has native support for said functions, using it for demonstration
provides us the opportunity of comparing the result from our own implementation and the
one from XGBoost internal for learning purposes. After finishing this tutorial, we should
be able to provide our own functions for rapid experiments. And at the end, we will
provide some notes on non-identity link functions, along with examples of using custom
metric and objective with the `scikit-learn` interface.
*****************************
Customized Objective Function
@@ -125,12 +134,12 @@ We will be able to see XGBoost printing something like:
.. code-block:: none
[0] dtrain-PyRMSLE:1.37153 dtest-PyRMSLE:1.31487
[1] dtrain-PyRMSLE:1.26619 dtest-PyRMSLE:1.20899
[2] dtrain-PyRMSLE:1.17508 dtest-PyRMSLE:1.11629
[3] dtrain-PyRMSLE:1.09836 dtest-PyRMSLE:1.03871
[4] dtrain-PyRMSLE:1.03557 dtest-PyRMSLE:0.977186
[5] dtrain-PyRMSLE:0.985783 dtest-PyRMSLE:0.93057
...
Notice that the parameter ``disable_default_eval_metric`` is used to suppress the default metric
@@ -138,11 +147,164 @@ in XGBoost.
For fully reproducible source code and comparison plots, see `custom_rmsle.py <https://github.com/dmlc/xgboost/tree/master/demo/guide-python/custom_rmsle.py>`_.
*********************
Reverse Link Function
*********************
When using a builtin objective, the raw prediction is transformed according to the
objective function. When a custom objective is provided, XGBoost doesn't know its link
function, so the user is responsible for making the transformation for both the objective
and the custom evaluation metric. For objectives with an identity link like ``squared
error`` this is trivial, but for other link functions like the log link or inverse link
the difference is significant.
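The two behaviours can be inspected directly through ``predict`` (an illustrative sketch,
not part of the original demo; ``booster`` and ``m`` are assumed from the training example
below):

.. code-block:: python

    # With a built-in objective such as ``multi:softprob``:
    booster.predict(m)                       # transformed output: class probabilities
    booster.predict(m, output_margin=True)   # raw, untransformed margin scores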
For the Python package, the behaviour of the prediction can be controlled by the
``output_margin`` parameter of the ``predict`` function. When using the ``custom_metric``
parameter without a custom objective, the metric function will receive the transformed
prediction, since the objective is defined by XGBoost. However, when a custom objective is
also provided along with that metric, then both the objective and the custom metric will
receive the raw prediction. The following example provides a comparison between the two
behaviours with a multi-class classification model. First, we define two Python metric
functions implementing the same underlying metric for comparison:
`merror_with_transform` is used when a custom objective is also supplied; otherwise the
simpler `merror` is preferred, since XGBoost can perform the transformation itself.
.. code-block:: python
    import xgboost as xgb
    import numpy as np

    def merror_with_transform(predt: np.ndarray, dtrain: xgb.DMatrix):
        """Used when a custom objective is supplied."""
        y = dtrain.get_label()
        n_classes = predt.size // y.shape[0]
        # Like the custom objective, predt is the untransformed leaf weight when a
        # custom objective is provided.  With the `custom_metric` parameter of the
        # train function, the custom metric receives the raw input only when a
        # custom objective is also being used; otherwise it receives the
        # transformed prediction.
        assert predt.shape == (dtrain.num_row(), n_classes)
        out = np.zeros(dtrain.num_row())
        for r in range(predt.shape[0]):
            i = np.argmax(predt[r])
            out[r] = i
        assert y.shape == out.shape

        errors = np.zeros(dtrain.num_row())
        errors[y != out] = 1.0
        return 'PyMError', np.sum(errors) / dtrain.num_row()
The above function is only needed when we want to use a custom objective, in which case
XGBoost doesn't know how to transform the prediction itself. The normal implementation of
a multi-class error function is:
.. code-block:: python
    def merror(predt: np.ndarray, dtrain: xgb.DMatrix):
        """Used when there's no custom objective."""
        y = dtrain.get_label()
        # No need to do the transform ourselves: with a builtin objective like
        # ``multi:softmax``, XGBoost handles it internally and predt already
        # contains the predicted class index for each row.
        errors = np.zeros(dtrain.num_row())
        errors[y != predt] = 1.0
        return 'PyMError', np.sum(errors) / dtrain.num_row()
Next we need the custom softprob objective:
.. code-block:: python
    def softprob_obj(predt: np.ndarray, data: xgb.DMatrix):
        """Loss function.  Computing the gradient and approximated hessian (diagonal).
        Reimplements the `multi:softprob` inside XGBoost.
        """
        # The full implementation is available in the Python demo script linked below.
        ...
        return grad, hess
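For reference, a minimal sketch of what the elided body computes (an outline that assumes
SciPy's ``softmax`` is available; the demo script remains the complete version):

.. code-block:: python

    import numpy as np
    from scipy.special import softmax  # assumption: SciPy is available
    import xgboost as xgb

    def softprob_obj(predt: np.ndarray, data: xgb.DMatrix):
        """Sketch of `multi:softprob`: gradient and diagonal hessian approximation."""
        y = data.get_label().astype(np.int64)
        # predt holds the raw, untransformed margins, shape (n_samples, n_classes).
        p = softmax(predt, axis=1)
        grad = p.copy()
        grad[np.arange(y.shape[0]), y] -= 1.0          # gradient: p - onehot(y)
        hess = np.maximum(2.0 * p * (1.0 - p), 1e-6)   # approximated diagonal hessian
        return grad.reshape(-1), hess.reshape(-1)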
Lastly, we can train the model using the ``obj`` and ``custom_metric`` parameters:
.. code-block:: python
    # ``X``, ``y``, ``kClasses`` and ``kRounds`` are defined in the full demo script.
    m = xgb.DMatrix(X, y)
    custom_results = {}
    booster = xgb.train(
        {"num_class": kClasses, "disable_default_eval_metric": True},
        m,
        num_boost_round=kRounds,
        obj=softprob_obj,
        custom_metric=merror_with_transform,
        evals_result=custom_results,
        evals=[(m, "train")],
    )
Or, if you don't need a custom objective and just want to supply a metric that's not
available in XGBoost:
.. code-block:: python
    booster = xgb.train(
        {
            "num_class": kClasses,
            "disable_default_eval_metric": True,
            "objective": "multi:softmax",
        },
        m,
        num_boost_round=kRounds,
        # Use the simpler metric implementation.
        custom_metric=merror,
        evals_result=custom_results,
        evals=[(m, "train")],
    )
We use ``multi:softmax`` to illustrate the difference in transformed predictions: with
``softprob`` the output prediction array has shape ``(n_samples, n_classes)``, while for
``softmax`` it's ``(n_samples, )``. A demo for the multi-class objective function is also
available at `demo/guide-python/custom_softmax.py
<https://github.com/dmlc/xgboost/tree/master/demo/guide-python/custom_softmax.py>`_.
**********************
Scikit-Learn Interface
**********************
The scikit-learn interface of XGBoost has some utilities to improve the integration with
standard scikit-learn functions. For instance, after XGBoost 1.5.1 users can use cost
functions (not scoring functions) from scikit-learn out of the box:
.. code-block:: python
    from sklearn.datasets import load_diabetes
    from sklearn.metrics import mean_absolute_error

    X, y = load_diabetes(return_X_y=True)
    reg = xgb.XGBRegressor(
        tree_method="hist",
        eval_metric=mean_absolute_error,
    )
    reg.fit(X, y, eval_set=[(X, y)])
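Since ``eval_metric`` now lives on the estimator, it can be combined with the likewise
moved ``early_stopping_rounds`` parameter; a minimal sketch, assuming a held-out split
named ``X_valid``/``y_valid`` (placeholder names, not from the original text):

.. code-block:: python

    reg = xgb.XGBRegressor(
        tree_method="hist",
        eval_metric=mean_absolute_error,
        early_stopping_rounds=5,  # both parameters are saved with the model
    )
    reg.fit(X, y, eval_set=[(X_valid, y_valid)])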
Also, for custom objective functions, users can define the objective without having to
access ``DMatrix``:
.. code-block:: python
    from typing import Tuple

    import numpy as np
    from scipy.special import softmax  # any equivalent softmax implementation works

    def softprob_obj(labels: np.ndarray, predt: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
        # The number of classes is inferred from the prediction shape
        # (n_samples, n_classes).
        classes = predt.shape[1]
        rows = labels.shape[0]
        grad = np.zeros((rows, classes), dtype=float)
        hess = np.zeros((rows, classes), dtype=float)
        eps = 1e-6
        for r in range(predt.shape[0]):
            target = labels[r]
            p = softmax(predt[r, :])
            for c in range(predt.shape[1]):
                g = p[c] - 1.0 if c == target else p[c]
                h = max((2.0 * p[c] * (1.0 - p[c])).item(), eps)
                grad[r, c] = g
                hess[r, c] = h

        grad = grad.reshape((rows * classes, 1))
        hess = hess.reshape((rows * classes, 1))
        return grad, hess

    clf = xgb.XGBClassifier(tree_method="hist", objective=softprob_obj)
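A hypothetical usage of the resulting estimator (``load_iris`` is just a stand-in
multi-class dataset, not from the original text):

.. code-block:: python

    from sklearn.datasets import load_iris

    X, y = load_iris(return_X_y=True)
    clf.fit(X, y)  # trains with the custom softprob objective defined above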