Move skl eval_metric and early_stopping_rounds to model params. (#6751)
A new parameter `custom_metric` is added to `train` and `cv` to distinguish its behaviour from the old `feval`, which is now deprecated. The new `custom_metric` receives transformed predictions when a built-in objective is used. This enables XGBoost to use cost functions from other libraries like scikit-learn directly, without going through the definition of the link function. `eval_metric` and `early_stopping_rounds` in the sklearn interface are moved from `fit` to `__init__` and are now saved as part of the scikit-learn model; the old parameters in `fit` are deprecated. The new `eval_metric` in `__init__` has the same new behaviour as `custom_metric`. Also added more detailed documentation for the behaviour of custom objectives and metrics.
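
For example, a sketch of the new scikit-learn interface usage (dataset and values are illustrative):

.. code-block:: python

    import xgboost as xgb
    from sklearn.datasets import load_diabetes
    from sklearn.metrics import mean_absolute_error

    X, y = load_diabetes(return_X_y=True)
    # The metric and early stopping configuration are now estimator parameters and
    # are saved together with the model, instead of being arguments to `fit`.
    reg = xgb.XGBRegressor(
        tree_method="hist",
        eval_metric=mean_absolute_error,
        early_stopping_rounds=3,
    )
    reg.fit(X, y, eval_set=[(X, y)])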
Custom Objective and Evaluation Metric
######################################

**Contents**

.. contents::
  :backlinks: none
  :local:

********
Overview
********

XGBoost is designed to be an extensible library. One way to extend it is by providing our
own objective function for training and a corresponding metric for performance monitoring.
This document introduces implementing a customized elementwise evaluation metric and
objective for XGBoost. Although the demonstration is performed in Python, the
concepts should be readily applicable to other language bindings.

.. note::

   * The ranking task does not support customized functions.
   * The customized functions defined here are only applicable to single node training.
     Distributed environments require syncing with ``xgboost.rabit``; the interface is
     subject to change and hence beyond the scope of this tutorial.
   * We also plan to improve the interface for multi-class objectives in the future.

In the following two sections, we will provide a step by step walk through of
implementing the ``Squared Log Error (SLE)`` objective function:

.. math::
   \frac{1}{2}[\log(pred + 1) - \log(label + 1)]^2

and its default metric ``Root Mean Squared Log Error (RMSLE)``:

.. math::
   \sqrt{\frac{1}{N}\sum_{i=1}^{N}[\log(pred_i + 1) - \log(label_i + 1)]^2}

Although XGBoost has native support for said functions, using them for demonstration
provides us the opportunity of comparing the results from our own implementation and the
ones from XGBoost internals for learning purposes. After finishing this tutorial, we
should be able to provide our own functions for rapid experiments. And at the end, we
will provide some notes on non-identity link functions along with examples of using a
custom metric and objective with the scikit-learn interface.

*****************************
Customized Objective Function
*****************************
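
A condensed sketch of what this section builds is shown below: the objective returns the
gradient and (approximated) hessian of SLE with respect to the raw prediction, and is
passed to ``xgb.train`` together with the matching ``PyRMSLE`` metric. Here ``dtrain``
and ``dtest`` are assumed to be ``DMatrix`` objects; see the linked demo script for the
complete, annotated version.

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    def squared_log(predt: np.ndarray, dtrain: xgb.DMatrix):
        """Squared Log Error objective.  Returns gradient and hessian w.r.t. predt."""
        y = dtrain.get_label()
        predt[predt < -1] = -1 + 1e-6  # keep log1p(predt) well defined
        grad = (np.log1p(predt) - np.log1p(y)) / (predt + 1)
        hess = (-np.log1p(predt) + np.log1p(y) + 1) / np.power(predt + 1, 2)
        return grad, hess

    def rmsle(predt: np.ndarray, dtrain: xgb.DMatrix):
        """Root Mean Squared Log Error metric."""
        y = dtrain.get_label()
        predt[predt < -1] = -1 + 1e-6
        elements = np.power(np.log1p(y) - np.log1p(predt), 2)
        return 'PyRMSLE', float(np.sqrt(np.sum(elements) / len(y)))

    results = {}
    booster = xgb.train(
        {'tree_method': 'hist', 'disable_default_eval_metric': True},
        dtrain=dtrain,
        num_boost_round=10,
        obj=squared_log,
        # The metric is passed through `custom_metric`; `feval` is deprecated.
        custom_metric=rmsle,
        evals=[(dtrain, 'dtrain'), (dtest, 'dtest')],
        evals_result=results,
    )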

We will be able to see XGBoost printing something like:

.. code-block:: none

    [0] dtrain-PyRMSLE:1.37153   dtest-PyRMSLE:1.31487
    [1] dtrain-PyRMSLE:1.26619   dtest-PyRMSLE:1.20899
    [2] dtrain-PyRMSLE:1.17508   dtest-PyRMSLE:1.11629
    [3] dtrain-PyRMSLE:1.09836   dtest-PyRMSLE:1.03871
    [4] dtrain-PyRMSLE:1.03557   dtest-PyRMSLE:0.977186
    [5] dtrain-PyRMSLE:0.985783  dtest-PyRMSLE:0.93057
    ...

Notice that the parameter ``disable_default_eval_metric`` is used to suppress the default metric
in XGBoost.

For fully reproducible source code and comparison plots, see `custom_rmsle.py
<https://github.com/dmlc/xgboost/tree/master/demo/guide-python/custom_rmsle.py>`_.

*********************
Reverse Link Function
*********************

When using a builtin objective, the raw prediction is transformed according to the
objective function. When a custom objective is provided, XGBoost doesn't know its link
function, so the user is responsible for making the transformation for both the objective
and the custom evaluation metric. For an objective with an identity link like ``squared
error`` this is trivial, but for other link functions like the log link or inverse link
the difference is significant.
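
For instance, with the builtin ``binary:logistic`` objective the raw margin is mapped
through the sigmoid (the reverse link). A minimal sketch, with an illustrative dataset:

.. code-block:: python

    import numpy as np
    import xgboost as xgb
    from sklearn.datasets import make_classification

    X_bin, y_bin = make_classification(random_state=0)
    Xy_bin = xgb.DMatrix(X_bin, y_bin)
    booster = xgb.train({"objective": "binary:logistic"}, Xy_bin, num_boost_round=10)

    raw = booster.predict(Xy_bin, output_margin=True)  # untransformed margin
    prob = booster.predict(Xy_bin)                     # after the reverse link
    # For the logistic link, the reverse link is the sigmoid:
    np.testing.assert_allclose(prob, 1.0 / (1.0 + np.exp(-raw)), rtol=1e-5)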

For the Python package, the behaviour of prediction can be controlled by the
``output_margin`` parameter of the ``predict`` function. When using the ``custom_metric``
parameter without a custom objective, the metric function will receive transformed
predictions since the objective is defined by XGBoost. However, when a custom objective
is also provided along with that metric, then both the objective and the custom metric
will receive raw predictions. The following example provides a comparison between the two
different behaviours with a multi-class classification model. Firstly we define 2
different Python metric functions implementing the same underlying metric for comparison:
`merror_with_transform` is used when a custom objective is also used, otherwise the
simpler `merror` is preferred since XGBoost can perform the transformation itself.

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    def merror_with_transform(predt: np.ndarray, dtrain: xgb.DMatrix):
        """Used when a custom objective is supplied."""
        y = dtrain.get_label()
        n_classes = predt.size // y.shape[0]
        # Like the custom objective, `predt` is the untransformed leaf weight when a
        # custom objective is provided.

        # With the `custom_metric` parameter of the train function, the custom metric
        # receives raw input only when a custom objective is also being used.  Otherwise
        # the custom metric receives transformed predictions.
        assert predt.shape == (dtrain.num_row(), n_classes)
        out = np.zeros(dtrain.num_row())
        for r in range(predt.shape[0]):
            i = np.argmax(predt[r])
            out[r] = i

        assert y.shape == out.shape

        errors = np.zeros(dtrain.num_row())
        errors[y != out] = 1.0
        return 'PyMError', np.sum(errors) / dtrain.num_row()

The above function is only needed when we want to use a custom objective and XGBoost
doesn't know how to transform the prediction. The normal implementation for a multi-class
error function is:

.. code-block:: python

    def merror(predt: np.ndarray, dtrain: xgb.DMatrix):
        """Used when there's no custom objective."""
        y = dtrain.get_label()
        # No need to do the transform ourselves, XGBoost handles it internally.  With
        # the builtin ``multi:softmax`` objective used below, `predt` is already the
        # predicted class.
        errors = np.zeros(dtrain.num_row())
        errors[y != predt] = 1.0
        return 'PyMError', np.sum(errors) / dtrain.num_row()

Next we need the custom softprob objective:

.. code-block:: python

    def softprob_obj(predt: np.ndarray, data: xgb.DMatrix):
        """Loss function.  Computing the gradient and approximated hessian (diagonal).
        Reimplements the `multi:softprob` inside XGBoost.
        """
        # Full implementation is available in the Python demo script linked below.
        ...
        return grad, hess

Lastly we can train the model using the ``obj`` and ``custom_metric`` parameters:

.. code-block:: python

    Xy = xgb.DMatrix(X, y)
    booster = xgb.train(
        {"num_class": kClasses, "disable_default_eval_metric": True},
        Xy,
        num_boost_round=kRounds,
        obj=softprob_obj,
        custom_metric=merror_with_transform,
        evals_result=custom_results,
        evals=[(Xy, "train")],
    )

Or if you don't need the custom objective and just want to supply a metric that's not
available in XGBoost:

.. code-block:: python

    booster = xgb.train(
        {
            "num_class": kClasses,
            "disable_default_eval_metric": True,
            "objective": "multi:softmax",
        },
        Xy,
        num_boost_round=kRounds,
        # Use the simpler metric implementation.
        custom_metric=merror,
        evals_result=custom_results,
        evals=[(Xy, "train")],
    )

We use ``multi:softmax`` to illustrate the difference in transformed predictions. With
``softprob`` the output prediction array has shape ``(n_samples, n_classes)``, while for
``softmax`` it's ``(n_samples, )``. A demo for the multi-class objective function is also
available at `demo/guide-python/custom_softmax.py
<https://github.com/dmlc/xgboost/tree/master/demo/guide-python/custom_softmax.py>`_.
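
The shape difference is easy to inspect; a small sketch using the ``multi:softmax``
booster trained above:

.. code-block:: python

    raw = booster.predict(Xy, output_margin=True)  # raw scores per class
    cls = booster.predict(Xy)                      # transformed: predicted class labels

    assert raw.shape == (Xy.num_row(), kClasses)   # (n_samples, n_classes)
    assert cls.shape == (Xy.num_row(),)            # (n_samples, )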

**********************
Scikit-Learn Interface
**********************

The scikit-learn interface of XGBoost has some utilities to improve the integration with
standard scikit-learn functions. For instance, after XGBoost 1.5.1 users can use cost
functions (not scoring functions) from scikit-learn out of the box:

.. code-block:: python

    from sklearn.datasets import load_diabetes
    from sklearn.metrics import mean_absolute_error

    X, y = load_diabetes(return_X_y=True)
    reg = xgb.XGBRegressor(
        tree_method="hist",
        eval_metric=mean_absolute_error,
    )
    reg.fit(X, y, eval_set=[(X, y)])

Also, for a custom objective function, users can define the objective without having to
access the ``DMatrix``:

.. code-block:: python

    from typing import Tuple

    def softmax(x: np.ndarray) -> np.ndarray:
        """Numerically stable softmax over a single row of raw scores."""
        e = np.exp(x - np.max(x))
        return e / np.sum(e)

    def softprob_obj(labels: np.ndarray, predt: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
        rows = labels.shape[0]
        classes = predt.shape[1]
        grad = np.zeros((rows, classes), dtype=float)
        hess = np.zeros((rows, classes), dtype=float)
        eps = 1e-6
        for r in range(predt.shape[0]):
            target = labels[r]
            p = softmax(predt[r, :])
            for c in range(predt.shape[1]):
                g = p[c] - 1.0 if c == target else p[c]
                h = max((2.0 * p[c] * (1.0 - p[c])).item(), eps)
                grad[r, c] = g
                hess[r, c] = h

        grad = grad.reshape((rows * classes, 1))
        hess = hess.reshape((rows * classes, 1))
        return grad, hess

    clf = xgb.XGBClassifier(tree_method="hist", objective=softprob_obj)
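
As a quick, illustrative smoke test for the objective (the synthetic dataset and its
parameters are assumptions for demonstration only):

.. code-block:: python

    from sklearn.datasets import make_classification

    X, y = make_classification(
        n_samples=128, n_features=16, n_informative=8, n_classes=4, random_state=0
    )
    clf = xgb.XGBClassifier(tree_method="hist", objective=softprob_obj, n_estimators=10)
    clf.fit(X, y)
    pred = clf.predict(X)  # class labels, shape (n_samples, )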