[doc] Clarify prediction function. (#6813)
@@ -22,6 +22,7 @@ Contents
    XGBoost User Forum <https://discuss.xgboost.ai>
    GPU support <gpu/index>
    parameter
+   prediction
    treemethod
    Python package <python/index>
    R package <R-package/index>

148  doc/prediction.rst  Normal file

@@ -0,0 +1,148 @@
.. _predict_api:

##########
Prediction
##########

There are a number of prediction functions in XGBoost with various parameters. This
document attempts to clarify some of the confusion around prediction, with a focus on
the Python binding.

******************
Prediction Options
******************

There are a number of different prediction options for the
:py:meth:`xgboost.Booster.predict` method, ranging from ``pred_contribs`` to
``pred_leaf``. The output shape depends on the type of prediction. Also, for
multi-class classification problems, XGBoost builds one tree for each class, and the
trees for one class are called a "group" of trees, so the output dimension may change
with the model being used. In the 1.4 release, we added a new parameter called
``strict_shape``; one can set it to ``True`` to indicate that a more restricted output
is desired. Assuming you are using :py:obj:`xgboost.Booster`, here is a list of
possible returns (a short sketch for checking these shapes follows the list):

- When using normal prediction with ``strict_shape`` set to ``True``:

  Output is a 2-dim array with the first dimension as rows and the second as groups.
  For regression/survival/ranking/binary classification this is equivalent to a column
  vector with ``shape[1] == 1``. But for multi-class with ``multi:softprob`` the
  number of columns equals the number of classes. If ``strict_shape`` is set to
  ``False``, XGBoost might output a 1- or 2-dim array.

- When using ``output_margin`` to avoid transformation and ``strict_shape`` is set to
  ``True``:

  Similar to the previous case, output is a 2-dim array, except that ``multi:softmax``
  produces output equivalent to ``multi:softprob`` due to the dropped transformation.
  If strict shape is set to ``False``, the output can have 1 or 2 dims depending on
  the model.

- When using ``pred_contribs`` with ``strict_shape`` set to ``True``:

  Output is a 3-dim array with ``(rows, groups, columns + 1)`` as the shape. Whether
  ``approx_contribs`` is used does not change the output shape. If the strict shape
  parameter is not set, the output can be a 2- or 3-dim array depending on whether a
  multi-class model is being used.

- When using ``pred_interactions`` with ``strict_shape`` set to ``True``:

  Output is a 4-dim array with ``(rows, groups, columns + 1, columns + 1)`` as the
  shape. Like the predict contribution case, whether ``approx_contribs`` is used does
  not change the output shape. If strict shape is set to ``False``, it can have 3 or 4
  dims depending on the underlying model.

- When using ``pred_leaf`` with ``strict_shape`` set to ``True``:

  Output is a 4-dim array with ``(n_samples, n_iterations, n_classes, n_trees_in_forest)``
  as the shape. ``n_trees_in_forest`` is specified by the ``num_parallel_tree``
  parameter during training. When strict shape is set to ``False``, the output is a
  2-dim array with the last 3 dims concatenated into 1. When using the ``apply``
  method in the scikit-learn interface, this is set to ``False`` by default.
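
Here is a minimal sketch for inspecting the shapes described above. The dataset is
synthetic, and the printed shapes assume a 3-class ``multi:softprob`` model trained
for 4 rounds:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    # Small synthetic 3-class dataset.
    X = np.random.randn(100, 10)
    y = np.random.randint(0, 3, size=100)
    Xy = xgb.DMatrix(X, label=y)

    params = {"objective": "multi:softprob", "num_class": 3}
    booster = xgb.train(params, Xy, num_boost_round=4)

    # Normal prediction: (rows, groups).
    print(booster.predict(Xy, strict_shape=True).shape)  # (100, 3)
    # Feature contributions: (rows, groups, columns + 1).
    print(booster.predict(Xy, pred_contribs=True, strict_shape=True).shape)  # (100, 3, 11)
    # Leaf indices: (n_samples, n_iterations, n_classes, n_trees_in_forest).
    print(booster.predict(Xy, pred_leaf=True, strict_shape=True).shape)  # (100, 4, 3, 1)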

Other than these prediction types, there is also a parameter called ``iteration_range``,
which is similar to model slicing. But instead of actually splitting the model into
multiple stacks, it simply returns the prediction formed by the trees within the range.
The number of trees created in each iteration equals :math:`trees_i = num\_class \times
num\_parallel\_tree`. So if you are training a boosted random forest with a forest
size of 4 on a 3-class classification dataset, and want to use the first 2 iterations
of trees for prediction, you need to provide ``iteration_range=(0, 2)``. Then the
first :math:`2 \times 3 \times 4` trees will be used in this prediction.
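
The scenario above translates into code roughly as follows; the parameter values are
chosen only to match the arithmetic in the previous paragraph:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    X = np.random.randn(100, 10)
    y = np.random.randint(0, 3, size=100)
    Xy = xgb.DMatrix(X, label=y)

    # Boosted random forest: 4 parallel trees per class in each iteration, 3 classes.
    params = {"objective": "multi:softprob", "num_class": 3, "num_parallel_tree": 4}
    booster = xgb.train(params, Xy, num_boost_round=5)

    # Use only the first 2 iterations, i.e. the first 2 * 3 * 4 = 24 trees.
    predt = booster.predict(Xy, iteration_range=(0, 2))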

*********
Predictor
*********

There are 2 predictors in XGBoost (3 if you have the one-api plugin enabled), namely
``cpu_predictor`` and ``gpu_predictor``. The default option is ``auto``, which lets
XGBoost employ some heuristics for saving GPU memory during training. The predictors
might have slightly different outputs due to floating point errors.
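
To choose one explicitly, set the ``predictor`` parameter; a minimal sketch:

.. code-block:: python

    import xgboost as xgb

    # scikit-learn interface.
    clf = xgb.XGBClassifier(predictor="gpu_predictor")

    # Native interface: pass it with the other training parameters, e.g.
    # booster = xgb.train({"predictor": "cpu_predictor"}, Xy)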

***********
Base Margin
***********

There is a training parameter in XGBoost called ``base_score``, and a piece of
metadata for ``DMatrix`` called ``base_margin`` (which can be set in the ``fit``
method if you are using the scikit-learn interface). They specify the global bias for
the boosted model. If the latter is supplied, then the former is ignored.
``base_margin`` can be used to train an XGBoost model based on other models. See the
demos on boosting from predictions.
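
A minimal sketch of supplying a per-sample bias, assuming the margin comes from some
previous model (the binary objective here is only for illustration):

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    X = np.random.randn(100, 10)
    y = np.random.randint(0, 2, size=100)

    # Untransformed (margin) output of a previous model, one value per sample.
    prior_margin = np.random.randn(100)

    Xy = xgb.DMatrix(X, label=y)
    Xy.set_base_margin(prior_margin)  # overrides base_score when supplied
    booster = xgb.train({"objective": "binary:logistic"}, Xy, num_boost_round=10)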

*****************
Staged Prediction
*****************

Using the native interface with ``DMatrix``, prediction can be staged (or cached). For
example, one can first predict on the first 4 trees then run prediction on 8 trees.
After running the first prediction, the result from the first 4 trees is cached, so
when you run the prediction with 8 trees XGBoost can reuse the result from the
previous prediction. The cache expires automatically upon the next prediction,
training, or evaluation call, or when the cached ``DMatrix`` object expires (for
example by going out of scope and being collected by the garbage collector in your
language environment).
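
A minimal sketch of the staging described above:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    X = np.random.randn(100, 10)
    y = np.random.randn(100)
    Xy = xgb.DMatrix(X, label=y)
    booster = xgb.train({"objective": "reg:squarederror"}, Xy, num_boost_round=8)

    # Computes and caches the result from the first 4 trees.
    predt_4 = booster.predict(Xy, iteration_range=(0, 4))
    # Can reuse the cached partial result instead of starting from scratch.
    predt_8 = booster.predict(Xy, iteration_range=(0, 8))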

*******************
In-place Prediction
*******************

Traditionally XGBoost accepts only ``DMatrix`` for prediction; with wrappers like the
scikit-learn interface, the construction happens internally. We added support for
in-place predict to bypass the construction of ``DMatrix``, which is slow and memory
consuming. The new predict function has limited features but is often sufficient for
simple inference tasks. It accepts some commonly found data types in Python like
:py:obj:`numpy.ndarray`, :py:obj:`scipy.sparse.csr_matrix` and :py:obj:`cudf.DataFrame`
instead of :py:obj:`xgboost.DMatrix`. You can call
:py:meth:`xgboost.Booster.inplace_predict` to use it. Be aware that the output of
in-place prediction depends on the input data type: when the input is on GPU, the
output is a :py:obj:`cupy.ndarray`; otherwise a :py:obj:`numpy.ndarray` is returned.
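
A minimal sketch with CPU data:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    X = np.random.randn(100, 10)
    y = np.random.randn(100)
    booster = xgb.train({"objective": "reg:squarederror"}, xgb.DMatrix(X, label=y))

    # No DMatrix is constructed here; the input is a numpy.ndarray, so the
    # output is a numpy.ndarray as well.
    predt = booster.inplace_predict(X)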

****************
Categorical Data
****************

Other than users performing encoding themselves, XGBoost has experimental support for
categorical data using ``gpu_hist`` and ``gpu_predictor``. No special operation needs
to be done on input test data since the information about categories is encoded into
the model during training.
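
A minimal sketch, assuming a pandas ``category`` column and the experimental
``enable_categorical`` flag on ``DMatrix``:

.. code-block:: python

    import pandas as pd
    import xgboost as xgb

    X = pd.DataFrame(
        {"f0": pd.Categorical(["a", "b", "a", "c"]), "f1": [1.0, 2.0, 3.0, 4.0]}
    )
    y = [0.0, 1.0, 0.0, 1.0]

    Xy = xgb.DMatrix(X, label=y, enable_categorical=True)
    params = {"tree_method": "gpu_hist", "predictor": "gpu_predictor"}
    booster = xgb.train(params, Xy, num_boost_round=4)

    # No extra encoding is needed at prediction time.
    predt = booster.predict(xgb.DMatrix(X, enable_categorical=True))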

*************
Thread Safety
*************

After the 1.4 release, all prediction functions, including normal ``predict`` with
various parameters like SHAP value computation, as well as ``inplace_predict``, are
thread safe when the underlying booster is ``gbtree`` or ``dart``. This means that as
long as a tree model is used, prediction itself should be thread safe. But the safety
is only guaranteed for prediction. If one tries to train a model in one thread and run
prediction in another using the same model, the behaviour is undefined. This happens
more easily than one might expect; for instance, we might accidentally call
``clf.set_params()`` inside a predict function:

.. code-block:: python

    from concurrent.futures import ThreadPoolExecutor

    import xgboost as xgb

    def predict_fn(clf: xgb.XGBClassifier, X):
        X = preprocess(X)
        clf.set_params(predictor="gpu_predictor")  # NOT safe!
        clf.set_params(n_jobs=1)  # NOT safe!
        return clf.predict_proba(X, iteration_range=(0, 10))

    with ThreadPoolExecutor(max_workers=10) as e:
        e.submit(predict_fn, ...)
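
A safer arrangement under the same assumptions (``preprocess`` remains a hypothetical
helper) is to finish all model configuration before the threads start:

.. code-block:: python

    # Safe: mutate the model once, up front, then only predict inside the threads.
    clf.set_params(predictor="gpu_predictor", n_jobs=1)

    def predict_fn(clf: xgb.XGBClassifier, X):
        X = preprocess(X)
        return clf.predict_proba(X, iteration_range=(0, 10))

    with ThreadPoolExecutor(max_workers=10) as e:
        e.submit(predict_fn, ...)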

@@ -1616,12 +1616,11 @@ class Booster(object):
     ) -> np.ndarray:
         """Predict with data.
 
-        .. note:: This function is not thread safe except for ``gbtree`` booster.
+        .. note::
 
-          When using booster other than ``gbtree``, predict can only be called from one
-          thread. If you want to run prediction using multiple thread, call
-          :py:meth:`xgboost.Booster.copy` to make copies of model object and then call
-          ``predict()``.
+            See `Prediction
+            <https://xgboost.readthedocs.io/en/latest/tutorials/prediction.html>`_
+            for issues like thread safety and a summary of outputs from this function.
 
         Parameters
         ----------
@@ -1665,8 +1664,11 @@ class Booster(object):
             feature_names are the same.
 
         training :
-            Whether the prediction value is used for training. This can effect
-            `dart` booster, which performs dropouts during training iterations.
+            Whether the prediction value is used for training. This can affect the
+            `dart` booster, which performs dropouts during training iterations but
+            uses all trees for inference. If you want to obtain the result with
+            dropouts, set this parameter to `True`. Also, the parameter is set to
+            ``True`` when obtaining prediction for a custom objective function.
 
             .. versionadded:: 1.0.0
 
@@ -1686,12 +1688,6 @@ class Booster(object):
 
             .. versionadded:: 1.4.0
 
-        .. note:: Using ``predict()`` with DART booster
-
-          If the booster object is DART type, ``predict()`` will not perform
-          dropouts, i.e. all the trees will be evaluated. If you want to
-          obtain result with dropouts, provide `training=True`.
-
         Returns
         -------
         prediction : numpy array
@@ -1916,11 +1912,9 @@ class Booster(object):
         The model is saved in an XGBoost internal format which is universal among the
         various XGBoost interfaces. Auxiliary attributes of the Python Booster object
         (such as feature_names) will not be saved when using binary format. To save those
-        attributes, use JSON instead. See:
-
-        https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html
-
-        for more info.
+        attributes, use JSON instead. See: `Model IO
+        <https://xgboost.readthedocs.io/en/stable/tutorials/saving_model.html>`_ for more
+        info.
 
         Parameters
         ----------
@@ -1956,11 +1950,9 @@ class Booster(object):
         The model is loaded from XGBoost format which is universal among the various
         XGBoost interfaces. Auxiliary attributes of the Python Booster object (such as
         feature_names) will not be loaded when using binary format. To save those
-        attributes, use JSON instead. See:
-
-        https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html
-
-        for more info.
+        attributes, use JSON instead. See: `Model IO
+        <https://xgboost.readthedocs.io/en/stable/tutorials/saving_model.html>`_ for more
+        info.
 
         Parameters
         ----------