[breaking] Remove the predictor param, allow fallback to prediction using DMatrix. (#9129)
- A `DeviceOrd` struct is implemented to indicate the device. It will eventually replace the `gpu_id` parameter. - The `predictor` parameter is removed. - Fallback to `DMatrix` when `inplace_predict` is not available. - The heuristic for choosing a predictor is only used during training.
This commit is contained in:
@@ -45,7 +45,7 @@ XGBoost makes use of `GPUTreeShap <https://github.com/rapidsai/gputreeshap>`_ as
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
model.set_param({"predictor": "gpu_predictor"})
|
||||
model.set_param({"gpu_id": "0", "tree_method": "gpu_hist"})
|
||||
shap_values = model.predict(dtrain, pred_contribs=True)
|
||||
shap_interaction_values = model.predict(dtrain, pred_interactions=True)
|
||||
|
||||
|
||||
@@ -199,18 +199,6 @@ Parameters for Tree Booster
|
||||
- Maximum number of discrete bins to bucket continuous features.
|
||||
- Increasing this number improves the optimality of splits at the cost of higher computation time.
|
||||
|
||||
* ``predictor``, [default= ``auto``]
|
||||
|
||||
- The type of predictor algorithm to use. Provides the same results but allows the use of GPU or CPU.
|
||||
|
||||
- ``auto``: Configure predictor based on heuristics.
|
||||
- ``cpu_predictor``: Multicore CPU prediction algorithm.
|
||||
- ``gpu_predictor``: Prediction using GPU. Used when ``tree_method`` is ``gpu_hist``.
|
||||
When ``predictor`` is set to default value ``auto``, the ``gpu_hist`` tree method is
|
||||
able to provide GPU based prediction without copying training data to GPU memory.
|
||||
If ``gpu_predictor`` is explicitly specified, then all data is copied into GPU, only
|
||||
recommended for performing prediction tasks.
|
||||
|
||||
* ``num_parallel_tree``, [default=1]
|
||||
|
||||
- Number of parallel trees constructed during each iteration. This option is used to support boosted random forest.
|
||||
|
||||
@@ -87,15 +87,6 @@ with the native Python interface :py:meth:`xgboost.Booster.predict` and
|
||||
behavior. Also the ``save_best`` parameter from :py:obj:`xgboost.callback.EarlyStopping`
|
||||
might be useful.
|
||||
|
||||
*********
|
||||
Predictor
|
||||
*********
|
||||
|
||||
There are 2 predictors in XGBoost (3 if you have the one-api plugin enabled), namely
|
||||
``cpu_predictor`` and ``gpu_predictor``. The default option is ``auto`` so that XGBoost
|
||||
can employ some heuristics for saving GPU memory during training. They might have slight
|
||||
different outputs due to floating point errors.
|
||||
|
||||
|
||||
***********
|
||||
Base Margin
|
||||
@@ -134,15 +125,6 @@ it. Be aware that the output of in-place prediction depends on input data type,
|
||||
input is on GPU data output is :py:obj:`cupy.ndarray`, otherwise a :py:obj:`numpy.ndarray`
|
||||
is returned.
|
||||
|
||||
****************
|
||||
Categorical Data
|
||||
****************
|
||||
|
||||
Other than users performing encoding, XGBoost has experimental support for categorical
|
||||
data using ``gpu_hist`` and ``gpu_predictor``. No special operation needs to be done on
|
||||
input test data since the information about categories is encoded into the model during
|
||||
training.
|
||||
|
||||
*************
|
||||
Thread Safety
|
||||
*************
|
||||
@@ -159,7 +141,6 @@ instance we might accidentally call ``clf.set_params()`` inside a predict functi
|
||||
|
||||
def predict_fn(clf: xgb.XGBClassifier, X):
|
||||
X = preprocess(X)
|
||||
clf.set_params(predictor="gpu_predictor") # NOT safe!
|
||||
clf.set_params(n_jobs=1) # NOT safe!
|
||||
return clf.predict_proba(X, iteration_range=(0, 10))
|
||||
|
||||
|
||||
@@ -148,8 +148,8 @@ Also for inplace prediction:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
booster.set_param({'predictor': 'gpu_predictor'})
|
||||
# where X is a dask DataFrame or dask Array containing cupy or cuDF backed data.
|
||||
# where X is a dask DataFrame or dask Array backed by cupy or cuDF.
|
||||
booster.set_param({"gpu_id": "0"})
|
||||
prediction = xgb.dask.inplace_predict(client, booster, X)
|
||||
|
||||
When input is ``da.Array`` object, output is always ``da.Array``. However, if the input
|
||||
|
||||
@@ -173,7 +173,6 @@ Will print out something similar to (not actual output as it's too long for demo
|
||||
"gradient_booster": {
|
||||
"gbtree_train_param": {
|
||||
"num_parallel_tree": "1",
|
||||
"predictor": "gpu_predictor",
|
||||
"process_type": "default",
|
||||
"tree_method": "gpu_hist",
|
||||
"updater": "grow_gpu_hist",
|
||||
|
||||
Reference in New Issue
Block a user