Handle the new device parameter in dask and demos. (#9386)

* Handle the new `device` parameter in dask and demos.

- Check no ordinal is specified in the dask interface.
- Update demos.
- Update dask doc.
- Update the condition for QDM.
@@ -14,30 +14,24 @@ Most of the algorithms in XGBoost including training, prediction and evaluation

 Usage
 =====
-Specify the ``tree_method`` parameter as ``gpu_hist``. For details around the ``tree_method`` parameter, see :doc:`tree method </treemethod>`.
-
-Supported parameters
---------------------
-
-GPU accelerated prediction is enabled by default for the above mentioned ``tree_method`` parameters but can be switched to CPU prediction by setting ``predictor`` to ``cpu_predictor``. This could be useful if you want to conserve GPU memory. Likewise when using CPU algorithms, GPU accelerated prediction can be enabled by setting ``predictor`` to ``gpu_predictor``.
-
-The device ordinal (which GPU to use if you have many of them) can be selected using the
-``device`` parameter, which defaults to 0 when "CUDA" is specified(the first device reported by CUDA
-runtime).

+To enable GPU acceleration, specify the ``device`` parameter as ``cuda``. In addition, the device ordinal (which GPU to use if you have multiple devices in the same node) can be specified using the ``cuda:<ordinal>`` syntax, where ``<ordinal>`` is an integer that represents the device ordinal. XGBoost defaults to 0 (the first device reported by CUDA runtime).

 The GPU algorithms currently work with CLI, Python, R, and JVM packages. See :doc:`/install` for details.

 .. code-block:: python
   :caption: Python example

-  param["device"] = "cuda:0"
-  param['tree_method'] = 'gpu_hist'
+  params = dict()
+  params["device"] = "cuda:0"
+  params["tree_method"] = "hist"
+  Xy = xgboost.QuantileDMatrix(X, y)
+  xgboost.train(params, Xy)

 .. code-block:: python
   :caption: With Scikit-Learn interface

-  XGBRegressor(tree_method='gpu_hist', device="cuda")
+  XGBRegressor(tree_method="hist", device="cuda")


 GPU-Accelerated SHAP values
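The parameter change in this hunk (``gpu_hist`` replaced by ``hist`` together with ``device``) can be sketched as a small migration helper. This is a minimal illustration under stated assumptions, not XGBoost code: ``migrate_gpu_params`` is a hypothetical name, and it covers only the ``tree_method`` rename visible in the hunk.

```python
def migrate_gpu_params(params: dict) -> dict:
    """Return a copy of ``params`` that uses ``device`` instead of ``gpu_hist``.

    Hypothetical helper for illustration only; it is not part of XGBoost.
    """
    new_params = dict(params)  # leave the caller's dictionary untouched
    if new_params.get("tree_method") == "gpu_hist":
        # The GPU variant is now selected through ``device``, not the method name.
        new_params["tree_method"] = "hist"
        # Keep an explicit device (for example "cuda:0") if one was already given.
        new_params.setdefault("device", "cuda")
    return new_params


old_style = {"tree_method": "gpu_hist", "max_depth": 6}
new_style = migrate_gpu_params(old_style)
print(new_style)
```

The helper returns a new dictionary, so the same legacy configuration can be migrated repeatedly without side effects.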
@@ -46,12 +40,11 @@ XGBoost makes use of `GPUTreeShap <https://github.com/rapidsai/gputreeshap>`_ as

 .. code-block:: python

-  model.set_param({"device": "cuda:0", "tree_method": "gpu_hist"})
-  shap_values = model.predict(dtrain, pred_contribs=True)
+  booster.set_param({"device": "cuda:0"})
+  shap_values = booster.predict(dtrain, pred_contribs=True)
   shap_interaction_values = model.predict(dtrain, pred_interactions=True)

-See examples `here
-<https://github.com/dmlc/xgboost/tree/master/demo/gpu_acceleration>`__.
+See examples `here <https://github.com/dmlc/xgboost/tree/master/demo/gpu_acceleration>`__.

 Multi-node Multi-GPU Training
 =============================
@@ -61,7 +54,7 @@ XGBoost supports fully distributed GPU training using `Dask <https://dask.org/>`

 Memory usage
 ============
-The following are some guidelines on the device memory usage of the `gpu_hist` tree method.
+The following are some guidelines on the device memory usage of the ``hist`` tree method on GPU.

 Memory inside xgboost training is generally allocated for two reasons - storing the dataset and working memory.

@@ -79,7 +72,7 @@ XGBoost models trained on GPUs can be used on CPU-only systems to generate predi

 Developer notes
 ===============
-The application may be profiled with annotations by specifying USE_NTVX to cmake. Regions covered by the 'Monitor' class in CUDA code will automatically appear in the nsight profiler when `verbosity` is set to 3.
+The application may be profiled with annotations by specifying ``USE_NTVX`` to cmake. Regions covered by the 'Monitor' class in CUDA code will automatically appear in the nsight profiler when `verbosity` is set to 3.

 **********
 References
@@ -55,10 +55,6 @@ General Parameters

   - Flag to disable default metric. Set to 1 or ``true`` to disable.

-* ``num_feature`` [set automatically by XGBoost, no need to be set by user]
-
-  - Feature dimension used in boosting, set to maximum dimension of the feature
-
 * ``device`` [default= ``cpu``]

   .. versionadded:: 2.0.0
@@ -164,7 +160,7 @@ Parameters for Tree Booster
 - ``grow_colmaker``: non-distributed column-based construction of trees.
 - ``grow_histmaker``: distributed tree construction with row-based data splitting based on global proposal of histogram counting.
 - ``grow_quantile_histmaker``: Grow tree using quantized histogram.
-- ``grow_gpu_hist``: Grow tree with GPU. Same as setting tree method to ``hist`` and use ``device=cuda``.
+- ``grow_gpu_hist``: Grow tree with GPU. Same as setting ``tree_method`` to ``hist`` and use ``device=cuda``.
 - ``sync``: synchronizes trees in all distributed nodes.
 - ``refresh``: refreshes tree's statistics and/or leaf values based on the current data. Note that no random subsampling of data rows is performed.
 - ``prune``: prunes the splits where loss < min_split_loss (or gamma) and nodes that have depth greater than ``max_depth``.
@@ -421,7 +417,7 @@ Specify the learning task and the corresponding learning objective. The objectiv

 .. math::

-  AP@l = \frac{1}{min{(l, N)}}\sum^l_{k=1}P@k \cdot I_{(k)}
+  AP@l = \frac{1}{min{(l, N)}}\sum^l_{k=1}P@k \cdot I_{(k)}

 where :math:`I_{(k)}` is an indicator function that equals to :math:`1` when the document at :math:`k` is relevant and :math:`0` otherwise. The :math:`P@k` is the precision at :math:`k`, and :math:`N` is the total number of relevant documents. Lastly, the `mean average precision` is defined as the weighted average across all queries.

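The :math:`AP@l` definition in this hunk can be checked by hand with a few lines of Python. The following is a minimal sketch, not XGBoost's implementation: ``average_precision_at`` is a hypothetical name, and it assumes the ranked list contains every document of the query, so that :math:`N` equals the number of relevant flags in the list.

```python
def average_precision_at(relevance, l):
    """Compute AP@l for one query.

    ``relevance`` holds 0/1 flags in ranked order (1 means relevant).
    Assumption: the list covers all documents, so N = sum(relevance).
    """
    n_relevant = sum(relevance)  # N in the formula
    if n_relevant == 0 or l <= 0:
        return 0.0  # no relevant documents, or an empty cut-off
    hits = 0
    total = 0.0
    for k, is_relevant in enumerate(relevance[:l], start=1):
        hits += is_relevant
        if is_relevant:
            total += hits / k  # P@k, counted only where I_(k) = 1
    return total / min(l, n_relevant)


# Ranked list: relevant, not relevant, relevant.
# P@1 = 1 and P@3 = 2/3 contribute; the sum is divided by min(3, 2) = 2.
print(average_precision_at([1, 0, 1], 3))
```

For the list above the two contributing terms are :math:`P@1 = 1` and :math:`P@3 = 2/3`, so the result is :math:`(1 + 2/3) / 2 = 5/6`.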
@@ -310,8 +310,8 @@ for more info.

 .. code-block:: python

-  # Use "gpu_hist" for training the model.
-  reg = xgb.XGBRegressor(tree_method="gpu_hist")
+  # Use "hist" for training the model.
+  reg = xgb.XGBRegressor(tree_method="hist", device="cuda")
   # Fit the model using predictor X and response y.
   reg.fit(X, y)
   # Save model into JSON format.
@@ -56,7 +56,6 @@ on a dask cluster:
 dtrain = xgb.dask.DaskDMatrix(client, X, y)
 # or
 # dtrain = xgb.dask.DaskQuantileDMatrix(client, X, y)
-# `DaskQuantileDMatrix` is available for the `hist` and `gpu_hist` tree method.

 output = xgb.dask.train(
     client,
@@ -149,7 +148,7 @@ Also for inplace prediction:
 .. code-block:: python

   # where X is a dask DataFrame or dask Array backed by cupy or cuDF.
-  booster.set_param({"device": "cuda:0"})
+  booster.set_param({"device": "cuda"})
   prediction = xgb.dask.inplace_predict(client, booster, X)

 When input is ``da.Array`` object, output is always ``da.Array``. However, if the input
@@ -225,6 +224,12 @@ collection.
 main(client)


+****************
+GPU acceleration
+****************
+
+For most of the use cases with GPUs, the `Dask-CUDA <https://docs.rapids.ai/api/dask-cuda/stable/quickstart.html>`__ project should be used to create the cluster, which automatically configures the correct device ordinal for worker processes. As a result, users should NOT specify the ordinal (good: ``device=cuda``, bad: ``device=cuda:1``). See :ref:`sphx_glr_python_dask-examples_gpu_training.py` and :ref:`sphx_glr_python_dask-examples_sklearn_gpu_training.py` for worked examples.
+
 ***************************
 Working with other clusters
 ***************************
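The rule added in this hunk (pass ``device=cuda`` to the dask interface, never ``device=cuda:1``) amounts to rejecting any device string that carries an ordinal. The following is a minimal sketch of such a check, assuming the ``name:ordinal`` syntax described in the GPU documentation; ``check_no_ordinal`` is a hypothetical helper and does not reproduce the check XGBoost performs internally.

```python
def check_no_ordinal(device: str) -> str:
    """Return the device name, raising if an ordinal such as ``cuda:1`` is given.

    Hypothetical helper for illustration only; it is not part of XGBoost.
    """
    name, separator, ordinal = device.partition(":")
    if separator:
        # In a Dask-CUDA cluster each worker is already bound to one GPU,
        # so a user-supplied ordinal would conflict with that assignment.
        raise ValueError(
            f"Do not specify a device ordinal for dask (got {device!r}); "
            f"use {name!r} instead."
        )
    return name


print(check_no_ordinal("cuda"))
```

Validating the string before building the parameter dictionary surfaces the mistake on the client rather than on a worker.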
@@ -262,7 +267,7 @@ In the example below, a ``KubeCluster`` is used for `deploying Dask on Kubernete

 regressor = xgb.dask.DaskXGBRegressor(n_estimators=10, missing=0.0)
 regressor.client = client
-regressor.set_params(tree_method='gpu_hist')
+regressor.set_params(tree_method='hist', device="cuda")
 regressor.fit(X, y, eval_set=[(X, y)])

