Define the new device parameter. (#9362)
@@ -22,7 +22,8 @@ Supported parameters

 GPU accelerated prediction is enabled by default for the above mentioned ``tree_method`` parameters but can be switched to CPU prediction by setting ``predictor`` to ``cpu_predictor``. This could be useful if you want to conserve GPU memory. Likewise when using CPU algorithms, GPU accelerated prediction can be enabled by setting ``predictor`` to ``gpu_predictor``.

 The device ordinal (which GPU to use if you have many of them) can be selected using the
-``gpu_id`` parameter, which defaults to 0 (the first device reported by CUDA runtime).
+``device`` parameter, which defaults to 0 when "CUDA" is specified (the first device
+reported by the CUDA runtime).

 The GPU algorithms currently work with CLI, Python, R, and JVM packages. See :doc:`/install` for details.
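For illustration, a minimal sketch of selecting a device ordinal with the new parameter; this is an editorial example, not part of the diff, and assumes XGBoost 2.0+ where ``device`` replaces ``gpu_id``:

```python
# Hedged sketch: the ordinal after the colon picks the GPU,
# mirroring the value previously passed as ``gpu_id``.
params = {
    "tree_method": "hist",
    "device": "cuda:0",  # first device reported by the CUDA runtime
}

# Selecting the second GPU on a multi-GPU machine:
params_second = {**params, "device": "cuda:1"}
```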
@@ -30,13 +31,13 @@ The GPU algorithms currently work with CLI, Python, R, and JVM packages. See :do

 .. code-block:: python
   :caption: Python example

-  param['gpu_id'] = 0
+  param["device"] = "cuda:0"
   param['tree_method'] = 'gpu_hist'

 .. code-block:: python
   :caption: With Scikit-Learn interface

-  XGBRegressor(tree_method='gpu_hist', gpu_id=0)
+  XGBRegressor(tree_method='gpu_hist', device="cuda")


 GPU-Accelerated SHAP values
@@ -45,7 +46,7 @@ XGBoost makes use of `GPUTreeShap <https://github.com/rapidsai/gputreeshap>`_ as

 .. code-block:: python

-  model.set_param({"gpu_id": "0", "tree_method": "gpu_hist"})
+  model.set_param({"device": "cuda:0", "tree_method": "gpu_hist"})
   shap_values = model.predict(dtrain, pred_contribs=True)
   shap_interaction_values = model.predict(dtrain, pred_interactions=True)

@@ -3,10 +3,10 @@ Installation Guide
 ##################

 XGBoost provides binary packages for some language bindings. The binary packages support
-the GPU algorithm (``gpu_hist``) on machines with NVIDIA GPUs. Please note that **training
-with multiple GPUs is only supported for Linux platform**. See :doc:`gpu/index`. Also we
-have both stable releases and nightly builds, see below for how to install them. For
-building from source, visit :doc:`this page </build>`.
+the GPU algorithm (``device=cuda:0``) on machines with NVIDIA GPUs. Please note that
+**training with multiple GPUs is only supported for Linux platform**. See
+:doc:`gpu/index`. Also we have both stable releases and nightly builds, see below for how
+to install them. For building from source, visit :doc:`this page </build>`.

 .. contents:: Contents

@@ -59,6 +59,18 @@ General Parameters

   - Feature dimension used in boosting, set to maximum dimension of the feature

+* ``device`` [default= ``cpu``]
+
+  .. versionadded:: 2.0.0
+
+  - Device for XGBoost to run. User can set it to one of the following values:
+
+    + ``cpu``: Use CPU.
+    + ``cuda``: Use a GPU (CUDA device).
+    + ``cuda:<ordinal>``: ``<ordinal>`` is an integer that specifies the ordinal of the GPU (which GPU you want to use if you have more than one device).
+    + ``gpu``: Default GPU device selection from the list of available and supported devices. Only ``cuda`` devices are supported currently.
+    + ``gpu:<ordinal>``: Default GPU device selection from the list of available and supported devices. Only ``cuda`` devices are supported currently.
+
 Parameters for Tree Booster
 ===========================
 * ``eta`` [default=0.3, alias: ``learning_rate``]
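The accepted ``device`` spellings above follow a simple ``kind[:ordinal]`` shape. As an illustration only (this is not XGBoost's internal parser, just a sketch mirroring the values listed above):

```python
import re

def parse_device(device: str):
    """Return (kind, ordinal) for strings like 'cpu', 'cuda', 'cuda:1', 'gpu:0'.

    Illustrative helper, not part of XGBoost's API.
    """
    m = re.fullmatch(r"(cpu|cuda|gpu)(?::(\d+))?", device)
    if m is None:
        raise ValueError(f"invalid device: {device!r}")
    kind, ordinal = m.group(1), m.group(2)
    if kind == "cpu" and ordinal is not None:
        raise ValueError("'cpu' takes no ordinal")
    # Omitted ordinal defaults to device 0.
    return kind, int(ordinal) if ordinal is not None else 0
```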
@@ -99,7 +111,7 @@ Parameters for Tree Booster
   - ``gradient_based``: the selection probability for each training instance is proportional to the
     *regularized absolute value* of gradients (more specifically, :math:`\sqrt{g^2+\lambda h^2}`).
     ``subsample`` may be set to as low as 0.1 without loss of model accuracy. Note that this
-    sampling method is only supported when ``tree_method`` is set to ``gpu_hist``; other tree
+    sampling method is only supported when ``tree_method`` is set to ``hist`` and the device is ``cuda``; other tree
     methods only support ``uniform`` sampling.

 * ``colsample_bytree``, ``colsample_bylevel``, ``colsample_bynode`` [default=1]
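The selection probability formula above can be made concrete with a tiny sketch: each instance's weight is :math:`\sqrt{g^2+\lambda h^2}`, then weights are normalized into sampling probabilities (an illustration of the math, not XGBoost's internal sampler):

```python
import math

def selection_weight(g: float, h: float, lam: float = 1.0) -> float:
    # Regularized absolute value of the gradient: sqrt(g^2 + lambda * h^2)
    return math.sqrt(g * g + lam * h * h)

# (gradient, hessian) pairs for three hypothetical training instances
grads = [(0.5, 1.0), (-2.0, 1.0), (0.1, 1.0)]
weights = [selection_weight(g, h) for g, h in grads]
total = sum(weights)
probs = [w / total for w in weights]  # normalized selection probabilities
```

Instances with larger gradient magnitude get proportionally higher selection probability, which is why ``subsample`` can be driven as low as 0.1 without hurting accuracy.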
@@ -131,26 +143,15 @@ Parameters for Tree Booster
 * ``tree_method`` string [default= ``auto``]

   - The tree construction algorithm used in XGBoost. See description in the `reference paper <http://arxiv.org/abs/1603.02754>`_ and :doc:`treemethod`.
-  - XGBoost supports ``approx``, ``hist`` and ``gpu_hist`` for distributed training. Experimental support for external memory is available for ``approx`` and ``gpu_hist``.

-  - Choices: ``auto``, ``exact``, ``approx``, ``hist``, ``gpu_hist``, this is a
-    combination of commonly used updaters. For other updaters like ``refresh``, set the
-    parameter ``updater`` directly.
+  - Choices: ``auto``, ``exact``, ``approx``, ``hist``, this is a combination of commonly
+    used updaters. For other updaters like ``refresh``, set the parameter ``updater``
+    directly.

-  - ``auto``: Use heuristic to choose the fastest method.
-
-    - For small dataset, exact greedy (``exact``) will be used.
-    - For larger dataset, approximate algorithm (``approx``) will be chosen. It's
-      recommended to try ``hist`` and ``gpu_hist`` for higher performance with large
-      dataset. ``gpu_hist`` has support for external memory.
-
-    - Because old behavior is always use exact greedy in single machine, user will get a
-      message when approximate algorithm is chosen to notify this choice.
+  - ``auto``: Same as the ``hist`` tree method.
   - ``exact``: Exact greedy algorithm. Enumerates all split candidates.
   - ``approx``: Approximate greedy algorithm using quantile sketch and gradient histogram.
   - ``hist``: Faster histogram optimized approximate greedy algorithm.
-  - ``gpu_hist``: GPU implementation of ``hist`` algorithm.

 * ``scale_pos_weight`` [default=1]
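Since ``gpu_hist`` is dropped from the choices, existing parameter dictionaries need rewriting to the ``hist`` + ``device`` form. A sketch of that migration (an illustrative helper, not part of XGBoost's public API):

```python
def migrate_tree_method(params: dict) -> dict:
    """Rewrite the legacy tree_method='gpu_hist' spelling to its 2.0 equivalent.

    Illustrative only: hist plus device='cuda' replaces gpu_hist.
    """
    out = dict(params)
    if out.get("tree_method") == "gpu_hist":
        out["tree_method"] = "hist"
        out.setdefault("device", "cuda")  # keep an explicit device if already set
    return out
```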
@@ -163,7 +164,7 @@ Parameters for Tree Booster
   - ``grow_colmaker``: non-distributed column-based construction of trees.
   - ``grow_histmaker``: distributed tree construction with row-based data splitting based on global proposal of histogram counting.
   - ``grow_quantile_histmaker``: Grow tree using quantized histogram.
-  - ``grow_gpu_hist``: Grow tree with GPU.
+  - ``grow_gpu_hist``: Grow tree with GPU. Same as setting ``tree_method`` to ``hist`` and using ``device=cuda``.
   - ``sync``: synchronizes trees in all distributed nodes.
   - ``refresh``: refreshes tree's statistics and/or leaf values based on the current data. Note that no random subsampling of data rows is performed.
   - ``prune``: prunes the splits where loss < min_split_loss (or gamma) and nodes that have depth greater than ``max_depth``.
@@ -183,7 +184,7 @@ Parameters for Tree Booster
 * ``grow_policy`` [default= ``depthwise``]

   - Controls a way new nodes are added to the tree.
-  - Currently supported only if ``tree_method`` is set to ``hist``, ``approx`` or ``gpu_hist``.
+  - Currently supported only if ``tree_method`` is set to ``hist`` or ``approx``.
   - Choices: ``depthwise``, ``lossguide``

     - ``depthwise``: split at nodes closest to the root.
@@ -195,7 +196,7 @@ Parameters for Tree Booster

 * ``max_bin``, [default=256]

-  - Only used if ``tree_method`` is set to ``hist``, ``approx`` or ``gpu_hist``.
+  - Only used if ``tree_method`` is set to ``hist`` or ``approx``.
   - Maximum number of discrete bins to bucket continuous features.
   - Increasing this number improves the optimality of splits at the cost of higher computation time.

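To see why a larger ``max_bin`` costs more computation but yields finer split candidates, here is a toy equal-width bucketing sketch (XGBoost actually uses a weighted quantile sketch; this is only an illustration of coarse vs. fine binning):

```python
def bucket(values, max_bin):
    """Map continuous values onto at most max_bin equal-width bins (toy sketch)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / max_bin or 1.0  # guard against a constant feature
    return [min(int((v - lo) / width), max_bin - 1) for v in values]

feature = [0.1, 0.4, 0.5, 0.9]
coarse = bucket(feature, 2)    # few bins: cheap, coarse split candidates
fine = bucket(feature, 256)    # many bins: finer candidates, more compute
```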
@@ -3,14 +3,14 @@ Tree Methods
 ############

 For training boosted tree models, there are 2 parameters used for choosing algorithms,
-namely ``updater`` and ``tree_method``. XGBoost has 4 builtin tree methods, namely
-``exact``, ``approx``, ``hist`` and ``gpu_hist``. Along with these tree methods, there
-are also some free standing updaters including ``refresh``,
-``prune`` and ``sync``. The parameter ``updater`` is more primitive than ``tree_method``
-as the latter is just a pre-configuration of the former. The difference is mostly due to
-historical reasons that each updater requires some specific configurations and might has
-missing features. As we are moving forward, the gap between them is becoming more and
-more irrelevant. We will collectively document them under tree methods.
+namely ``updater`` and ``tree_method``. XGBoost has 3 built-in tree methods, namely
+``exact``, ``approx`` and ``hist``. Along with these tree methods, there are also some
+free-standing updaters including ``refresh``, ``prune`` and ``sync``. The parameter
+``updater`` is more primitive than ``tree_method``, as the latter is just a
+pre-configuration of the former. The difference is mostly due to historical reasons:
+each updater requires some specific configurations and might have missing features. As
+we move forward, the gap between them is becoming more and more irrelevant. We will
+collectively document them under tree methods.

 **************
 Exact Solution
@@ -19,23 +19,23 @@ Exact Solution
 Exact means XGBoost considers all candidates from data for tree splitting, but underlying
 the objective is still interpreted as a Taylor expansion.

-1. ``exact``: Vanilla gradient boosting tree algorithm described in `reference paper
-   <http://arxiv.org/abs/1603.02754>`_. During each split finding procedure, it iterates
-   over all entries of input data. It's more accurate (among other greedy methods) but
-   slow in computation performance. Also it doesn't support distributed training as
-   XGBoost employs row spliting data distribution while ``exact`` tree method works on a
-   sorted column format. This tree method can be used with parameter ``tree_method`` set
-   to ``exact``.
+1. ``exact``: The vanilla gradient boosting tree algorithm described in the `reference
+   paper <http://arxiv.org/abs/1603.02754>`_. During split-finding, it iterates over all
+   entries of input data. It's more accurate (among other greedy methods) but
+   computationally slower compared to other tree methods. Furthermore, its feature set
+   is limited: features like distributed training and external memory that require
+   approximated quantiles are not supported. This tree method can be used with the
+   parameter ``tree_method`` set to ``exact``.

 **********************
 Approximated Solutions
 **********************

-As ``exact`` tree method is slow in performance and not scalable, we often employ
-approximated training algorithms. These algorithms build a gradient histogram for each
-node and iterate through the histogram instead of real dataset. Here we introduce the
-implementations in XGBoost below.
+As the ``exact`` tree method is slow in computation performance and difficult to scale,
+we often employ approximated training algorithms. These algorithms build a gradient
+histogram for each node and iterate through the histogram instead of the real dataset.
+Here we introduce the implementations in XGBoost.

 1. ``approx`` tree method: An approximation tree method described in `reference paper
    <http://arxiv.org/abs/1603.02754>`_. It runs sketching before building each tree
@@ -48,22 +48,18 @@ implementations in XGBoost below.
    this global sketch. This is the fastest algorithm as it runs sketching only once. The
    algorithm can be accessed by setting ``tree_method`` to ``hist``.

-3. ``gpu_hist`` tree method: The ``gpu_hist`` tree method is a GPU implementation of
-   ``hist``, with additional support for gradient based sampling. The algorithm can be
-   accessed by setting ``tree_method`` to ``gpu_hist``.
-
 ************
 Implications
 ************

-Some objectives like ``reg:squarederror`` have constant hessian. In this case, ``hist``
-or ``gpu_hist`` should be preferred as weighted sketching doesn't make sense with constant
+Some objectives like ``reg:squarederror`` have a constant hessian. In this case,
+``hist`` should be preferred, as weighted sketching doesn't make sense with constant
 weights. When using non-constant hessian objectives, sometimes ``approx`` yields better
-accuracy, but with slower computation performance. Most of the time using ``(gpu)_hist``
-with higher ``max_bin`` can achieve similar or even superior accuracy while maintaining
-good performance. However, as xgboost is largely driven by community effort, the actual
-implementations have some differences than pure math description. Result might have
-slight differences than expectation, which we are currently trying to overcome.
+accuracy, but with slower computation performance. Most of the time, using ``hist`` with
+a higher ``max_bin`` can achieve similar or even superior accuracy while maintaining
+good performance. However, as XGBoost is largely driven by community effort, the actual
+implementations have some differences from the pure math description. Results might be
+slightly different from expectations, which we are currently trying to overcome.

 **************
 Other Updaters
@@ -106,8 +102,8 @@ solely for the interest of documentation.
    histogram creation step and uses sketching values directly during split evaluation. It
    was never tested and contained some unknown bugs, we decided to remove it and focus our
    resources on more promising algorithms instead. For accuracy, most of the time
-   ``approx``, ``hist`` and ``gpu_hist`` are enough with some parameters tuning, so
-   removing them don't have any real practical impact.
+   ``approx`` and ``hist`` are enough with some parameter tuning, so removing them
+   doesn't have any real practical impact.

 3. ``grow_local_histmaker`` updater: An approximation tree method described in `reference
    paper <http://arxiv.org/abs/1603.02754>`_. This updater was rarely used in practice so

@@ -149,7 +149,7 @@ Also for inplace prediction:

 .. code-block:: python

   # where X is a dask DataFrame or dask Array backed by cupy or cuDF.
-  booster.set_param({"gpu_id": "0"})
+  booster.set_param({"device": "cuda:0"})
   prediction = xgb.dask.inplace_predict(client, booster, X)

 When input is ``da.Array`` object, output is always ``da.Array``. However, if the input

@@ -163,7 +163,7 @@ Will print out something similar to (not actual output as it's too long for demo
   {
     "Learner": {
       "generic_parameter": {
-        "gpu_id": "0",
+        "device": "cuda:0",
         "gpu_page_size": "0",
         "n_jobs": "0",
         "random_state": "0",
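Because the printed configuration is JSON, the new ``device`` value can be inspected programmatically. A sketch assuming output shaped like the abridged excerpt above (the string here is a stand-in for what ``booster.save_config()`` would return):

```python
import json

# Stand-in for booster.save_config(); the real output has many more keys.
config_text = '{"Learner": {"generic_parameter": {"device": "cuda:0", "n_jobs": "0"}}}'

config = json.loads(config_text)
device = config["Learner"]["generic_parameter"]["device"]
```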