Update doc for feature constraints and n_gpus. (#4596)
* Update doc for feature constraints.
* Fix some warnings.
* Clean up doc for `n_gpus`.
@@ -50,7 +50,7 @@ Supported parameters
|
||||
+--------------------------------+----------------------------+--------------+
|
||||
| ``gpu_id`` | |tick| | |tick| |
|
||||
+--------------------------------+----------------------------+--------------+
|
||||
| ``n_gpus`` | |cross| | |tick| |
|
||||
| ``n_gpus`` (deprecated) | |cross| | |tick| |
|
||||
+--------------------------------+----------------------------+--------------+
|
||||
| ``predictor`` | |tick| | |tick| |
|
||||
+--------------------------------+----------------------------+--------------+
|
||||
@@ -58,6 +58,8 @@ Supported parameters
|
||||
+--------------------------------+----------------------------+--------------+
|
||||
| ``monotone_constraints`` | |cross| | |tick| |
|
||||
+--------------------------------+----------------------------+--------------+
|
||||
| ``interaction_constraints`` | |cross| | |tick| |
|
||||
+--------------------------------+----------------------------+--------------+
|
||||
| ``single_precision_histogram`` | |cross| | |tick| |
|
||||
+--------------------------------+----------------------------+--------------+
|
||||
|
||||
@@ -65,7 +67,8 @@ GPU accelerated prediction is enabled by default for the above mentioned ``tree_
|
||||
|
||||
The experimental parameter ``single_precision_histogram`` can be set to True to enable building histograms using single precision. This may improve speed, in particular on older architectures.
|
||||
|
||||
The device ordinal can be selected using the ``gpu_id`` parameter, which defaults to 0.
|
||||
The device ordinal (which GPU to use if you have several) can be selected using the
``gpu_id`` parameter, which defaults to 0 (the first device reported by the CUDA runtime).
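
As a minimal sketch of the device selection described above, the snippet below only builds a parameter dictionary. The helper name ``make_gpu_params`` is hypothetical and not part of XGBoost; the parameter names ``tree_method``, ``gpu_id`` and the ``binary:logistic`` objective are the ones listed in this document. The training call is left as a comment because it assumes an XGBoost build with GPU support and an existing ``DMatrix``.

```python
def make_gpu_params(gpu_id=0):
    """Build training parameters that target one GPU (hypothetical helper)."""
    if gpu_id < 0:
        raise ValueError("gpu_id must be a non-negative device ordinal")
    return {
        "tree_method": "gpu_hist",       # GPU histogram updater
        "gpu_id": gpu_id,                # device ordinal, 0 is the default
        "objective": "binary:logistic",  # an objective marked as GPU-supported
    }


params = make_gpu_params(gpu_id=1)
print(params)

# Assumed usage, given a GPU-enabled build and a DMatrix named dtrain:
#   import xgboost as xgb
#   bst = xgb.train(params, dtrain, num_boost_round=10)
```

Keeping the ordinal in one place makes it easy to hand each worker process its own ``gpu_id`` when moving to one process per GPU.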
|
||||
|
||||
|
||||
The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/build` for details.
|
||||
@@ -80,15 +83,7 @@ The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/bu
|
||||
|
||||
Single Node Multi-GPU
|
||||
=====================
|
||||
.. note:: Single node multi-GPU training is deprecated. Please use distributed GPU training with one process per GPU.
|
||||
|
||||
Multiple GPUs can be used with the ``gpu_hist`` tree method using the ``n_gpus`` parameter. which defaults to 1. If this is set to -1 all available GPUs will be used. If ``gpu_id`` is specified as non-zero, the selected gpu devices will be from ``gpu_id`` to ``gpu_id+n_gpus``, please note that ``gpu_id+n_gpus`` must be less than or equal to the number of available GPUs on your system. As with GPU vs. CPU, multi-GPU will not always be faster than a single GPU due to PCI bus bandwidth that can limit performance.
|
||||
|
||||
.. note:: Enabling multi-GPU training
|
||||
|
||||
Default installation may not enable multi-GPU training. To use multiple GPUs, make sure to read :ref:`build_gpu_support`.
|
||||
XGBoost supports multi-GPU training on a single machine by specifying the ``n_gpus`` parameter.
|
||||
|
||||
.. note:: Single node multi-GPU training with the ``n_gpus`` parameter is deprecated after 0.90. Please use distributed GPU training with one process per GPU.
|
||||
|
||||
Multi-node Multi-GPU Training
|
||||
=============================
|
||||
@@ -101,66 +96,64 @@ Objective functions
|
||||
===================
|
||||
Most of the objective functions implemented in XGBoost can be run on GPU. The following table shows the current support status.
|
||||
|
||||
.. |tick| unicode:: U+2714
|
||||
.. |cross| unicode:: U+2718
|
||||
+--------------------+-------------+
|
||||
| Objectives | GPU support |
|
||||
+--------------------+-------------+
|
||||
| reg:squarederror | |tick| |
|
||||
+--------------------+-------------+
|
||||
| reg:squaredlogerror| |tick| |
|
||||
+--------------------+-------------+
|
||||
| reg:logistic | |tick| |
|
||||
+--------------------+-------------+
|
||||
| binary:logistic | |tick| |
|
||||
+--------------------+-------------+
|
||||
| binary:logitraw | |tick| |
|
||||
+--------------------+-------------+
|
||||
| binary:hinge | |tick| |
|
||||
+--------------------+-------------+
|
||||
| count:poisson | |tick| |
|
||||
+--------------------+-------------+
|
||||
| reg:gamma | |tick| |
|
||||
+--------------------+-------------+
|
||||
| reg:tweedie | |tick| |
|
||||
+--------------------+-------------+
|
||||
| multi:softmax | |tick| |
|
||||
+--------------------+-------------+
|
||||
| multi:softprob | |tick| |
|
||||
+--------------------+-------------+
|
||||
| survival:cox | |cross| |
|
||||
+--------------------+-------------+
|
||||
| rank:pairwise | |cross| |
|
||||
+--------------------+-------------+
|
||||
| rank:ndcg | |cross| |
|
||||
+--------------------+-------------+
|
||||
| rank:map | |cross| |
|
||||
+--------------------+-------------+
|
||||
|
||||
+-----------------+-------------+
|
||||
| Objectives | GPU support |
|
||||
+-----------------+-------------+
|
||||
| reg:squarederror| |tick| |
|
||||
+-----------------+-------------+
|
||||
| reg:logistic | |tick| |
|
||||
+-----------------+-------------+
|
||||
| binary:logistic | |tick| |
|
||||
+-----------------+-------------+
|
||||
| binary:logitraw | |tick| |
|
||||
+-----------------+-------------+
|
||||
| binary:hinge | |tick| |
|
||||
+-----------------+-------------+
|
||||
| count:poisson | |tick| |
|
||||
+-----------------+-------------+
|
||||
| reg:gamma | |tick| |
|
||||
+-----------------+-------------+
|
||||
| reg:tweedie | |tick| |
|
||||
+-----------------+-------------+
|
||||
| multi:softmax | |tick| |
|
||||
+-----------------+-------------+
|
||||
| multi:softprob | |tick| |
|
||||
+-----------------+-------------+
|
||||
| survival:cox | |cross| |
|
||||
+-----------------+-------------+
|
||||
| rank:pairwise | |cross| |
|
||||
+-----------------+-------------+
|
||||
| rank:ndcg | |cross| |
|
||||
+-----------------+-------------+
|
||||
| rank:map | |cross| |
|
||||
+-----------------+-------------+
|
||||
|
||||
For multi-gpu support, objective functions also honor the ``n_gpus`` parameter,
|
||||
which, by default is set to 1. To disable running objectives on GPU, just set
|
||||
``n_gpus`` to 0.
|
||||
Objectives will run on GPU if a GPU updater (``gpu_hist``) is used; otherwise they will run
on CPU by default. For unsupported objectives, XGBoost will fall back to the CPU
implementation by default.
|
||||
|
||||
Metric functions
|
||||
===================
|
||||
The following table shows the current support status for evaluation metrics on the GPU.
|
||||
|
||||
.. |tick| unicode:: U+2714
|
||||
.. |cross| unicode:: U+2718
|
||||
|
||||
+-----------------+-------------+
|
||||
| Metric | GPU Support |
|
||||
+=================+=============+
|
||||
| rmse | |tick| |
|
||||
+-----------------+-------------+
|
||||
| rmsle | |tick| |
|
||||
+-----------------+-------------+
|
||||
| mae | |tick| |
|
||||
+-----------------+-------------+
|
||||
| logloss | |tick| |
|
||||
+-----------------+-------------+
|
||||
| error | |tick| |
|
||||
+-----------------+-------------+
|
||||
| merror | |cross| |
|
||||
| merror | |tick| |
|
||||
+-----------------+-------------+
|
||||
| mlogloss | |cross| |
|
||||
| mlogloss | |tick| |
|
||||
+-----------------+-------------+
|
||||
| auc | |cross| |
|
||||
+-----------------+-------------+
|
||||
@@ -181,10 +174,8 @@ Following table shows current support status for evaluation metrics on the GPU.
|
||||
| tweedie-nloglik | |tick| |
|
||||
+-----------------+-------------+
|
||||
|
||||
As for objective functions, metrics honor the ``n_gpus`` parameter,
|
||||
which, by default is set to 1. To disable running metrics on GPU, just set
|
||||
``n_gpus`` to 0.
|
||||
|
||||
Similar to objective functions, the default device for metrics is selected based on the tree
updater and the predictor (which is itself selected based on the tree updater).
|
||||
|
||||
Benchmarks
|
||||
==========
|
||||
|
||||
@@ -171,7 +171,107 @@ parameter:
|
||||
num_boost_round = 1000, evals = evallist,
|
||||
early_stopping_rounds = 10)
|
||||
|
||||
**Choice of tree construction algorithm**. To use feature interaction
|
||||
constraints, be sure to set the ``tree_method`` parameter to either ``exact``
|
||||
or ``hist``. Currently, GPU algorithms (``gpu_hist``, ``gpu_exact``) do not
|
||||
support feature interaction constraints.
|
||||
**Choice of tree construction algorithm**. To use feature interaction constraints, be sure
to set the ``tree_method`` parameter to one of the following: ``exact``, ``hist`` or
``gpu_hist``. Support for ``gpu_hist`` was added after version 0.90 (0.90 itself does not
include it).
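
The parameter choice above might be sketched as follows. The helper ``make_constraint_params`` is hypothetical, and passing the constraints as a JSON-style string is an assumption about how the parameter is commonly supplied; only the parameter names and the three tree methods come from this document.

```python
import json

# Tree methods that this document lists as supporting interaction constraints.
SUPPORTED_TREE_METHODS = {"exact", "hist", "gpu_hist"}


def make_constraint_params(constraints, tree_method="hist"):
    """Build parameters for constrained training (hypothetical helper)."""
    if tree_method not in SUPPORTED_TREE_METHODS:
        raise ValueError(
            "tree_method %r does not support interaction constraints" % tree_method
        )
    return {
        "tree_method": tree_method,
        # Assumption: the nested list is passed in its string form.
        "interaction_constraints": json.dumps(constraints),
    }


params = make_constraint_params([[0, 1], [2, 3, 4]], tree_method="gpu_hist")
print(params["interaction_constraints"])
```

The resulting dictionary can then be merged into the parameters passed to training; validating ``tree_method`` up front avoids silently training an unconstrained model.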
|
||||
|
||||
|
||||
**************
|
||||
Advanced topic
|
||||
**************
|
||||
|
||||
The intuition behind interaction constraints is simple: users have prior knowledge about
relations between different features and encode it as constraints during model
construction. But there are some subtleties around specifying constraints. Take the
constraint ``[[1, 2], [2, 3, 4]]`` as an example. Feature ``2`` appears in two
different interaction sets, ``[1, 2]`` and ``[2, 3, 4]``, so the union set of features
allowed to interact with ``2`` is ``{1, 3, 4}``. In the following diagram, the root splits
at feature ``2``. Because all its descendants must be able to interact with it, all 4
features are legitimate split candidates at the second layer, regardless of which
constraint set each of them comes from.
|
||||
|
||||
.. plot::
|
||||
:nofigs:
|
||||
|
||||
from graphviz import Source
|
||||
source = r"""
|
||||
digraph feature_interaction_illustration4 {
|
||||
graph [fontname = "helvetica"];
|
||||
node [fontname = "helvetica"];
|
||||
edge [fontname = "helvetica"];
|
||||
0 [label=<x<SUB><FONT POINT-SIZE="11">2</FONT></SUB>>, shape=box, color=black, fontcolor=black];
|
||||
1 [label=<x<SUB><FONT POINT-SIZE="11">{1, 2, 3, 4}</FONT></SUB>>, shape=box];
|
||||
2 [label=<x<SUB><FONT POINT-SIZE="11">{1, 2, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
|
||||
3 [label="...", shape=none];
|
||||
4 [label="...", shape=none];
|
||||
5 [label="...", shape=none];
|
||||
6 [label="...", shape=none];
|
||||
0 -> 1;
|
||||
0 -> 2;
|
||||
1 -> 3;
|
||||
1 -> 4;
|
||||
2 -> 5;
|
||||
2 -> 6;
|
||||
}
|
||||
"""
|
||||
Source(source, format='png').render('../_static/feature_interaction_illustration4', view=False)
|
||||
Source(source, format='svg').render('../_static/feature_interaction_illustration5', view=False)
|
||||
|
||||
.. figure:: ../_static/feature_interaction_illustration4.png
|
||||
:align: center
|
||||
:figwidth: 80 %
|
||||
|
||||
``{1, 2, 3, 4}`` represents the set of legitimate split features.
|
||||
|
||||
This leads to some interesting implications of feature interaction constraints. Take
``[[0, 1], [0, 1, 2], [1, 2]]`` as another example. Assuming, for presentation purposes,
that we have only 3 available features in our training dataset, careful readers might have
noticed that the above constraint is the same as ``[0, 1, 2]``: no matter which
feature is chosen for the split at the root node, all its descendants have to include every
feature as a legitimate split candidate to avoid violating the interaction constraints.
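
The equivalence claimed above can be checked with a small sketch. It models only the rule as this text describes it (after a split on a feature, the candidates are the union of every constraint set containing that feature). The function ``child_candidates`` is a hypothetical illustration, not XGBoost's internal implementation.

```python
def child_candidates(constraints, split_feature):
    """Union of every constraint set that contains split_feature."""
    allowed = set()
    for group in constraints:
        if split_feature in group:
            allowed.update(group)
    return allowed


overlapping = [[0, 1], [0, 1, 2], [1, 2]]
merged = [[0, 1, 2]]

# Whichever feature the root splits on, both specifications allow the same
# candidate set for its descendants.
for feature in range(3):
    assert child_candidates(overlapping, feature) == child_candidates(merged, feature)
    print(feature, sorted(child_candidates(overlapping, feature)))
```

Under this model the two specifications are indistinguishable, which is why the overlapping form adds no extra restriction.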
|
||||
|
||||
For one last example, we use ``[[0, 1], [1, 3, 4]]`` and choose feature ``0`` as the split
for the root node. At the second layer of the built tree, ``1`` is the only legitimate split
candidate other than ``0`` itself, since they belong to the same constraint set.
Following the growth path of our example tree below, the node at the second layer splits at
feature ``1``. But because ``1`` also belongs to the second constraint set ``[1, 3, 4]``,
at the third layer we need to include all features as candidates to comply with its
ancestors.
|
||||
|
||||
.. plot::
|
||||
:nofigs:
|
||||
|
||||
from graphviz import Source
|
||||
source = r"""
|
||||
digraph feature_interaction_illustration5 {
|
||||
graph [fontname = "helvetica"];
|
||||
node [fontname = "helvetica"];
|
||||
edge [fontname = "helvetica"];
|
||||
0 [label=<x<SUB><FONT POINT-SIZE="11">0</FONT></SUB>>, shape=box, color=black, fontcolor=black];
|
||||
1 [label="...", shape=none];
|
||||
2 [label=<x<SUB><FONT POINT-SIZE="11">1</FONT></SUB>>, shape=box, color=black, fontcolor=black];
|
||||
3 [label=<x<SUB><FONT POINT-SIZE="11">{0, 1, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
|
||||
4 [label=<x<SUB><FONT POINT-SIZE="11">{0, 1, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
|
||||
5 [label="...", shape=none];
|
||||
6 [label="...", shape=none];
|
||||
7 [label="...", shape=none];
|
||||
8 [label="...", shape=none];
|
||||
0 -> 1;
|
||||
0 -> 2;
|
||||
2 -> 3;
|
||||
2 -> 4;
|
||||
3 -> 5;
|
||||
3 -> 6;
|
||||
4 -> 7;
|
||||
4 -> 8;
|
||||
}
|
||||
"""
|
||||
Source(source, format='png').render('../_static/feature_interaction_illustration6', view=False)
|
||||
Source(source, format='svg').render('../_static/feature_interaction_illustration7', view=False)
|
||||
|
||||
|
||||
.. figure:: ../_static/feature_interaction_illustration6.png
|
||||
:align: center
|
||||
:figwidth: 80 %
|
||||
|
||||
``{0, 1, 3, 4}`` represents the set of legitimate split features.
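
The growth path in this last example can be traced with a short sketch. It follows the rule as described in this text, where the candidates below a split are the union of all constraint sets containing the split feature. ``child_candidates`` is a hypothetical helper for illustration and makes no claim about how XGBoost implements the check internally.

```python
def child_candidates(constraints, split_feature):
    """Union of every constraint set that contains split_feature."""
    allowed = set()
    for group in constraints:
        if split_feature in group:
            allowed.update(group)
    return allowed


constraints = [[0, 1], [1, 3, 4]]
path = [0, 1]  # root splits at feature 0, the second layer splits at feature 1

# Candidate features for the layer directly below each split on the path.
layers = [sorted(child_candidates(constraints, feature)) for feature in path]
print(layers)  # [[0, 1], [0, 1, 3, 4]]
```

The first entry matches the second layer of the figure (only ``0`` and ``1`` are candidates), and the second entry matches the third layer, where every feature becomes a candidate.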
|
||||
|
||||