Update doc for feature constraints and n_gpus. (#4596)

* Update doc for feature constraints. 

* Fix some warnings.

* Clean up doc for `n_gpus`.
Jiaming Yuan
2019-06-23 14:37:22 +08:00
committed by GitHub
parent 9fa29ad753
commit 2cff735126
5 changed files with 172 additions and 82 deletions


@@ -50,7 +50,7 @@ Supported parameters
+--------------------------------+----------------------------+--------------+
| ``gpu_id`` | |tick| | |tick| |
+--------------------------------+----------------------------+--------------+
| ``n_gpus`` | |cross| | |tick| |
| ``n_gpus`` (deprecated) | |cross| | |tick| |
+--------------------------------+----------------------------+--------------+
| ``predictor`` | |tick| | |tick| |
+--------------------------------+----------------------------+--------------+
@@ -58,6 +58,8 @@ Supported parameters
+--------------------------------+----------------------------+--------------+
| ``monotone_constraints`` | |cross| | |tick| |
+--------------------------------+----------------------------+--------------+
| ``interaction_constraints`` | |cross| | |tick| |
+--------------------------------+----------------------------+--------------+
| ``single_precision_histogram`` | |cross| | |tick| |
+--------------------------------+----------------------------+--------------+
@@ -65,7 +67,8 @@ GPU accelerated prediction is enabled by default for the above mentioned ``tree_
The experimental parameter ``single_precision_histogram`` can be set to True to enable building histograms using single precision. This may improve speed, in particular on older architectures.
The device ordinal can be selected using the ``gpu_id`` parameter, which defaults to 0.
The device ordinal (which GPU to use if you have many of them) can be selected using the
``gpu_id`` parameter, which defaults to 0 (the first device reported by CUDA runtime).
The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/build` for details.
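As a minimal sketch of how the parameters above might be combined in the Python package, the dictionary below selects ``gpu_hist`` on the first device and enables single precision histograms. The helper name ``make_gpu_params`` is hypothetical; only the parameter names come from the table above. The commented-out call assumes a GPU-enabled XGBoost installation and an existing ``DMatrix`` named ``dtrain``.

```python
def make_gpu_params(gpu_id=0, single_precision=False):
    """Build a parameter dict for GPU training (hypothetical helper)."""
    if gpu_id < 0:
        raise ValueError("gpu_id is a device ordinal and must be >= 0")
    return {
        "tree_method": "gpu_hist",           # GPU histogram updater
        "gpu_id": gpu_id,                    # which device to use, 0 by default
        "single_precision_histogram": single_precision,  # experimental
    }


params = make_gpu_params(gpu_id=0, single_precision=True)
print(params)

# Assumed usage, not executed here:
# import xgboost as xgb
# bst = xgb.train(params, dtrain, num_boost_round=10)
```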
@@ -80,15 +83,7 @@ The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/bu
Single Node Multi-GPU
=====================
.. note:: Single node multi-GPU training is deprecated. Please use distributed GPU training with one process per GPU.
Multiple GPUs can be used with the ``gpu_hist`` tree method using the ``n_gpus`` parameter, which defaults to 1. If this is set to -1, all available GPUs will be used. If ``gpu_id`` is specified as non-zero, the selected GPU devices will be from ``gpu_id`` to ``gpu_id+n_gpus``. Note that ``gpu_id+n_gpus`` must be less than or equal to the number of available GPUs on your system. As with GPU vs. CPU, multi-GPU will not always be faster than a single GPU, because PCI bus bandwidth can limit performance.
.. note:: Enabling multi-GPU training
Default installation may not enable multi-GPU training. To use multiple GPUs, make sure to read :ref:`build_gpu_support`.
XGBoost supports multi-GPU training on a single machine by specifying the ``n_gpus`` parameter.
.. note:: Single node multi-GPU training with the ``n_gpus`` parameter is deprecated after 0.90. Please use distributed GPU training with one process per GPU.
Multi-node Multi-GPU Training
=============================
@@ -101,66 +96,64 @@ Objective functions
===================
Most of the objective functions implemented in XGBoost can be run on GPU. The following table shows the current support status.
.. |tick| unicode:: U+2714
.. |cross| unicode:: U+2718
+--------------------+-------------+
| Objectives | GPU support |
+--------------------+-------------+
| reg:squarederror | |tick| |
+--------------------+-------------+
| reg:squaredlogerror| |tick| |
+--------------------+-------------+
| reg:logistic | |tick| |
+--------------------+-------------+
| binary:logistic | |tick| |
+--------------------+-------------+
| binary:logitraw | |tick| |
+--------------------+-------------+
| binary:hinge | |tick| |
+--------------------+-------------+
| count:poisson | |tick| |
+--------------------+-------------+
| reg:gamma | |tick| |
+--------------------+-------------+
| reg:tweedie | |tick| |
+--------------------+-------------+
| multi:softmax | |tick| |
+--------------------+-------------+
| multi:softprob | |tick| |
+--------------------+-------------+
| survival:cox | |cross| |
+--------------------+-------------+
| rank:pairwise | |cross| |
+--------------------+-------------+
| rank:ndcg | |cross| |
+--------------------+-------------+
| rank:map | |cross| |
+--------------------+-------------+
+-----------------+-------------+
| Objectives | GPU support |
+-----------------+-------------+
| reg:squarederror| |tick| |
+-----------------+-------------+
| reg:logistic | |tick| |
+-----------------+-------------+
| binary:logistic | |tick| |
+-----------------+-------------+
| binary:logitraw | |tick| |
+-----------------+-------------+
| binary:hinge | |tick| |
+-----------------+-------------+
| count:poisson | |tick| |
+-----------------+-------------+
| reg:gamma | |tick| |
+-----------------+-------------+
| reg:tweedie | |tick| |
+-----------------+-------------+
| multi:softmax | |tick| |
+-----------------+-------------+
| multi:softprob | |tick| |
+-----------------+-------------+
| survival:cox | |cross| |
+-----------------+-------------+
| rank:pairwise | |cross| |
+-----------------+-------------+
| rank:ndcg | |cross| |
+-----------------+-------------+
| rank:map | |cross| |
+-----------------+-------------+
For multi-GPU support, objective functions also honor the ``n_gpus`` parameter,
which is set to 1 by default. To disable running objectives on GPU, set
``n_gpus`` to 0.
Objectives will run on GPU if a GPU updater (``gpu_hist``) is used; otherwise they will run
on CPU by default. For unsupported objectives, XGBoost will fall back to the CPU
implementation by default.
Metric functions
===================
The following table shows the current support status for evaluation metrics on the GPU.
.. |tick| unicode:: U+2714
.. |cross| unicode:: U+2718
+-----------------+-------------+
| Metric | GPU Support |
+=================+=============+
| rmse | |tick| |
+-----------------+-------------+
| rmsle | |tick| |
+-----------------+-------------+
| mae | |tick| |
+-----------------+-------------+
| logloss | |tick| |
+-----------------+-------------+
| error | |tick| |
+-----------------+-------------+
| merror | |cross| |
| merror | |tick| |
+-----------------+-------------+
| mlogloss | |cross| |
| mlogloss | |tick| |
+-----------------+-------------+
| auc | |cross| |
+-----------------+-------------+
@@ -181,10 +174,8 @@ Following table shows current support status for evaluation metrics on the GPU.
| tweedie-nloglik | |tick| |
+-----------------+-------------+
As for objective functions, metrics honor the ``n_gpus`` parameter,
which is set to 1 by default. To disable running metrics on GPU, set
``n_gpus`` to 0.
Similar to objective functions, the default device for metrics is selected based on the tree
updater and the predictor (which is itself selected based on the tree updater).
Benchmarks
==========


@@ -171,7 +171,107 @@ parameter:
num_boost_round = 1000, evals = evallist,
early_stopping_rounds = 10)
**Choice of tree construction algorithm**. To use feature interaction
constraints, be sure to set the ``tree_method`` parameter to either ``exact``
or ``hist``. Currently, GPU algorithms (``gpu_hist``, ``gpu_exact``) do not
support feature interaction constraints.
**Choice of tree construction algorithm**. To use feature interaction constraints, be sure
to set the ``tree_method`` parameter to one of the following: ``exact``, ``hist`` or
``gpu_hist``. Support for ``gpu_hist`` was added after version 0.90 (0.90 itself does not include it).
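The tree method requirement above could be checked before training with a small guard like the following sketch. The helper ``constrained_params`` and the constant ``SUPPORTED_TREE_METHODS`` are hypothetical names introduced for illustration, and passing the constraints as a string of nested lists is an assumption about how the Python package expects them.

```python
# Tree methods listed above as supporting interaction constraints.
SUPPORTED_TREE_METHODS = {"exact", "hist", "gpu_hist"}


def constrained_params(constraints, tree_method="hist"):
    """Return a parameter dict, rejecting unsupported tree methods (hypothetical helper)."""
    if tree_method not in SUPPORTED_TREE_METHODS:
        raise ValueError(
            "interaction constraints need tree_method in %s, got %r"
            % (sorted(SUPPORTED_TREE_METHODS), tree_method)
        )
    return {
        "tree_method": tree_method,
        # Assumption: constraints are passed as a string of nested lists.
        "interaction_constraints": str(constraints),
    }


params = constrained_params([[0, 2], [1, 3, 4]], tree_method="hist")
print(params["interaction_constraints"])  # [[0, 2], [1, 3, 4]]
```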
**************
Advanced topic
**************
The intuition behind interaction constraints is simple. Users have prior knowledge about
relations between different features and encode it as constraints during model
construction. But there are also some subtleties around specifying constraints. Take the
constraint ``[[1, 2], [2, 3, 4]]`` as an example. Feature ``2`` appears in two
different interaction sets, ``[1, 2]`` and ``[2, 3, 4]``, so the union set of features
allowed to interact with ``2`` is ``{1, 3, 4}``. In the following diagram, the root splits at
feature ``2``. Because all its descendants should be able to interact with it, all 4
features are legitimate split candidates at the second layer, regardless of the
specified constraint sets.
.. plot::
:nofigs:
from graphviz import Source
source = r"""
digraph feature_interaction_illustration4 {
graph [fontname = "helvetica"];
node [fontname = "helvetica"];
edge [fontname = "helvetica"];
0 [label=<x<SUB><FONT POINT-SIZE="11">2</FONT></SUB>>, shape=box, color=black, fontcolor=black];
1 [label=<x<SUB><FONT POINT-SIZE="11">{1, 2, 3, 4}</FONT></SUB>>, shape=box];
2 [label=<x<SUB><FONT POINT-SIZE="11">{1, 2, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
3 [label="...", shape=none];
4 [label="...", shape=none];
5 [label="...", shape=none];
6 [label="...", shape=none];
0 -> 1;
0 -> 2;
1 -> 3;
1 -> 4;
2 -> 5;
2 -> 6;
}
"""
Source(source, format='png').render('../_static/feature_interaction_illustration4', view=False)
Source(source, format='svg').render('../_static/feature_interaction_illustration5', view=False)
.. figure:: ../_static/feature_interaction_illustration4.png
:align: center
:figwidth: 80 %
``{1, 2, 3, 4}`` represents the set of legitimate split features.
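The rule described in this example can be sketched in a few lines of Python. This is only an illustration of the behaviour described in the text, not XGBoost's internal implementation; the function name ``legitimate_split_features`` is hypothetical, and features that appear in no constraint set are ignored for simplicity.

```python
def legitimate_split_features(constraints, path_features):
    """Union of every constraint set containing a feature already used on the path."""
    allowed = set()
    for feature in path_features:
        for group in constraints:
            if feature in group:
                allowed.update(group)
    return allowed


constraints = [[1, 2], [2, 3, 4]]
# The root splits at feature 2, so its children may use any feature
# that shares a constraint set with 2.
candidates = legitimate_split_features(constraints, [2])
print(sorted(candidates))        # [1, 2, 3, 4]
print(sorted(candidates - {2}))  # [1, 3, 4]
```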
This leads to some interesting implications of feature interaction constraints. Take
``[[0, 1], [0, 1, 2], [1, 2]]`` as another example. Assuming we have only 3 available
features in our training dataset for presentation purposes, careful readers might have
noticed that the above constraint is the same as ``[0, 1, 2]``: no matter which
feature is chosen for the split in the root node, all its descendants have to include every
feature as a legitimate split candidate to avoid violating the interaction constraints.
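The equivalence claimed here can be checked with a short sketch that compares the candidate sets produced by both specifications for every possible root feature. The helper names are hypothetical, and the candidate rule is the one described in the text rather than XGBoost's actual code.

```python
def legitimate_split_features(constraints, path_features):
    """Union of the constraint sets that contain any feature used on the path."""
    groups = (g for g in constraints if any(f in g for f in path_features))
    return set().union(*groups)


def same_candidates(first, second, n_features):
    """True if both specifications allow the same candidates for every root feature."""
    return all(
        legitimate_split_features(first, [f]) == legitimate_split_features(second, [f])
        for f in range(n_features)
    )


overlapping = [[0, 1], [0, 1, 2], [1, 2]]
merged = [[0, 1, 2]]
print(same_candidates(overlapping, merged, n_features=3))  # True
```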
For one last example, we use ``[[0, 1], [1, 3, 4]]`` and choose feature ``0`` as the split for
the root node. At the second layer of the built tree, ``1`` is the only legitimate split
candidate besides ``0`` itself, since they belong to the same constraint set.
Following the grow path of our example tree below, the node at the second layer splits at
feature ``1``. But because ``1`` also belongs to the second constraint set ``[1,
3, 4]``, at the third layer we need to include all features as candidates to comply with its
ancestors.
.. plot::
:nofigs:
from graphviz import Source
source = r"""
digraph feature_interaction_illustration5 {
graph [fontname = "helvetica"];
node [fontname = "helvetica"];
edge [fontname = "helvetica"];
0 [label=<x<SUB><FONT POINT-SIZE="11">0</FONT></SUB>>, shape=box, color=black, fontcolor=black];
1 [label="...", shape=none];
2 [label=<x<SUB><FONT POINT-SIZE="11">1</FONT></SUB>>, shape=box, color=black, fontcolor=black];
3 [label=<x<SUB><FONT POINT-SIZE="11">{0, 1, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
4 [label=<x<SUB><FONT POINT-SIZE="11">{0, 1, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
5 [label="...", shape=none];
6 [label="...", shape=none];
7 [label="...", shape=none];
8 [label="...", shape=none];
0 -> 1;
0 -> 2;
2 -> 3;
2 -> 4;
3 -> 5;
3 -> 6;
4 -> 7;
4 -> 8;
}
"""
Source(source, format='png').render('../_static/feature_interaction_illustration6', view=False)
Source(source, format='svg').render('../_static/feature_interaction_illustration7', view=False)
.. figure:: ../_static/feature_interaction_illustration6.png
:align: center
:figwidth: 80 %
``{0, 1, 3, 4}`` represents the set of legitimate split features.
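The grow path in this last example can be traced layer by layer with the sketch below. It follows the rule as described in the text (candidates are the union of all constraint sets touched by features on the path) and is not taken from XGBoost's source; ``legitimate_split_features`` is a hypothetical helper.

```python
def legitimate_split_features(constraints, path_features):
    """Union of the constraint sets that contain any feature used on the path."""
    groups = (g for g in constraints if any(f in g for f in path_features))
    return set().union(*groups)


constraints = [[0, 1], [1, 3, 4]]
path = []
trace = {}
# The root splits at feature 0, then the second-layer node splits at feature 1.
for layer, split_feature in enumerate([0, 1], start=2):
    path.append(split_feature)
    trace[layer] = sorted(legitimate_split_features(constraints, path))
    print("layer", layer, "candidates:", trace[layer])
# layer 2 candidates: [0, 1]
# layer 3 candidates: [0, 1, 3, 4]
```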