Update doc for feature constraints and n_gpus. (#4596)
* Update doc for feature constraints.
* Fix some warnings.
* Clean up doc for `n_gpus`.
This commit is contained in:
parent 9fa29ad753
commit 2cff735126
@@ -50,7 +50,7 @@ Supported parameters
+--------------------------------+----------------------------+--------------+
| ``gpu_id``                     | |tick|                     | |tick|       |
+--------------------------------+----------------------------+--------------+
| ``n_gpus``                     | |cross|                    | |tick|       |
| ``n_gpus`` (deprecated)        | |cross|                    | |tick|       |
+--------------------------------+----------------------------+--------------+
| ``predictor``                  | |tick|                     | |tick|       |
+--------------------------------+----------------------------+--------------+
@@ -58,6 +58,8 @@ Supported parameters
+--------------------------------+----------------------------+--------------+
| ``monotone_constraints``       | |cross|                    | |tick|       |
+--------------------------------+----------------------------+--------------+
| ``interaction_constraints``    | |cross|                    | |tick|       |
+--------------------------------+----------------------------+--------------+
| ``single_precision_histogram`` | |cross|                    | |tick|       |
+--------------------------------+----------------------------+--------------+
@@ -65,7 +67,8 @@ GPU accelerated prediction is enabled by default for the above mentioned ``tree_

The experimental parameter ``single_precision_histogram`` can be set to True to enable building histograms using single precision. This may improve speed, in particular on older architectures.

The device ordinal can be selected using the ``gpu_id`` parameter, which defaults to 0.
The device ordinal (which GPU to use if you have many of them) can be selected using the
``gpu_id`` parameter, which defaults to 0 (the first device reported by the CUDA runtime).

The GPU algorithms currently work with the CLI, Python and R packages. See :doc:`/build` for details.
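As a minimal sketch of how these parameters fit together in the Python package (the synthetic data and parameter values below are assumptions chosen only for illustration):

.. code-block:: python

   import numpy as np
   import xgboost as xgb

   # Synthetic data, only to make the sketch self-contained.
   X = np.random.rand(1000, 10)
   y = np.random.randint(2, size=1000)
   dtrain = xgb.DMatrix(X, label=y)

   params = {
       'tree_method': 'gpu_hist',           # GPU histogram tree construction
       'gpu_id': 0,                         # device ordinal, defaults to 0
       'single_precision_histogram': True,  # experimental single-precision histograms
       'objective': 'binary:logistic',
   }
   bst = xgb.train(params, dtrain, num_boost_round=10)
   preds = bst.predict(dtrain)              # prediction also runs on the GPU by default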

@@ -80,15 +83,7 @@ The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/bu

Single Node Multi-GPU
=====================
.. note:: Single node multi-GPU training is deprecated. Please use distributed GPU training with one process per GPU.

Multiple GPUs can be used with the ``gpu_hist`` tree method using the ``n_gpus`` parameter, which defaults to 1. If this is set to -1, all available GPUs will be used. If ``gpu_id`` is specified as non-zero, the selected GPU devices will be from ``gpu_id`` to ``gpu_id + n_gpus``; note that ``gpu_id + n_gpus`` must be less than or equal to the number of available GPUs on your system. As with GPU vs. CPU, multi-GPU will not always be faster than a single GPU due to PCI bus bandwidth that can limit performance.

.. note:: Enabling multi-GPU training

   Default installation may not enable multi-GPU training. To use multiple GPUs, make sure to read :ref:`build_gpu_support`.
XGBoost supports multi-GPU training on a single machine by specifying the ``n_gpus`` parameter.

.. note:: Single node multi-GPU training with the ``n_gpus`` parameter is deprecated after 0.90. Please use distributed GPU training with one process per GPU.
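For reference, a sketch of the deprecated single-node multi-GPU interface described above, reusing ``dtrain`` from the previous sketch (the parameter values are illustrative assumptions; new code should prefer distributed training with one process per GPU):

.. code-block:: python

   # Deprecated after 0.90; shown only to illustrate the semantics described above.
   params = {
       'tree_method': 'gpu_hist',
       'gpu_id': 0,     # first device to use
       'n_gpus': -1,    # -1 means use all available GPUs
   }
   bst = xgb.train(params, dtrain, num_boost_round=10)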

Multi-node Multi-GPU Training
=============================
@@ -101,66 +96,64 @@ Objective functions
===================
Most of the objective functions implemented in XGBoost can be run on GPU. The following table shows current support status.

.. |tick| unicode:: U+2714
.. |cross| unicode:: U+2718
+--------------------+-------------+
| Objectives         | GPU support |
+--------------------+-------------+
| reg:squarederror   | |tick|      |
+--------------------+-------------+
| reg:squaredlogerror| |tick|      |
+--------------------+-------------+
| reg:logistic       | |tick|      |
+--------------------+-------------+
| binary:logistic    | |tick|      |
+--------------------+-------------+
| binary:logitraw    | |tick|      |
+--------------------+-------------+
| binary:hinge       | |tick|      |
+--------------------+-------------+
| count:poisson      | |tick|      |
+--------------------+-------------+
| reg:gamma          | |tick|      |
+--------------------+-------------+
| reg:tweedie        | |tick|      |
+--------------------+-------------+
| multi:softmax      | |tick|      |
+--------------------+-------------+
| multi:softprob     | |tick|      |
+--------------------+-------------+
| survival:cox       | |cross|     |
+--------------------+-------------+
| rank:pairwise      | |cross|     |
+--------------------+-------------+
| rank:ndcg          | |cross|     |
+--------------------+-------------+
| rank:map           | |cross|     |
+--------------------+-------------+

+-----------------+-------------+
| Objectives      | GPU support |
+-----------------+-------------+
| reg:squarederror| |tick|      |
+-----------------+-------------+
| reg:logistic    | |tick|      |
+-----------------+-------------+
| binary:logistic | |tick|      |
+-----------------+-------------+
| binary:logitraw | |tick|      |
+-----------------+-------------+
| binary:hinge    | |tick|      |
+-----------------+-------------+
| count:poisson   | |tick|      |
+-----------------+-------------+
| reg:gamma       | |tick|      |
+-----------------+-------------+
| reg:tweedie     | |tick|      |
+-----------------+-------------+
| multi:softmax   | |tick|      |
+-----------------+-------------+
| multi:softprob  | |tick|      |
+-----------------+-------------+
| survival:cox    | |cross|     |
+-----------------+-------------+
| rank:pairwise   | |cross|     |
+-----------------+-------------+
| rank:ndcg       | |cross|     |
+-----------------+-------------+
| rank:map        | |cross|     |
+-----------------+-------------+

For multi-GPU support, objective functions also honor the ``n_gpus`` parameter,
which by default is set to 1. To disable running objectives on the GPU, just set
``n_gpus`` to 0.
Objectives will run on the GPU if a GPU updater (``gpu_hist``) is used; otherwise they will run
on the CPU by default. For unsupported objectives, XGBoost will fall back to the CPU
implementation by default.
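A short sketch of the behaviour described above (the multi-class data set-up is an assumption; only the parameter combination matters here):

.. code-block:: python

   import numpy as np
   import xgboost as xgb

   X = np.random.rand(500, 10)
   y = np.random.randint(3, size=500)   # three synthetic classes
   dtrain_mc = xgb.DMatrix(X, label=y)

   params = {
       'tree_method': 'gpu_hist',       # GPU updater, so the objective follows it
       'objective': 'multi:softprob',   # supported on GPU per the table above
       'num_class': 3,
   }
   bst = xgb.train(params, dtrain_mc, num_boost_round=10)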

Metric functions
===================
The following table shows current support status for evaluation metrics on the GPU.

.. |tick| unicode:: U+2714
.. |cross| unicode:: U+2718

+-----------------+-------------+
| Metric          | GPU Support |
+=================+=============+
| rmse            | |tick|      |
+-----------------+-------------+
| rmsle           | |tick|      |
+-----------------+-------------+
| mae             | |tick|      |
+-----------------+-------------+
| logloss         | |tick|      |
+-----------------+-------------+
| error           | |tick|      |
+-----------------+-------------+
| merror          | |cross|     |
| merror          | |tick|      |
+-----------------+-------------+
| mlogloss        | |cross|     |
| mlogloss        | |tick|      |
+-----------------+-------------+
| auc             | |cross|     |
+-----------------+-------------+
@@ -181,10 +174,8 @@ Following table shows current support status for evaluation metrics on the GPU.
| tweedie-nloglik | |tick|      |
+-----------------+-------------+

As for objective functions, metrics honor the ``n_gpus`` parameter,
which by default is set to 1. To disable running metrics on the GPU, just set
``n_gpus`` to 0.

Similar to objective functions, the default device for metrics is selected based on the tree
updater and predictor (which is itself selected based on the tree updater).
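A sketch of how an evaluation metric follows the same device selection, reusing ``dtrain`` from the earlier sketch (the watchlist is an assumption for illustration):

.. code-block:: python

   # With gpu_hist, a supported metric such as rmse is also evaluated on the GPU.
   params = {
       'tree_method': 'gpu_hist',
       'objective': 'reg:squarederror',
       'eval_metric': 'rmse',
   }
   bst = xgb.train(params, dtrain, num_boost_round=10,
                   evals=[(dtrain, 'train')])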

Benchmarks
==========

@@ -171,7 +171,107 @@ parameter:
num_boost_round = 1000, evals = evallist,
early_stopping_rounds = 10)

**Choice of tree construction algorithm**. To use feature interaction
constraints, be sure to set the ``tree_method`` parameter to either ``exact``
or ``hist``. Currently, GPU algorithms (``gpu_hist``, ``gpu_exact``) do not
support feature interaction constraints.
**Choice of tree construction algorithm**. To use feature interaction constraints, be sure
to set the ``tree_method`` parameter to one of the following: ``exact``, ``hist`` or
``gpu_hist``. Support for ``gpu_hist`` was added after version 0.90 (it is not available in 0.90 or earlier releases).
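A brief sketch of passing the constraints together with a compatible ``tree_method`` (the constraint sets here are illustrative assumptions; ``dtrain`` and ``evallist`` are reused from the snippet above):

.. code-block:: python

   # Interaction constraints are given as a string encoding a nested list of
   # feature indices; exact, hist, or gpu_hist (after 0.90) may be used.
   params_constrained = {
       'tree_method': 'hist',
       'interaction_constraints': '[[0, 1], [2, 3, 4]]',
   }
   bst = xgb.train(params_constrained, dtrain,
                   num_boost_round=1000, evals=evallist,
                   early_stopping_rounds=10)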


**************
Advanced topic
**************

The intuition behind interaction constraints is simple. Users may have prior knowledge about
relations between different features, and encode it as constraints during model
construction. But there are also some subtleties around specifying constraints. Take the
constraint ``[[1, 2], [2, 3, 4]]`` as an example: the second feature appears in two
different interaction sets, ``[1, 2]`` and ``[2, 3, 4]``, so the union set of features
allowed to interact with ``2`` is ``{1, 3, 4}``. In the following diagram, the root splits at
feature ``2``. Because all its descendants should be able to interact with it, at the
second layer all 4 features are legitimate split candidates for further splitting,
disregarding the specified constraint sets.

.. plot::
   :nofigs:

   from graphviz import Source
   source = r"""
   digraph feature_interaction_illustration4 {
     graph [fontname = "helvetica"];
     node [fontname = "helvetica"];
     edge [fontname = "helvetica"];
     0 [label=<x<SUB><FONT POINT-SIZE="11">2</FONT></SUB>>, shape=box, color=black, fontcolor=black];
     1 [label=<x<SUB><FONT POINT-SIZE="11">{1, 2, 3, 4}</FONT></SUB>>, shape=box];
     2 [label=<x<SUB><FONT POINT-SIZE="11">{1, 2, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
     3 [label="...", shape=none];
     4 [label="...", shape=none];
     5 [label="...", shape=none];
     6 [label="...", shape=none];
     0 -> 1;
     0 -> 2;
     1 -> 3;
     1 -> 4;
     2 -> 5;
     2 -> 6;
   }
   """
   Source(source, format='png').render('../_static/feature_interaction_illustration4', view=False)
   Source(source, format='svg').render('../_static/feature_interaction_illustration5', view=False)

.. figure:: ../_static/feature_interaction_illustration4.png
   :align: center
   :figwidth: 80 %

   ``{1, 2, 3, 4}`` represents the sets of legitimate split features.
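To make the union-set rule concrete, a small self-contained sketch (plain Python, not part of the XGBoost API) that computes which features are allowed to interact with a given feature:

.. code-block:: python

   def allowed_interactions(constraints, feature):
       """Union of all constraint sets containing the feature, minus the feature itself."""
       allowed = set()
       for group in constraints:
           if feature in group:
               allowed.update(group)
       allowed.discard(feature)
       return allowed

   # The constraint from above: feature 2 may interact with {1, 3, 4}.
   print(allowed_interactions([[1, 2], [2, 3, 4]], 2))          # {1, 3, 4}
   # With the overlapping sets discussed next, every feature may interact with every other.
   print(allowed_interactions([[0, 1], [0, 1, 2], [1, 2]], 0))  # {1, 2}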

This leads to some interesting implications of feature interaction constraints. Take
``[[0, 1], [0, 1, 2], [1, 2]]`` as another example. Assuming we have only 3 available
features in our training datasets for presentation purposes, careful readers might have
noticed that the above constraint is the same as ``[0, 1, 2]``: no matter which
feature is chosen for the split in the root node, all its descendants have to include every
feature as legitimate split candidates to avoid violating the interaction constraints.

For one last example, we use ``[[0, 1], [1, 3, 4]]`` and choose feature ``0`` as the split for
the root node. At the second layer of the built tree, ``1`` is the only legitimate split
candidate except for ``0`` itself, since they belong to the same constraint set.
Following the grow path of our example tree below, the node at the second layer splits at
feature ``1``. But because ``1`` also belongs to the second constraint set ``[1,
3, 4]``, at the third layer we need to include all features as candidates to comply with its
ancestors.

.. plot::
   :nofigs:

   from graphviz import Source
   source = r"""
   digraph feature_interaction_illustration5 {
     graph [fontname = "helvetica"];
     node [fontname = "helvetica"];
     edge [fontname = "helvetica"];
     0 [label=<x<SUB><FONT POINT-SIZE="11">0</FONT></SUB>>, shape=box, color=black, fontcolor=black];
     1 [label="...", shape=none];
     2 [label=<x<SUB><FONT POINT-SIZE="11">1</FONT></SUB>>, shape=box, color=black, fontcolor=black];
     3 [label=<x<SUB><FONT POINT-SIZE="11">{0, 1, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
     4 [label=<x<SUB><FONT POINT-SIZE="11">{0, 1, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
     5 [label="...", shape=none];
     6 [label="...", shape=none];
     7 [label="...", shape=none];
     8 [label="...", shape=none];
     0 -> 1;
     0 -> 2;
     2 -> 3;
     2 -> 4;
     3 -> 5;
     3 -> 6;
     4 -> 7;
     4 -> 8;
   }
   """
   Source(source, format='png').render('../_static/feature_interaction_illustration6', view=False)
   Source(source, format='svg').render('../_static/feature_interaction_illustration7', view=False)


.. figure:: ../_static/feature_interaction_illustration6.png
   :align: center
   :figwidth: 80 %

   ``{0, 1, 3, 4}`` represents the sets of legitimate split features.

@@ -101,24 +101,25 @@ def _run_with_rabit(rabit_args, func, *args):


def run(client, func, *args):
    """
    Launch arbitrary function on dask workers. Workers are connected by rabit, allowing
    distributed training. The environment variable OMP_NUM_THREADS is defined on each worker
    according to dask - this means that calls to xgb.train() will use the threads allocated by
    dask by default, unless the user overrides the nthread parameter.
    """Launch arbitrary function on dask workers. Workers are connected by rabit,
    allowing distributed training. The environment variable OMP_NUM_THREADS is
    defined on each worker according to dask - this means that calls to
    xgb.train() will use the threads allocated by dask by default, unless the
    user overrides the nthread parameter.

    Note: Windows platforms are not officially supported. Contributions are welcome here.
    Note: Windows platforms are not officially
    supported. Contributions are welcome here.

    :param client: Dask client representing the cluster
    :param func: Python function to be executed by each worker. Typically contains xgboost
        training code.
    :param func: Python function to be executed by each worker. Typically
        contains xgboost training code.
    :param args: Arguments to be forwarded to func
    :return: Dict containing the function return value for each worker

    """
    if platform.system() == 'Windows':
        logging.warning(
            'Windows is not officially supported for dask/xgboost integration. Contributions '
            'welcome.')
        logging.warning('Windows is not officially supported for dask/xgboost '
                        'integration. Contributions welcome.')
    workers = list(client.scheduler_info()['workers'].keys())
    env = client.run(_start_tracker, len(workers), workers=[workers[0]])
    rabit_args = [('%s=%s' % item).encode() for item in env[workers[0]].items()]
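A sketch of how this helper might be invoked; the cluster setup, the synthetic data inside ``train_fn`` and the assumption that this module is importable as ``xgboost.dask`` are all illustrative:

.. code-block:: python

   from dask.distributed import Client, LocalCluster

   import xgboost as xgb
   from xgboost import dask as xgb_dask  # assumed import path for this module


   def train_fn():
       # Runs on every worker inside the rabit ring; synthetic data as a stand-in.
       import numpy as np
       X = np.random.rand(1000, 10)
       y = np.random.randint(2, size=1000)
       dtrain = xgb.DMatrix(X, label=y)
       return xgb.train({'objective': 'binary:logistic'}, dtrain, num_boost_round=10)


   if __name__ == '__main__':
       client = Client(LocalCluster(n_workers=2))
       results = xgb_dask.run(client, train_fn)  # one entry per worker with its return value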

@@ -184,18 +184,16 @@ def to_graphviz(booster, fmap='', num_trees=0, rankdir='UT',
    no_color : str, default '#FF0000'
        Edge color when the node condition is not met.
    condition_node_params : dict (optional)
        condition node configuration,
        {'shape':'box',
         'style':'filled,rounded',
         'fillcolor':'#78bceb'
        }
        condition node configuration,
        {'shape':'box',
         'style':'filled,rounded',
         'fillcolor':'#78bceb'}

    leaf_node_params : dict (optional)
        leaf node configuration
        {'shape':'box',
         'style':'filled',
         'fillcolor':'#e48038'
        }
         'style':'filled',
         'fillcolor':'#e48038'}

    kwargs :
        Other keywords passed to graphviz graph_attr
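A usage sketch of these options (the tiny synthetic model is an assumption; the styling values mirror the defaults documented above):

.. code-block:: python

   import numpy as np
   import xgboost as xgb

   X = np.random.rand(100, 4)
   y = np.random.randint(2, size=100)
   bst = xgb.train({'objective': 'binary:logistic'}, xgb.DMatrix(X, label=y),
                   num_boost_round=3)

   graph = xgb.to_graphviz(
       bst, num_trees=0,
       condition_node_params={'shape': 'box', 'style': 'filled,rounded',
                              'fillcolor': '#78bceb'},
       leaf_node_params={'shape': 'box', 'style': 'filled',
                         'fillcolor': '#e48038'})
   graph.render('tree0')  # writes the rendered tree next to the script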

@@ -105,8 +105,8 @@ class XGBModel(XGBModelBase):
        Value in the data which needs to be present as a missing value. If
        None, defaults to np.nan.
    importance_type: string, default "gain"
        The feature importance type for the feature_importances_ property: either "gain",
        "weight", "cover", "total_gain" or "total_cover".
        The feature importance type for the feature_importances\\_ property:
        either "gain", "weight", "cover", "total_gain" or "total_cover".
    \\*\\*kwargs : dict, optional
        Keyword arguments for XGBoost Booster object. Full documentation of parameters can
        be found here: https://github.com/dmlc/xgboost/blob/master/doc/parameter.rst.
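A short sketch of how ``importance_type`` surfaces through the scikit-learn wrapper (the data here is synthetic, for illustration only):

.. code-block:: python

   import numpy as np
   import xgboost as xgb

   X = np.random.rand(200, 5)
   y = np.random.randint(2, size=200)

   model = xgb.XGBClassifier(importance_type='total_gain')  # used by feature_importances_
   model.fit(X, y)
   print(model.feature_importances_)  # one value per feature, based on total gain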