Update doc for feature constraints and n_gpus. (#4596)

* Update doc for feature constraints. 

* Fix some warnings.

* Clean up doc for `n_gpus`.
Jiaming Yuan 2019-06-23 14:37:22 +08:00 committed by GitHub
parent 9fa29ad753
commit 2cff735126
5 changed files with 172 additions and 82 deletions


@@ -50,7 +50,7 @@ Supported parameters
+--------------------------------+----------------------------+--------------+
| ``gpu_id`` | |tick| | |tick| |
+--------------------------------+----------------------------+--------------+
| ``n_gpus`` | |cross| | |tick| |
| ``n_gpus`` (deprecated) | |cross| | |tick| |
+--------------------------------+----------------------------+--------------+
| ``predictor`` | |tick| | |tick| |
+--------------------------------+----------------------------+--------------+
@@ -58,6 +58,8 @@ Supported parameters
+--------------------------------+----------------------------+--------------+
| ``monotone_constraints`` | |cross| | |tick| |
+--------------------------------+----------------------------+--------------+
| ``interaction_constraints`` | |cross| | |tick| |
+--------------------------------+----------------------------+--------------+
| ``single_precision_histogram`` | |cross| | |tick| |
+--------------------------------+----------------------------+--------------+
@@ -65,7 +67,8 @@ GPU accelerated prediction is enabled by default for the above mentioned ``tree_
The experimental parameter ``single_precision_histogram`` can be set to True to enable building histograms using single precision. This may improve speed, in particular on older architectures.
The device ordinal can be selected using the ``gpu_id`` parameter, which defaults to 0.
The device ordinal (which GPU to use if you have many of them) can be selected using the
``gpu_id`` parameter, which defaults to 0 (the first device reported by CUDA runtime).
The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/build` for details.
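As a quick illustration, here is a minimal sketch of these parameters from Python, assuming ``dtrain`` is a pre-constructed ``DMatrix``:

.. code-block:: python

  import xgboost as xgb

  # Train with the GPU histogram updater on the second CUDA device, with the
  # experimental single-precision histogram enabled; both parameters are
  # documented in the table above.
  params = {'tree_method': 'gpu_hist',
            'gpu_id': 1,
            'single_precision_histogram': True}
  bst = xgb.train(params, dtrain)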
@@ -80,15 +83,7 @@ The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/bu
Single Node Multi-GPU
=====================
.. note:: Single node multi-GPU training is deprecated. Please use distributed GPU training with one process per GPU.
Multiple GPUs can be used with the ``gpu_hist`` tree method using the ``n_gpus`` parameter, which defaults to 1. If this is set to -1, all available GPUs will be used. If ``gpu_id`` is specified as non-zero, the selected GPU devices will be from ``gpu_id`` to ``gpu_id+n_gpus``; note that ``gpu_id+n_gpus`` must be less than or equal to the number of available GPUs on your system. As with GPU vs. CPU, multi-GPU will not always be faster than a single GPU, since PCI bus bandwidth can limit performance.
.. note:: Enabling multi-GPU training
Default installation may not enable multi-GPU training. To use multiple GPUs, make sure to read :ref:`build_gpu_support`.
XGBoost supports multi-GPU training on a single machine by specifying the ``n_gpus`` parameter.
.. note:: Single node multi-GPU training with the ``n_gpus`` parameter is deprecated after 0.90. Please use distributed GPU training with one process per GPU.
Multi-node Multi-GPU Training
=============================
@@ -101,66 +96,64 @@ Objective functions
===================
Most of the objective functions implemented in XGBoost can be run on the GPU. The following table shows the current support status.
.. |tick| unicode:: U+2714
.. |cross| unicode:: U+2718
+--------------------+-------------+
| Objectives | GPU support |
+--------------------+-------------+
| reg:squarederror | |tick| |
+--------------------+-------------+
| reg:squaredlogerror| |tick| |
+--------------------+-------------+
| reg:logistic | |tick| |
+--------------------+-------------+
| binary:logistic | |tick| |
+--------------------+-------------+
| binary:logitraw | |tick| |
+--------------------+-------------+
| binary:hinge | |tick| |
+--------------------+-------------+
| count:poisson | |tick| |
+--------------------+-------------+
| reg:gamma | |tick| |
+--------------------+-------------+
| reg:tweedie | |tick| |
+--------------------+-------------+
| multi:softmax | |tick| |
+--------------------+-------------+
| multi:softprob | |tick| |
+--------------------+-------------+
| survival:cox | |cross| |
+--------------------+-------------+
| rank:pairwise | |cross| |
+--------------------+-------------+
| rank:ndcg | |cross| |
+--------------------+-------------+
| rank:map | |cross| |
+--------------------+-------------+
+-----------------+-------------+
| Objectives | GPU support |
+-----------------+-------------+
| reg:squarederror| |tick| |
+-----------------+-------------+
| reg:logistic | |tick| |
+-----------------+-------------+
| binary:logistic | |tick| |
+-----------------+-------------+
| binary:logitraw | |tick| |
+-----------------+-------------+
| binary:hinge | |tick| |
+-----------------+-------------+
| count:poisson | |tick| |
+-----------------+-------------+
| reg:gamma | |tick| |
+-----------------+-------------+
| reg:tweedie | |tick| |
+-----------------+-------------+
| multi:softmax | |tick| |
+-----------------+-------------+
| multi:softprob | |tick| |
+-----------------+-------------+
| survival:cox | |cross| |
+-----------------+-------------+
| rank:pairwise | |cross| |
+-----------------+-------------+
| rank:ndcg | |cross| |
+-----------------+-------------+
| rank:map | |cross| |
+-----------------+-------------+
For multi-GPU support, objective functions also honor the ``n_gpus`` parameter, which by default is set to 1. To disable running objectives on the GPU, just set ``n_gpus`` to 0.
Objectives will run on the GPU if a GPU updater (``gpu_hist``) is used; otherwise they will run on the CPU by default. For unsupported objectives, XGBoost will fall back to the CPU implementation by default.
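For instance, a sketch combining the GPU updater with a supported objective from the table above (``dtrain`` assumed as before):

.. code-block:: python

  import xgboost as xgb

  # count:poisson is GPU-supported per the table above. With an unsupported
  # objective such as rank:pairwise, training still works, but the objective
  # is evaluated on the CPU instead.
  params = {'tree_method': 'gpu_hist', 'objective': 'count:poisson'}
  bst = xgb.train(params, dtrain)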
Metric functions
===================
The following table shows the current support status for evaluation metrics on the GPU.
.. |tick| unicode:: U+2714
.. |cross| unicode:: U+2718
+-----------------+-------------+
| Metric | GPU Support |
+=================+=============+
| rmse | |tick| |
+-----------------+-------------+
| rmsle | |tick| |
+-----------------+-------------+
| mae | |tick| |
+-----------------+-------------+
| logloss | |tick| |
+-----------------+-------------+
| error | |tick| |
+-----------------+-------------+
| merror | |cross| |
| merror | |tick| |
+-----------------+-------------+
| mlogloss | |cross| |
| mlogloss | |tick| |
+-----------------+-------------+
| auc | |cross| |
+-----------------+-------------+
@@ -181,10 +174,8 @@ Following table shows current support status for evaluation metrics on the GPU.
| tweedie-nloglik | |tick| |
+-----------------+-------------+
As with objective functions, metrics honor the ``n_gpus`` parameter, which by default is set to 1. To disable running metrics on the GPU, just set ``n_gpus`` to 0.
Similar to objective functions, the default device for metrics is selected based on the tree updater and the predictor (which is itself selected based on the tree updater).
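As a sketch, a GPU-supported metric from the table above can be requested explicitly (``dtrain`` and ``evallist`` as in earlier examples):

.. code-block:: python

  import xgboost as xgb

  # rmse is GPU-supported per the table above; by default it is computed on
  # the same device as the updater and predictor.
  params = {'tree_method': 'gpu_hist', 'eval_metric': 'rmse'}
  bst = xgb.train(params, dtrain, evals=evallist)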
Benchmarks
==========


@@ -171,7 +171,107 @@ parameter:
num_boost_round = 1000, evals = evallist,
early_stopping_rounds = 10)
**Choice of tree construction algorithm**. To use feature interaction
constraints, be sure to set the ``tree_method`` parameter to either ``exact``
or ``hist``. Currently, GPU algorithms (``gpu_hist``, ``gpu_exact``) do not
support feature interaction constraints.
**Choice of tree construction algorithm**. To use feature interaction constraints, be sure
to set the ``tree_method`` parameter to one of the following: ``exact``, ``hist`` or
``gpu_hist``. Support for ``gpu_hist`` was added after (and not including) version 0.90.
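For example, a minimal sketch with the native Python API, assuming ``dtrain`` and ``evallist`` are defined as earlier in this tutorial, and that the constraint is passed as a string encoding of the nested list as elsewhere in this tutorial:

.. code-block:: python

  import xgboost as xgb

  # Features 1 and 2 may interact, and features 2, 3 and 4 may interact.
  params = {'tree_method': 'hist',
            'interaction_constraints': '[[1, 2], [2, 3, 4]]'}
  bst = xgb.train(params, dtrain,
                  num_boost_round=1000, evals=evallist,
                  early_stopping_rounds=10)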
**************
Advanced topic
**************
The intuition behind interaction constraints is simple. Users have prior knowledge about
the relations between different features, and encode it as constraints during model
construction. But there are also some subtleties around specifying constraints. Take the
constraint ``[[1, 2], [2, 3, 4]]`` as an example: feature ``2`` appears in two
different interaction sets, ``[1, 2]`` and ``[2, 3, 4]``, so the union set of features
allowed to interact with ``2`` is ``{1, 3, 4}``. In the following diagram, the root splits at
feature ``2``. Because all its descendants should be able to interact with it, at the
second layer all 4 features are legitimate split candidates for further splitting,
disregarding the specified constraint sets.
.. plot::
  :nofigs:

  from graphviz import Source
  source = r"""
  digraph feature_interaction_illustration4 {
      graph [fontname = "helvetica"];
      node [fontname = "helvetica"];
      edge [fontname = "helvetica"];
      0 [label=<x<SUB><FONT POINT-SIZE="11">2</FONT></SUB>>, shape=box, color=black, fontcolor=black];
      1 [label=<x<SUB><FONT POINT-SIZE="11">{1, 2, 3, 4}</FONT></SUB>>, shape=box];
      2 [label=<x<SUB><FONT POINT-SIZE="11">{1, 2, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
      3 [label="...", shape=none];
      4 [label="...", shape=none];
      5 [label="...", shape=none];
      6 [label="...", shape=none];
      0 -> 1;
      0 -> 2;
      1 -> 3;
      1 -> 4;
      2 -> 5;
      2 -> 6;
  }
  """
  Source(source, format='png').render('../_static/feature_interaction_illustration4', view=False)
  Source(source, format='svg').render('../_static/feature_interaction_illustration5', view=False)
.. figure:: ../_static/feature_interaction_illustration4.png
  :align: center
  :figwidth: 80 %

  ``{1, 2, 3, 4}`` represents the sets of legitimate split features.
This leads to some interesting implications of feature interaction constraints. Take
``[[0, 1], [0, 1, 2], [1, 2]]`` as another example. Assuming we have only 3 available
features in our training dataset for presentation purposes, careful readers might have
noticed that the above constraint is the same as ``[0, 1, 2]``: no matter which
feature is chosen for the split at the root node, all its descendants have to include every
feature as legitimate split candidates to avoid violating the interaction constraints.
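As a rough sketch of this reasoning (the helper below is purely illustrative and not part of the XGBoost API), the set of legitimate split candidates below a split can be thought of as the union of every constraint set containing a feature already used on the path:

.. code-block:: python

  def allowed_features(constraints, used_features):
      """Union of all constraint sets containing any already-used feature."""
      allowed = set()
      for fs in constraints:
          if any(f in fs for f in used_features):
              allowed.update(fs)
      return allowed

  # Feature 2 appears in both [1, 2] and [2, 3, 4], so after the root splits
  # on 2, every feature in {1, 2, 3, 4} is a legitimate candidate.
  print(allowed_features([[1, 2], [2, 3, 4]], {2}))          # {1, 2, 3, 4}
  # With [[0, 1], [0, 1, 2], [1, 2]], any root feature yields {0, 1, 2}.
  print(allowed_features([[0, 1], [0, 1, 2], [1, 2]], {0}))  # {0, 1, 2}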
For one last example, we use ``[[0, 1], [1, 3, 4]]`` and choose feature ``0`` as the split for
the root node. At the second layer of the built tree, ``1`` is the only legitimate split
candidate except for ``0`` itself, since they belong to the same constraint set.
Following the grow path of our example tree below, the node at the second layer splits at
feature ``1``. But because ``1`` also belongs to the second constraint set ``[1,
3, 4]``, at the third layer the legitimate split candidates expand to ``{0, 1, 3, 4}``
to comply with its ancestors.
.. plot::
  :nofigs:

  from graphviz import Source
  source = r"""
  digraph feature_interaction_illustration5 {
      graph [fontname = "helvetica"];
      node [fontname = "helvetica"];
      edge [fontname = "helvetica"];
      0 [label=<x<SUB><FONT POINT-SIZE="11">0</FONT></SUB>>, shape=box, color=black, fontcolor=black];
      1 [label="...", shape=none];
      2 [label=<x<SUB><FONT POINT-SIZE="11">1</FONT></SUB>>, shape=box, color=black, fontcolor=black];
      3 [label=<x<SUB><FONT POINT-SIZE="11">{0, 1, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
      4 [label=<x<SUB><FONT POINT-SIZE="11">{0, 1, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
      5 [label="...", shape=none];
      6 [label="...", shape=none];
      7 [label="...", shape=none];
      8 [label="...", shape=none];
      0 -> 1;
      0 -> 2;
      2 -> 3;
      2 -> 4;
      3 -> 5;
      3 -> 6;
      4 -> 7;
      4 -> 8;
  }
  """
  Source(source, format='png').render('../_static/feature_interaction_illustration6', view=False)
  Source(source, format='svg').render('../_static/feature_interaction_illustration7', view=False)
.. figure:: ../_static/feature_interaction_illustration6.png
  :align: center
  :figwidth: 80 %

  ``{0, 1, 3, 4}`` represents the sets of legitimate split features.


@@ -101,24 +101,25 @@ def _run_with_rabit(rabit_args, func, *args):
def run(client, func, *args):
"""
Launch arbitrary function on dask workers. Workers are connected by rabit, allowing
distributed training. The environment variable OMP_NUM_THREADS is defined on each worker
according to dask - this means that calls to xgb.train() will use the threads allocated by
dask by default, unless the user overrides the nthread parameter.
"""Launch arbitrary function on dask workers. Workers are connected by rabit,
allowing distributed training. The environment variable OMP_NUM_THREADS is
defined on each worker according to dask - this means that calls to
xgb.train() will use the threads allocated by dask by default, unless the
user overrides the nthread parameter.
Note: Windows platforms are not officially supported. Contributions are welcome here.
Note: Windows platforms are not officially
supported. Contributions are welcome here.
:param client: Dask client representing the cluster
:param func: Python function to be executed by each worker. Typically contains xgboost
training code.
:param func: Python function to be executed by each worker. Typically
contains xgboost training code.
:param args: Arguments to be forwarded to func
:return: Dict containing the function return value for each worker
"""
    if platform.system() == 'Windows':
        logging.warning(
            'Windows is not officially supported for dask/xgboost integration. Contributions '
            'welcome.')
        logging.warning('Windows is not officially supported for dask/xgboost '
                        'integration. Contributions welcome.')
    workers = list(client.scheduler_info()['workers'].keys())
    env = client.run(_start_tracker, len(workers), workers=[workers[0]])
    rabit_args = [('%s=%s' % item).encode() for item in env[workers[0]].items()]
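For context, a minimal usage sketch of this helper (hypothetical setup: a local Dask cluster, with rabit initialized for ``func`` by ``run`` as described in the docstring above):

.. code-block:: python

  import xgboost as xgb
  from dask.distributed import Client, LocalCluster

  def worker_fn():
      # Rabit is already initialized by ``run``; report this worker's rank.
      return xgb.rabit.get_rank()

  cluster = LocalCluster(n_workers=2)
  client = Client(cluster)
  ranks = run(client, worker_fn)  # dict: worker address -> return value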


@@ -184,18 +184,16 @@ def to_graphviz(booster, fmap='', num_trees=0, rankdir='UT',
no_color : str, default '#FF0000'
Edge color when the node condition is not met.
condition_node_params : dict (optional)
condition node configuration,
{'shape':'box',
'style':'filled,rounded',
'fillcolor':'#78bceb'
}
condition node configuration,
{'shape':'box',
'style':'filled,rounded',
'fillcolor':'#78bceb'}
leaf_node_params : dict (optional)
leaf node configuration
{'shape':'box',
'style':'filled',
'fillcolor':'#e48038'
}
'style':'filled',
'fillcolor':'#e48038'}
kwargs :
Other keywords passed to graphviz graph_attr
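As a sketch, these dictionaries can be passed when exporting a tree (``bst`` is assumed to be a trained ``Booster``):

.. code-block:: python

  import xgboost as xgb

  # Customize the styling of split and leaf nodes using the parameters
  # documented above.
  graph = xgb.to_graphviz(
      bst, num_trees=0,
      condition_node_params={'shape': 'box',
                             'style': 'filled,rounded',
                             'fillcolor': '#78bceb'},
      leaf_node_params={'shape': 'box',
                        'style': 'filled',
                        'fillcolor': '#e48038'})
  graph.render('tree')  # writes the rendered tree to disk via graphviz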


@@ -105,8 +105,8 @@ class XGBModel(XGBModelBase):
Value in the data which needs to be present as a missing value. If
None, defaults to np.nan.
importance_type: string, default "gain"
The feature importance type for the feature_importances_ property: either "gain",
"weight", "cover", "total_gain" or "total_cover".
The feature importance type for the feature_importances\\_ property:
either "gain", "weight", "cover", "total_gain" or "total_cover".
\\*\\*kwargs : dict, optional
Keyword arguments for XGBoost Booster object. Full documentation of parameters can
be found here: https://github.com/dmlc/xgboost/blob/master/doc/parameter.rst.
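A short sketch of the parameter in use with the scikit-learn API (``X`` and ``y`` assumed to be a training set):

.. code-block:: python

  from xgboost import XGBClassifier

  model = XGBClassifier(importance_type='total_gain')
  model.fit(X, y)
  # feature_importances_ now reports total gain instead of the default gain.
  print(model.feature_importances_)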