diff --git a/doc/gpu/index.rst b/doc/gpu/index.rst
index 43a6a7601..0f5e6317a 100644
--- a/doc/gpu/index.rst
+++ b/doc/gpu/index.rst
@@ -50,7 +50,7 @@ Supported parameters
 +--------------------------------+----------------------------+--------------+
 | ``gpu_id``                     | |tick|                     | |tick|       |
 +--------------------------------+----------------------------+--------------+
-| ``n_gpus``                     | |cross|                    | |tick|       |
+| ``n_gpus`` (deprecated)        | |cross|                    | |tick|       |
 +--------------------------------+----------------------------+--------------+
 | ``predictor``                  | |tick|                     | |tick|       |
 +--------------------------------+----------------------------+--------------+
@@ -58,6 +58,8 @@ Supported parameters
 +--------------------------------+----------------------------+--------------+
 | ``monotone_constraints``       | |cross|                    | |tick|       |
 +--------------------------------+----------------------------+--------------+
+| ``interaction_constraints``    | |cross|                    | |tick|       |
++--------------------------------+----------------------------+--------------+
 | ``single_precision_histogram`` | |cross|                    | |tick|       |
 +--------------------------------+----------------------------+--------------+
@@ -65,7 +67,8 @@ GPU accelerated prediction is enabled by default for the above mentioned ``tree_
 The experimental parameter ``single_precision_histogram`` can be set to True to enable building
 histograms using single precision.  This may improve speed, in particular on older architectures.
 
-The device ordinal can be selected using the ``gpu_id`` parameter, which defaults to 0.
+The device ordinal (which GPU to use if you have multiple devices) can be selected using the
+``gpu_id`` parameter, which defaults to 0 (the first device reported by the CUDA runtime).
 
 The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/build` for details.
@@ -80,15 +83,7 @@ The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/bu
 Single Node Multi-GPU
 =====================
-.. note:: Single node multi-GPU training is deprecated. Please use distributed GPU training with one process per GPU.
-
-Multiple GPUs can be used with the ``gpu_hist`` tree method using the ``n_gpus`` parameter. which defaults to 1. If this is set to -1 all available GPUs will be used. If ``gpu_id`` is specified as non-zero, the selected gpu devices will be from ``gpu_id`` to ``gpu_id+n_gpus``, please note that ``gpu_id+n_gpus`` must be less than or equal to the number of available GPUs on your system. As with GPU vs. CPU, multi-GPU will not always be faster than a single GPU due to PCI bus bandwidth that can limit performance.
-
-.. note:: Enabling multi-GPU training
-
-  Default installation may not enable multi-GPU training. To use multiple GPUs, make sure to read :ref:`build_gpu_support`.
-XGBoost supports multi-GPU training on a single machine via specifying the `n_gpus' parameter.
-
+.. note:: Single node multi-GPU training with the ``n_gpus`` parameter is deprecated after 0.90. Please use distributed GPU training with one process per GPU.
 
 Multi-node Multi-GPU Training
 =============================
@@ -101,66 +96,64 @@ Objective functions
 ===================
 Most of the objective functions implemented in XGBoost can be run on GPU.  Following table shows current support status.
 
-.. |tick| unicode:: U+2714
-.. 
|cross| unicode:: U+2718
+
++--------------------+-------------+
+| Objectives         | GPU support |
++--------------------+-------------+
+| reg:squarederror   | |tick|      |
++--------------------+-------------+
+| reg:squaredlogerror| |tick|      |
++--------------------+-------------+
+| reg:logistic       | |tick|      |
++--------------------+-------------+
+| binary:logistic    | |tick|      |
++--------------------+-------------+
+| binary:logitraw    | |tick|      |
++--------------------+-------------+
+| binary:hinge       | |tick|      |
++--------------------+-------------+
+| count:poisson      | |tick|      |
++--------------------+-------------+
+| reg:gamma          | |tick|      |
++--------------------+-------------+
+| reg:tweedie        | |tick|      |
++--------------------+-------------+
+| multi:softmax      | |tick|      |
++--------------------+-------------+
+| multi:softprob     | |tick|      |
++--------------------+-------------+
+| survival:cox       | |cross|     |
++--------------------+-------------+
+| rank:pairwise      | |cross|     |
++--------------------+-------------+
+| rank:ndcg          | |cross|     |
++--------------------+-------------+
+| rank:map           | |cross|     |
++--------------------+-------------+
 
-+-----------------+-------------+
-| Objectives      | GPU support |
-+-----------------+-------------+
-| reg:squarederror| |tick|      |
-+-----------------+-------------+
-| reg:logistic    | |tick|      |
-+-----------------+-------------+
-| binary:logistic | |tick|      |
-+-----------------+-------------+
-| binary:logitraw | |tick|      |
-+-----------------+-------------+
-| binary:hinge    | |tick|      |
-+-----------------+-------------+
-| count:poisson   | |tick|      |
-+-----------------+-------------+
-| reg:gamma       | |tick|      |
-+-----------------+-------------+
-| reg:tweedie     | |tick|      |
-+-----------------+-------------+
-| multi:softmax   | |tick|      |
-+-----------------+-------------+
-| multi:softprob  | |tick|      |
-+-----------------+-------------+
-| survival:cox    | |cross|     |
-+-----------------+-------------+
-| rank:pairwise   | |cross|     |
-+-----------------+-------------+
-| rank:ndcg       | |cross|     |
-+-----------------+-------------+
-| rank:map        | |cross|     |
-+-----------------+-------------+
-
-For multi-gpu support, objective functions also honor the ``n_gpus`` parameter,
-which, by default is set to 1. To disable running objectives on GPU, just set
-``n_gpus`` to 0.
+Objectives will run on the GPU if the GPU updater (``gpu_hist``) is used; otherwise they
+will run on the CPU by default. For unsupported objectives XGBoost falls back to the CPU
+implementation.
 
 Metric functions
 ===================
 Following table shows current support status for evaluation metrics on the GPU.
 
-.. |tick| unicode:: U+2714
-.. |cross| unicode:: U+2718
-
 +-----------------+-------------+
 | Metric          | GPU Support |
 +=================+=============+
 | rmse            | |tick|      |
 +-----------------+-------------+
+| rmsle           | |tick|      |
++-----------------+-------------+
 | mae             | |tick|      |
 +-----------------+-------------+
 | logloss         | |tick|      |
 +-----------------+-------------+
 | error           | |tick|      |
 +-----------------+-------------+
-| merror          | |cross|     |
+| merror          | |tick|      |
 +-----------------+-------------+
-| mlogloss        | |cross|     |
+| mlogloss        | |tick|      |
 +-----------------+-------------+
 | auc             | |cross|     |
 +-----------------+-------------+
@@ -181,10 +174,8 @@ Following table shows current support status for evaluation metrics on the GPU.
 | tweedie-nloglik | |tick|      |
 +-----------------+-------------+
 
-As for objective functions, metrics honor the ``n_gpus`` parameter,
-which, by default is set to 1. To disable running metrics on GPU, just set
-``n_gpus`` to 0.
-
+Similar to objective functions, the default device for metrics is selected based on the
+tree updater and predictor (which is in turn selected based on the tree updater).
Benchmarks
==========
diff --git a/doc/tutorials/feature_interaction_constraint.rst b/doc/tutorials/feature_interaction_constraint.rst
index 947778427..ea4d252ca 100644
--- a/doc/tutorials/feature_interaction_constraint.rst
+++ b/doc/tutorials/feature_interaction_constraint.rst
@@ -171,7 +171,107 @@ parameter:
                    num_boost_round = 1000, evals = evallist,
                    early_stopping_rounds = 10)
 
-**Choice of tree construction algorithm**. To use feature interaction
-constraints, be sure to set the ``tree_method`` parameter to either ``exact``
-or ``hist``. Currently, GPU algorithms (``gpu_hist``, ``gpu_exact``) do not
-support feature interaction constraints.
+**Choice of tree construction algorithm**. To use feature interaction constraints, be sure
+to set the ``tree_method`` parameter to one of the following: ``exact``, ``hist`` or
+``gpu_hist``. Support for ``gpu_hist`` was added after version 0.90.
+
+
+**************
+Advanced topic
+**************
+
+The intuition behind interaction constraints is simple. Users have prior knowledge about
+relations between different features, and encode it as constraints during model
+construction. But there are also some subtleties around specifying constraints. Take the
+constraint ``[[1, 2], [2, 3, 4]]`` as an example: feature ``2`` appears in two different
+interaction sets, ``[1, 2]`` and ``[2, 3, 4]``, so the union set of features allowed to
+interact with ``2`` is ``{1, 3, 4}``. In the following diagram, the root splits at
+feature ``2``. Because all its descendants should be able to interact with it, at the
+second layer all 4 features are legitimate split candidates for further splitting,
+disregarding the specified constraint sets.
+
+.. plot::
+   :nofigs:
+
+   from graphviz import Source
+   source = r"""
+     digraph feature_interaction_illustration4 {
+       graph [fontname = "helvetica"];
+       node [fontname = "helvetica"];
+       edge [fontname = "helvetica"];
+       0 [label=<2>, shape=box, color=black, fontcolor=black];
+       1 [label=<{1, 2, 3, 4}>, shape=box];
+       2 [label=<{1, 2, 3, 4}>, shape=box, color=black, fontcolor=black];
+       3 [label="...", shape=none];
+       4 [label="...", shape=none];
+       5 [label="...", shape=none];
+       6 [label="...", shape=none];
+       0 -> 1;
+       0 -> 2;
+       1 -> 3;
+       1 -> 4;
+       2 -> 5;
+       2 -> 6;
+     }
+   """
+   Source(source, format='png').render('../_static/feature_interaction_illustration4', view=False)
+   Source(source, format='svg').render('../_static/feature_interaction_illustration5', view=False)
+
+.. figure:: ../_static/feature_interaction_illustration4.png
+   :align: center
+   :figwidth: 80 %
+
+   ``{1, 2, 3, 4}`` represents the sets of legitimate split features.
+
+This leads to some interesting implications of feature interaction constraints. Take
+``[[0, 1], [0, 1, 2], [1, 2]]`` as another example. Assuming we have only 3 available
+features in our training datasets for presentation purposes, careful readers might have
+noticed that the above constraint is the same as ``[0, 1, 2]``: no matter which feature
+is chosen for the split at the root node, all its descendants have to include every
+feature as legitimate split candidates to avoid violating the interaction constraints.
+
+For one last example, we use ``[[0, 1], [1, 3, 4]]`` and choose feature ``0`` as the
+split for the root node. At the second layer of the built tree, ``1`` is the only
+legitimate split candidate except for ``0`` itself, since they belong to the same
+constraint set. Following the grow path of our example tree below, the node at the second
+layer splits at feature ``1``. But because ``1`` also belongs to the second constraint
+set ``[1, 3, 4]``, at the third layer we need to include all features as candidates to
+comply with its ancestors.
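The constraint-merging rule walked through above can be sketched in a few lines of Python. The function below is purely illustrative (it is not part of the XGBoost API); it reproduces the three worked examples from this section:

```python
def allowed_split_features(constraints, path):
    """Features allowed as split candidates at a node, given the features
    already used on the path from the root: the union of every constraint
    set that shares a feature with that path."""
    used = set(path)
    allowed = set(path)
    for cset in constraints:
        if used & set(cset):
            allowed |= set(cset)
    return allowed

allowed_split_features([[1, 2], [2, 3, 4]], [2])     # {1, 2, 3, 4}: root split at 2
allowed_split_features([[0, 1], [1, 3, 4]], [0])     # {0, 1}: second layer after splitting on 0
allowed_split_features([[0, 1], [1, 3, 4]], [0, 1])  # {0, 1, 3, 4}: third layer after 0 then 1
```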
+
+.. plot::
+   :nofigs:
+
+   from graphviz import Source
+   source = r"""
+     digraph feature_interaction_illustration5 {
+       graph [fontname = "helvetica"];
+       node [fontname = "helvetica"];
+       edge [fontname = "helvetica"];
+       0 [label=<0>, shape=box, color=black, fontcolor=black];
+       1 [label="...", shape=none];
+       2 [label=<1>, shape=box, color=black, fontcolor=black];
+       3 [label=<{0, 1, 3, 4}>, shape=box, color=black, fontcolor=black];
+       4 [label=<{0, 1, 3, 4}>, shape=box, color=black, fontcolor=black];
+       5 [label="...", shape=none];
+       6 [label="...", shape=none];
+       7 [label="...", shape=none];
+       8 [label="...", shape=none];
+       0 -> 1;
+       0 -> 2;
+       2 -> 3;
+       2 -> 4;
+       3 -> 5;
+       3 -> 6;
+       4 -> 7;
+       4 -> 8;
+     }
+   """
+   Source(source, format='png').render('../_static/feature_interaction_illustration6', view=False)
+   Source(source, format='svg').render('../_static/feature_interaction_illustration7', view=False)
+
+
+.. figure:: ../_static/feature_interaction_illustration6.png
+   :align: center
+   :figwidth: 80 %
+
+   ``{0, 1, 3, 4}`` represents the sets of legitimate split features.
diff --git a/python-package/xgboost/dask.py b/python-package/xgboost/dask.py
index 18e496ffd..b0be2dbcc 100644
--- a/python-package/xgboost/dask.py
+++ b/python-package/xgboost/dask.py
@@ -101,24 +101,25 @@ def _run_with_rabit(rabit_args, func, *args):
 
 
 def run(client, func, *args):
-    """
-    Launch arbitrary function on dask workers. Workers are connected by rabit, allowing
-    distributed training. The environment variable OMP_NUM_THREADS is defined on each worker
-    according to dask - this means that calls to xgb.train() will use the threads allocated by
-    dask by default, unless the user overrides the nthread parameter.
+    """Launch arbitrary function on dask workers. Workers are connected by rabit,
+    allowing distributed training. The environment variable OMP_NUM_THREADS is
+    defined on each worker according to dask - this means that calls to
+    xgb.train() will use the threads allocated by dask by default, unless the
+    user overrides the nthread parameter.
 
-    Note: Windows platforms are not officially supported. Contributions are welcome here.
+    Note: Windows platforms are not officially
+    supported. Contributions are welcome here.
 
     :param client: Dask client representing the cluster
-    :param func: Python function to be executed by each worker. Typically contains xgboost
-    training code.
+    :param func: Python function to be executed by each worker. Typically
+        contains xgboost training code.
     :param args: Arguments to be forwarded to func
     :return: Dict containing the function return value for each worker
+
     """
     if platform.system() == 'Windows':
-        logging.warning(
-            'Windows is not officially supported for dask/xgboost integration. Contributions '
-            'welcome.')
+        logging.warning('Windows is not officially supported for dask/xgboost '
+                        'integration. Contributions welcome.')
     workers = list(client.scheduler_info()['workers'].keys())
     env = client.run(_start_tracker, len(workers), workers=[workers[0]])
     rabit_args = [('%s=%s' % item).encode() for item in env[workers[0]].items()]
diff --git a/python-package/xgboost/plotting.py b/python-package/xgboost/plotting.py
index d23f4860d..094bf8a51 100644
--- a/python-package/xgboost/plotting.py
+++ b/python-package/xgboost/plotting.py
@@ -184,18 +184,16 @@ def to_graphviz(booster, fmap='', num_trees=0, rankdir='UT',
     no_color : str, default '#FF0000'
         Edge color when doesn't meet the node condition.
    condition_node_params : dict (optional)
-        condition node configuration,
-        {'shape':'box',
-        'style':'filled,rounded',
-        'fillcolor':'#78bceb'
-        }
+        condition node configuration,
+        {'shape':'box',
+         'style':'filled,rounded',
+         'fillcolor':'#78bceb'}
     leaf_node_params : dict (optional)
         leaf node configuration
         {'shape':'box',
-        'style':'filled',
-        'fillcolor':'#e48038'
-        }
+         'style':'filled',
+         'fillcolor':'#e48038'}
     kwargs :
         Other keywords passed to graphviz graph_attr
diff --git a/python-package/xgboost/sklearn.py b/python-package/xgboost/sklearn.py
index 858129a4c..8d4f7d03f 100644
--- a/python-package/xgboost/sklearn.py
+++ b/python-package/xgboost/sklearn.py
@@ -105,8 +105,8 @@ class XGBModel(XGBModelBase):
         Value in the data which needs to be present as a missing value. If
         None, defaults to np.nan.
     importance_type: string, default "gain"
-        The feature importance type for the feature_importances_ property: either "gain",
-        "weight", "cover", "total_gain" or "total_cover".
+        The feature importance type for the feature_importances\_ property:
+        either "gain", "weight", "cover", "total_gain" or "total_cover".
     \*\*kwargs : dict, optional
         Keyword arguments for XGBoost Booster object.  Full documentation of parameters can
         be found here: https://github.com/dmlc/xgboost/blob/master/doc/parameter.rst.
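As a small usage sketch for the ``to_graphviz`` styling parameters documented above: the dict values are copied from the docstring defaults, while the override at the end is hypothetical.

```python
# Node-styling dicts for xgboost.to_graphviz, copied from the docstring
# defaults above.  Building and merging them is plain Python, so no
# plotting backend is needed for this sketch.
condition_node_params = {
    'shape': 'box',
    'style': 'filled,rounded',
    'fillcolor': '#78bceb',   # split (condition) nodes
}
leaf_node_params = {
    'shape': 'box',
    'style': 'filled',
    'fillcolor': '#e48038',   # leaf nodes
}

# A hypothetical user override: keep the documented defaults but change
# the condition-node fill colour.
custom_condition = {**condition_node_params, 'fillcolor': '#cccccc'}

# With a trained booster one would then call (not executed here):
# xgb.to_graphviz(booster, condition_node_params=custom_condition,
#                 leaf_node_params=leaf_node_params)
```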