Update doc for feature constraints and n_gpus. (#4596)

* Update doc for feature constraints.
* Fix some warnings.
* Clean up doc for `n_gpus`.
2019-06-23 14:37:22 +08:00
commit 2cff735126
parent 9fa29ad753
5 changed files with 172 additions and 82 deletions
--- a/doc/gpu/index.rst
+++ b/doc/gpu/index.rst
@@ -50,7 +50,7 @@ Supported parameters
 +--------------------------------+----------------------------+--------------+
 | ``gpu_id``                     | |tick|                     | |tick|       |
 +--------------------------------+----------------------------+--------------+
-| ``n_gpus``                     | |cross|                    | |tick|       |
+| ``n_gpus`` (deprecated)        | |cross|                    | |tick|       |
 +--------------------------------+----------------------------+--------------+
 | ``predictor``                  | |tick|                     | |tick|       |
 +--------------------------------+----------------------------+--------------+
@@ -58,6 +58,8 @@ Supported parameters
 +--------------------------------+----------------------------+--------------+
 | ``monotone_constraints``       | |cross|                    | |tick|       |
 +--------------------------------+----------------------------+--------------+
 | ``interaction_constraints``    | |cross|                    | |tick|       |
 +--------------------------------+----------------------------+--------------+
 | ``single_precision_histogram`` | |cross|                    | |tick|       |
 +--------------------------------+----------------------------+--------------+
@@ -65,7 +67,8 @@ GPU accelerated prediction is enabled by default for the above mentioned ``tree_
 The experimental parameter ``single_precision_histogram`` can be set to True to enable building histograms using single precision. This may improve speed, in particular on older architectures.
-The device ordinal can be selected using the ``gpu_id`` parameter, which defaults to 0.
+The device ordinal (which GPU to use if you have many of them) can be selected using the
 ``gpu_id`` parameter, which defaults to 0 (the first device reported by CUDA runtime).
 The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/build` for details.
@@ -80,15 +83,7 @@ The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/bu
 Single Node Multi-GPU
 =====================
-.. note:: Single node multi-GPU training is deprecated. Please use distributed GPU training with one process per GPU.
+.. note:: Single node multi-GPU training with `n_gpus` parameter is deprecated after 0.90.  Please use distributed GPU training with one process per GPU.
 Multiple GPUs can be used with the ``gpu_hist`` tree method using the ``n_gpus`` parameter, which defaults to 1. If this is set to -1 all available GPUs will be used.  If ``gpu_id`` is specified as non-zero, the selected GPU devices will be from ``gpu_id`` to ``gpu_id+n_gpus``; please note that ``gpu_id+n_gpus`` must be less than or equal to the number of available GPUs on your system.  As with GPU vs. CPU, multi-GPU will not always be faster than a single GPU due to PCI bus bandwidth that can limit performance.
 .. note:: Enabling multi-GPU training
  Default installation may not enable multi-GPU training. To use multiple GPUs, make sure to read :ref:`build_gpu_support`.
 XGBoost supports multi-GPU training on a single machine via specifying the ``n_gpus`` parameter.
 Multi-node Multi-GPU Training
 =============================
@@ -101,66 +96,64 @@ Objective functions
 ===================
 Most of the objective functions implemented in XGBoost can be run on GPU.  The following table shows the current support status.
-.. |tick| unicode:: U+2714
+--------------------+-------------+
-.. |cross| unicode:: U+2718
+| Objectives         | GPU support |
 +--------------------+-------------+
 | reg:squarederror   | |tick|      |
 +--------------------+-------------+
 | reg:squaredlogerror| |tick|      |
 +--------------------+-------------+
 | reg:logistic       | |tick|      |
 +--------------------+-------------+
 | binary:logistic    | |tick|      |
 +--------------------+-------------+
 | binary:logitraw    | |tick|      |
 +--------------------+-------------+
 | binary:hinge       | |tick|      |
 +--------------------+-------------+
 | count:poisson      | |tick|      |
 +--------------------+-------------+
 | reg:gamma          | |tick|      |
 +--------------------+-------------+
 | reg:tweedie        | |tick|      |
 +--------------------+-------------+
 | multi:softmax      | |tick|      |
 +--------------------+-------------+
 | multi:softprob     | |tick|      |
 +--------------------+-------------+
 | survival:cox       | |cross|     |
 +--------------------+-------------+
 | rank:pairwise      | |cross|     |
 +--------------------+-------------+
 | rank:ndcg          | |cross|     |
 +--------------------+-------------+
 | rank:map           | |cross|     |
 +--------------------+-------------+
-+-----------------+-------------+
+Objectives will run on GPU if a GPU updater (``gpu_hist``) is used; otherwise they will run on CPU by
-| Objectives      | GPU support |
+default.  For unsupported objectives XGBoost will fall back to using the CPU implementation by
-+-----------------+-------------+
+default.
 | reg:squarederror| |tick|      |
 +-----------------+-------------+
 | reg:logistic    | |tick|      |
 +-----------------+-------------+
 | binary:logistic | |tick|      |
 +-----------------+-------------+
 | binary:logitraw | |tick|      |
 +-----------------+-------------+
 | binary:hinge    | |tick|      |
 +-----------------+-------------+
 | count:poisson   | |tick|      |
 +-----------------+-------------+
 | reg:gamma       | |tick|      |
 +-----------------+-------------+
 | reg:tweedie     | |tick|      |
 +-----------------+-------------+
 | multi:softmax   | |tick|      |
 +-----------------+-------------+
 | multi:softprob  | |tick|      |
 +-----------------+-------------+
 | survival:cox    | |cross|     |
 +-----------------+-------------+
 | rank:pairwise   | |cross|     |
 +-----------------+-------------+
 | rank:ndcg       | |cross|     |
 +-----------------+-------------+
 | rank:map        | |cross|     |
 +-----------------+-------------+
 For multi-gpu support, objective functions also honor the ``n_gpus`` parameter,
 which, by default is set to 1.  To disable running objectives on GPU, just set
 ``n_gpus`` to 0.
 Metric functions
 ===================
 Following table shows current support status for evaluation metrics on the GPU.
 .. |tick| unicode:: U+2714
 .. |cross| unicode:: U+2718
 +-----------------+-------------+
 | Metric          | GPU Support |
 +=================+=============+
 | rmse            | |tick|      |
 +-----------------+-------------+
 | rmsle           | |tick|      |
 +-----------------+-------------+
 | mae             | |tick|      |
 +-----------------+-------------+
 | logloss         | |tick|      |
 +-----------------+-------------+
 | error           | |tick|      |
 +-----------------+-------------+
-| merror          | |cross|     |
+| merror          | |tick|      |
 +-----------------+-------------+
-| mlogloss        | |cross|     |
+| mlogloss        | |tick|      |
 +-----------------+-------------+
 | auc             | |cross|     |
 +-----------------+-------------+
@@ -181,10 +174,8 @@ Following table shows current support status for evaluation metrics on the GPU.
 | tweedie-nloglik | |tick|      |
 +-----------------+-------------+
-As for objective functions, metrics honor the ``n_gpus`` parameter,
+Similar to objective functions, default device for metrics is selected based on tree
-which, by default is set to 1.  To disable running metrics on GPU, just set
+updater and predictor (which is selected based on tree updater).
 ``n_gpus`` to 0.
 Benchmarks
 ==========
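As a reading aid for the doc/gpu/index.rst changes above, the following is a minimal sketch of a parameter dictionary that selects the GPU updater and a device ordinal. The choice of objective and metric is an assumption made for illustration; both are marked as GPU-supported in the tables above. The commented-out training call assumes a ``DMatrix`` named ``dtrain`` that is not defined here.

```python
# Hypothetical parameter set; the values are illustrative, not recommendations.
params = {
    "tree_method": "gpu_hist",       # GPU updater; objectives and metrics follow it
    "gpu_id": 0,                     # device ordinal, 0 = first device reported by CUDA
    "objective": "binary:logistic",  # listed as GPU-supported
    "eval_metric": "logloss",        # listed as GPU-supported
}

# With an XGBoost build that has GPU support, this would be passed to training, e.g.:
# import xgboost as xgb
# booster = xgb.train(params, dtrain)  # dtrain: an xgb.DMatrix (assumed)
```

``n_gpus`` is left out on purpose, since the note in this diff marks it as deprecated after 0.90.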
--- a/doc/tutorials/feature_interaction_constraint.rst
+++ b/doc/tutorials/feature_interaction_constraint.rst
@@ -171,7 +171,107 @@ parameter:
                                     num_boost_round = 1000, evals = evallist,
                                     early_stopping_rounds = 10)
-**Choice of tree construction algorithm**. To use feature interaction
+**Choice of tree construction algorithm**. To use feature interaction constraints, be sure
-constraints, be sure to set the ``tree_method`` parameter to either ``exact``
+to set the ``tree_method`` parameter to one of the following: ``exact``, ``hist`` or
-or ``hist``. Currently, GPU algorithms (``gpu_hist``, ``gpu_exact``) do not
+``gpu_hist``.  Support for ``gpu_hist`` is added after (excluding) version 0.90.
-support feature interaction constraints.
+
 **************
 Advanced topic
 **************
 The intuition behind interaction constraints is simple.  Users have prior knowledge about
 relations between different features and encode it as constraints during model
 construction.  But there are also some subtleties around specifying constraints.  Take
 the constraint ``[[1, 2], [2, 3, 4]]`` as an example: feature ``2`` appears in two
 different interaction sets, ``[1, 2]`` and ``[2, 3, 4]``, so the union set of features
 allowed to interact with ``2`` is ``{1, 3, 4}``.  In the following diagram, the root splits at
 feature ``2``.  Because all its descendants should be able to interact with it, at the
 second layer all 4 features are legitimate split candidates for further splitting,
 regardless of the specified constraint sets.
 .. plot::
  :nofigs:
  from graphviz import Source
  source = r"""
    digraph feature_interaction_illustration4 {
      graph [fontname = "helvetica"];
      node [fontname = "helvetica"];
      edge [fontname = "helvetica"];
      0 [label=<x<SUB><FONT POINT-SIZE="11">2</FONT></SUB>>, shape=box, color=black, fontcolor=black];
      1 [label=<x<SUB><FONT POINT-SIZE="11">{1, 2, 3, 4}</FONT></SUB>>, shape=box];
      2 [label=<x<SUB><FONT POINT-SIZE="11">{1, 2, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
      3 [label="...", shape=none];
      4 [label="...", shape=none];
      5 [label="...", shape=none];
      6 [label="...", shape=none];
      0 -> 1;
      0 -> 2;
      1 -> 3;
      1 -> 4;
      2 -> 5;
      2 -> 6;
    }
  """
  Source(source, format='png').render('../_static/feature_interaction_illustration4', view=False)
  Source(source, format='svg').render('../_static/feature_interaction_illustration5', view=False)
 .. figure:: ../_static/feature_interaction_illustration4.png
   :align: center
   :figwidth: 80 %
   ``{1, 2, 3, 4}`` represents the sets of legitimate split features.
 This has led to some interesting implications of feature interaction constraints.  Take
 ``[[0, 1], [0, 1, 2], [1, 2]]`` as another example.  Assuming we have only 3 available
 features in our training datasets for presentation purposes, careful readers might have
 found out that the above constraint is the same as ``[0, 1, 2]``, since no matter which
 feature is chosen for the split in the root node, all its descendants have to include every
 feature as legitimate split candidates to avoid violating interaction constraints.
 For one last example, we use ``[[0, 1], [1, 3, 4]]`` and choose feature ``0`` as the split for
 the root node.  At the second layer of the built tree, ``1`` is the only legitimate split
 candidate except for ``0`` itself, since they belong to the same constraint set.
 Following the grow path of our example tree below, the node at the second layer splits at
 feature ``1``.  But because ``1`` also belongs to the second constraint set
 ``[1, 3, 4]``, at the third layer we need to include all features as candidates to comply with its
 ancestors.
 .. plot::
  :nofigs:
  from graphviz import Source
  source = r"""
    digraph feature_interaction_illustration5 {
      graph [fontname = "helvetica"];
      node [fontname = "helvetica"];
      edge [fontname = "helvetica"];
      0 [label=<x<SUB><FONT POINT-SIZE="11">0</FONT></SUB>>, shape=box, color=black, fontcolor=black];
      1 [label="...", shape=none];
      2 [label=<x<SUB><FONT POINT-SIZE="11">1</FONT></SUB>>, shape=box, color=black, fontcolor=black];
      3 [label=<x<SUB><FONT POINT-SIZE="11">{0, 1, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
      4 [label=<x<SUB><FONT POINT-SIZE="11">{0, 1, 3, 4}</FONT></SUB>>, shape=box, color=black, fontcolor=black];
      5 [label="...", shape=none];
      6 [label="...", shape=none];
      7 [label="...", shape=none];
      8 [label="...", shape=none];
      0 -> 1;
      0 -> 2;
      2 -> 3;
      2 -> 4;
      3 -> 5;
      3 -> 6;
      4 -> 7;
      4 -> 8;
    }
  """
  Source(source, format='png').render('../_static/feature_interaction_illustration6', view=False)
  Source(source, format='svg').render('../_static/feature_interaction_illustration7', view=False)
 .. figure:: ../_static/feature_interaction_illustration6.png
   :align: center
   :figwidth: 80 %
   ``{0, 1, 3, 4}`` represents the sets of legitimate split features.
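The three constraint examples in the tutorial text above can be checked with a small sketch. It assumes the rule as the prose describes it: a node's legitimate split candidates are the union of every constraint set that contains a feature already used on the path from the root. The function name ``allowed_split_features`` is hypothetical, and the code mirrors the tutorial's examples rather than XGBoost's internal implementation.

```python
def allowed_split_features(constraints, path_features):
    """Return the union of all constraint sets containing a feature on the path.

    constraints   -- list of lists of feature indices (interaction sets)
    path_features -- features already split on, from the root down to the parent
    """
    allowed = set()
    for feature in path_features:
        for group in constraints:
            if feature in group:
                allowed.update(group)
    return allowed


# First example: root splits at feature 2.
print(sorted(allowed_split_features([[1, 2], [2, 3, 4]], [2])))     # [1, 2, 3, 4]

# Last example: root splits at 0, then the second layer splits at 1.
print(sorted(allowed_split_features([[0, 1], [1, 3, 4]], [0])))     # [0, 1]
print(sorted(allowed_split_features([[0, 1], [1, 3, 4]], [0, 1])))  # [0, 1, 3, 4]
```

Under the same assumption, ``[[0, 1], [0, 1, 2], [1, 2]]`` yields ``{0, 1, 2}`` for any choice of root feature, which matches the claim that it behaves the same as ``[0, 1, 2]``.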
--- a/python-package/xgboost/dask.py
+++ b/python-package/xgboost/dask.py
@@ -101,24 +101,25 @@ def _run_with_rabit(rabit_args, func, *args):
 def run(client, func, *args):
-    """
+    """Launch arbitrary function on dask workers. Workers are connected by rabit,
-    Launch arbitrary function on dask workers. Workers are connected by rabit, allowing
+    allowing distributed training. The environment variable OMP_NUM_THREADS is
-    distributed training. The environment variable OMP_NUM_THREADS is defined on each worker
+    defined on each worker according to dask - this means that calls to
-    according to dask - this means that calls to xgb.train() will use the threads allocated by
+    xgb.train() will use the threads allocated by dask by default, unless the
-    dask by default, unless the user overrides the nthread parameter.
+    user overrides the nthread parameter.
-    Note: Windows platforms are not officially supported. Contributions are welcome here.
+    Note: Windows platforms are not officially
      supported. Contributions are welcome here.
    :param client: Dask client representing the cluster
-    :param func: Python function to be executed by each worker. Typically contains xgboost
+    :param func: Python function to be executed by each worker. Typically
-    training code.
+       contains xgboost training code.
    :param args: Arguments to be forwarded to func
    :return: Dict containing the function return value for each worker
    """
    if platform.system() == 'Windows':
-        logging.warning(
+        logging.warning('Windows is not officially supported for dask/xgboost '
-            'Windows is not officially supported for dask/xgboost integration. Contributions '
+                        'integration. Contributions welcome.')
            'welcome.')
    workers = list(client.scheduler_info()['workers'].keys())
    env = client.run(_start_tracker, len(workers), workers=[workers[0]])
    rabit_args = [('%s=%s' % item).encode() for item in env[workers[0]].items()]
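The last context line above turns the tracker environment into byte-string arguments for rabit. Below is a small sketch of that expression in isolation, using a made-up environment dictionary; the keys and values are assumptions for illustration, not output from a real tracker.

```python
# Hypothetical stand-in for env[workers[0]]; real keys and values come from the tracker.
worker_env = {"DMLC_TRACKER_URI": "127.0.0.1", "DMLC_TRACKER_PORT": 9091}

# Same expression as in run(): format each (key, value) pair as "key=value",
# then encode the string to bytes.
rabit_args = [('%s=%s' % item).encode() for item in worker_env.items()]

print(rabit_args)  # [b'DMLC_TRACKER_URI=127.0.0.1', b'DMLC_TRACKER_PORT=9091']
```

Because ``%s`` formatting is used, non-string values such as the integer port are converted to text before encoding.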
--- a/python-package/xgboost/plotting.py
+++ b/python-package/xgboost/plotting.py
@@ -184,18 +184,16 @@ def to_graphviz(booster, fmap='', num_trees=0, rankdir='UT',
    no_color : str, default '#FF0000'
        Edge color when doesn't meet the node condition.
    condition_node_params : dict (optional)
-        condition node configuration,
+      condition node configuration,
-        {'shape':'box',
+      {'shape':'box',
-               'style':'filled,rounded',
+       'style':'filled,rounded',
-               'fillcolor':'#78bceb'
+       'fillcolor':'#78bceb'}
        }
    leaf_node_params : dict (optional)
        leaf node configuration
        {'shape':'box',
-               'style':'filled',
+         'style':'filled',
-               'fillcolor':'#e48038'
+         'fillcolor':'#e48038'}
        }
    kwargs :
        Other keywords passed to graphviz graph_attr
--- a/python-package/xgboost/sklearn.py
+++ b/python-package/xgboost/sklearn.py
@@ -105,8 +105,8 @@ class XGBModel(XGBModelBase):
        Value in the data which needs to be present as a missing value. If
        None, defaults to np.nan.
    importance_type: string, default "gain"
-        The feature importance type for the feature_importances_ property: either "gain",
+        The feature importance type for the feature_importances\\_ property:
-        "weight", "cover", "total_gain" or "total_cover".
+        either "gain", "weight", "cover", "total_gain" or "total_cover".
    \\*\\*kwargs : dict, optional
        Keyword arguments for XGBoost Booster object.  Full documentation of parameters can
        be found here: https://github.com/dmlc/xgboost/blob/master/doc/parameter.rst.