[Breaking] Remove rabit support for custom reductions and grow_local_histmaker updater (#7992)
@@ -151,15 +151,6 @@ Parameters for Tree Booster
   - ``hist``: Faster histogram optimized approximate greedy algorithm.
   - ``gpu_hist``: GPU implementation of ``hist`` algorithm.
 
-* ``sketch_eps`` [default=0.03]
-
-  - Only used for ``updater=grow_local_histmaker``.
-  - This roughly translates into ``O(1 / sketch_eps)`` number of bins.
-    Compared to directly select number of bins, this comes with theoretical guarantee with sketch accuracy.
-  - Usually user does not have to tune this.
-    But consider setting to a lower number for more accurate enumeration of split candidates.
-  - range: (0, 1)
-
 * ``scale_pos_weight`` [default=1]
 
   - Control the balance of positive and negative weights, useful for unbalanced classes. A typical value to consider: ``sum(negative instances) / sum(positive instances)``. See :doc:`Parameters Tuning </tutorials/param_tuning>` for more discussion. Also, see Higgs Kaggle competition demo for examples: `R <https://github.com/dmlc/xgboost/blob/master/demo/kaggle-higgs/higgs-train.R>`_, `py1 <https://github.com/dmlc/xgboost/blob/master/demo/kaggle-higgs/higgs-numpy.py>`_, `py2 <https://github.com/dmlc/xgboost/blob/master/demo/kaggle-higgs/higgs-cv.py>`_, `py3 <https://github.com/dmlc/xgboost/blob/master/demo/guide-python/cross_validation.py>`_.
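The ``sum(negative instances) / sum(positive instances)`` heuristic mentioned for ``scale_pos_weight`` can be computed directly from the label vector. A minimal NumPy sketch with toy labels (illustrative only, not part of the commit):

```python
import numpy as np

# Toy unbalanced binary labels; in practice this is the training target.
y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])

# scale_pos_weight ~ sum(negative instances) / sum(positive instances)
scale_pos_weight = float((y == 0).sum()) / float((y == 1).sum())
print(scale_pos_weight)  # 8 negatives / 2 positives -> 4.0
```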
@@ -170,7 +161,6 @@ Parameters for Tree Booster
   - ``grow_colmaker``: non-distributed column-based construction of trees.
   - ``grow_histmaker``: distributed tree construction with row-based data splitting based on global proposal of histogram counting.
-  - ``grow_local_histmaker``: based on local histogram counting.
   - ``grow_quantile_histmaker``: Grow tree using quantized histogram.
   - ``grow_gpu_hist``: Grow tree with GPU.
   - ``sync``: synchronizes trees in all distributed nodes.
@@ -5,7 +5,7 @@ Tree Methods
 For training boosted tree models, there are 2 parameters used for choosing algorithms,
 namely ``updater`` and ``tree_method``. XGBoost has 4 builtin tree methods, namely
 ``exact``, ``approx``, ``hist`` and ``gpu_hist``. Along with these tree methods, there
-are also some free standing updaters including ``grow_local_histmaker``, ``refresh``,
+are also some free standing updaters including ``refresh``,
 ``prune`` and ``sync``. The parameter ``updater`` is more primitive than ``tree_method``
 as the latter is just a pre-configuration of the former. The difference is mostly due to
 historical reasons that each updater requires some specific configurations and might has
@@ -37,27 +37,18 @@ approximated training algorithms. These algorithms build a gradient histogram f
 node and iterate through the histogram instead of real dataset. Here we introduce the
 implementations in XGBoost below.
 
-1. ``grow_local_histmaker`` updater: An approximation tree method described in `reference
-   paper <http://arxiv.org/abs/1603.02754>`_. This updater is rarely used in practice so
-   it's still an updater rather than tree method. During split finding, it first runs a
-   weighted GK sketching for data points belong to current node to find split candidates,
-   using hessian as weights. The histogram is built upon this per-node sketch. It's
-   faster than ``exact`` in some applications, but still slow in computation.
+1. ``approx`` tree method: An approximation tree method described in `reference paper
+   <http://arxiv.org/abs/1603.02754>`_. It runs sketching before building each tree
+   using all the rows (rows belonging to the root). Hessian is used as weights during
+   sketch. The algorithm can be accessed by setting ``tree_method`` to ``approx``.
 
-2. ``approx`` tree method: An approximation tree method described in `reference paper
-   <http://arxiv.org/abs/1603.02754>`_. Different from ``grow_local_histmaker``, it runs
-   sketching before building each tree using all the rows (rows belonging to the root)
-   instead of per-node dataset. Similar to ``grow_local_histmaker`` updater, hessian is
-   used as weights during sketch. The algorithm can be accessed by setting
-   ``tree_method`` to ``approx``.
-
-3. ``hist`` tree method: An approximation tree method used in LightGBM with slight
+2. ``hist`` tree method: An approximation tree method used in LightGBM with slight
    differences in implementation. It runs sketching before training using only user
   provided weights instead of hessian. The subsequent per-node histogram is built upon
   this global sketch. This is the fastest algorithm as it runs sketching only once. The
   algorithm can be accessed by setting ``tree_method`` to ``hist``.
 
-4. ``gpu_hist`` tree method: The ``gpu_hist`` tree method is a GPU implementation of
+3. ``gpu_hist`` tree method: The ``gpu_hist`` tree method is a GPU implementation of
   ``hist``, with additional support for gradient based sampling. The algorithm can be
   accessed by setting ``tree_method`` to ``gpu_hist``.
 
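The "global sketch" idea that makes ``hist`` the fastest of these methods can be illustrated with plain NumPy: bin boundaries are computed once from approximate quantiles of the whole column, and every tree node afterwards iterates over bin indices instead of raw feature values. A simplified sketch of the quantization step only, not XGBoost's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)  # one feature column

# One global sketch: interior quantiles define the bin boundaries.
n_bins = 16
cuts = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])

# Quantize once; split finding later iterates over bin indices
# (a histogram of gradients per bin), never the raw values again.
binned = np.searchsorted(cuts, x)
```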
@@ -102,19 +93,32 @@ Other Updaters
 Removed Updaters
 ****************
 
-2 Updaters were removed during development due to maintainability. We describe them here
-solely for the interest of documentation. First one is distributed colmaker, which was a
-distributed version of exact tree method. It required specialization for column based
-splitting strategy and a different prediction procedure. As the exact tree method is slow
-by itself and scaling is even less efficient, we removed it entirely. Second one is
-``skmaker``. Per-node weighted sketching employed by ``grow_local_histmaker`` is slow,
-the ``skmaker`` was unmaintained and seems to be a workaround trying to eliminate the
-histogram creation step and uses sketching values directly during split evaluation. It
-was never tested and contained some unknown bugs, we decided to remove it and focus our
-resources on more promising algorithms instead. For accuracy, most of the time
-``approx``, ``hist`` and ``gpu_hist`` are enough with some parameters tuning, so removing
-them don't have any real practical impact.
+3 Updaters were removed during development due to maintainability. We describe them here
+solely for the interest of documentation.
+
+1. Distributed colmaker, which was a distributed version of exact tree method. It
+   required specialization for column based splitting strategy and a different prediction
+   procedure. As the exact tree method is slow by itself and scaling is even less
+   efficient, we removed it entirely.
+
+2. ``skmaker``. Per-node weighted sketching employed by ``grow_local_histmaker`` is slow,
+   the ``skmaker`` was unmaintained and seems to be a workaround trying to eliminate the
+   histogram creation step and uses sketching values directly during split evaluation. It
+   was never tested and contained some unknown bugs, we decided to remove it and focus our
+   resources on more promising algorithms instead. For accuracy, most of the time
+   ``approx``, ``hist`` and ``gpu_hist`` are enough with some parameters tuning, so
+   removing them don't have any real practical impact.
+
+3. ``grow_local_histmaker`` updater: An approximation tree method described in `reference
+   paper <http://arxiv.org/abs/1603.02754>`_. This updater was rarely used in practice so
+   it was still an updater rather than tree method. During split finding, it first runs a
+   weighted GK sketching for data points belong to current node to find split candidates,
+   using hessian as weights. The histogram is built upon this per-node sketch. It was
+   faster than ``exact`` in some applications, but still slow in computation. It was
+   removed because it depended on Rabit's customized reduction function that handles all
+   the data structure that can be serialized/deserialized into fixed size buffer, which is
+   not directly supported by NCCL or federated learning gRPC, making it hard to refactor
+   into a common allreducer interface.
 
 **************
 Feature Matrix
 
||||