Support optimal partitioning for GPU hist. (#7652)

* Implement `MaxCategory` in quantile.
* Implement partition-based split for GPU evaluation.  Currently, it's based on the existing evaluation function.
* Extract an evaluator from GPU Hist to store the needed states.
* Added some CUDA stream/event utilities.
* Update document with references.
* Fixed a bug in approx evaluator where the number of data points is less than the number of categories.
Jiaming Yuan
2022-02-15 03:03:12 +08:00
committed by GitHub
parent 2369d55e9a
commit 0d0abe1845
26 changed files with 1088 additions and 528 deletions

@@ -245,8 +245,8 @@ Additional parameters for ``hist``, ``gpu_hist`` and ``approx`` tree method
- Use single precision to build histograms instead of double precision.
-Additional parameters for ``approx`` tree method
-================================================
+Additional parameters for ``approx`` and ``gpu_hist`` tree method
+=================================================================
* ``max_cat_to_onehot``
@@ -257,7 +257,8 @@ Additional parameters for ``approx`` tree method
- A threshold for deciding whether XGBoost should use one-hot encoding based split for
categorical data. When number of categories is lesser than the threshold then one-hot
encoding is chosen, otherwise the categories will be partitioned into children nodes.
-Only relevant for regression and binary classification with `approx` tree method.
+Only relevant for regression and binary classification. Also, `approx` or `gpu_hist`
+tree method is required.
Additional parameters for Dart Booster (``booster=dart``)
=========================================================
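
The ``max_cat_to_onehot`` rule documented in the diff above can be illustrated with a minimal sketch. This is not XGBoost's implementation. The helper ``choose_split_kind`` is a hypothetical name introduced here for illustration, and the strict "less than" comparison is taken from the wording of the parameter description. The ``params`` dictionary is plain data and uses only the parameter names that appear in the documentation.

```python
def choose_split_kind(n_categories: int, max_cat_to_onehot: int) -> str:
    """Hypothetical helper mirroring the documented threshold rule.

    Assumption: one-hot splits are chosen when the number of categories
    is strictly less than the threshold, otherwise the categories are
    partitioned into the two child nodes.
    """
    if n_categories < max_cat_to_onehot:
        return "onehot"
    return "partition"


# Plain dictionary of training parameters. The keys come from the
# documentation in the diff; the value 4 is an arbitrary illustration.
params = {"tree_method": "gpu_hist", "max_cat_to_onehot": 4}

for n in (2, 3, 4, 10):
    print(n, choose_split_kind(n, params["max_cat_to_onehot"]))
```

With a threshold of 4, features with 2 or 3 categories fall on the one-hot side of this sketch, while features with 4 or more categories fall on the partition side.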