Update documents and tests. (#7659)
* Revise documents after recent refactoring and cat support.
* Add tests for behavior of max_depth and max_leaves.
@@ -74,8 +74,8 @@ Parameters for Tree Booster

  * ``max_depth`` [default=6]

- - Maximum depth of a tree. Increasing this value will make the model more complex and more likely to overfit. 0 is only accepted in ``lossguide`` growing policy when ``tree_method`` is set as ``hist`` or ``gpu_hist`` and it indicates no limit on depth. Beware that XGBoost aggressively consumes memory when training a deep tree.
- - range: [0,∞] (0 is only accepted in ``lossguide`` growing policy when ``tree_method`` is set as ``hist`` or ``gpu_hist``)
+ - Maximum depth of a tree. Increasing this value will make the model more complex and more likely to overfit. 0 indicates no limit on depth. Beware that XGBoost aggressively consumes memory when training a deep tree. ``exact`` tree method requires non-zero value.
+ - range: [0,∞]

  * ``min_child_weight`` [default=1]

@@ -164,7 +164,7 @@ Parameters for Tree Booster

  - Control the balance of positive and negative weights, useful for unbalanced classes. A typical value to consider: ``sum(negative instances) / sum(positive instances)``. See :doc:`Parameters Tuning </tutorials/param_tuning>` for more discussion. Also, see Higgs Kaggle competition demo for examples: `R <https://github.com/dmlc/xgboost/blob/master/demo/kaggle-higgs/higgs-train.R>`_, `py1 <https://github.com/dmlc/xgboost/blob/master/demo/kaggle-higgs/higgs-numpy.py>`_, `py2 <https://github.com/dmlc/xgboost/blob/master/demo/kaggle-higgs/higgs-cv.py>`_, `py3 <https://github.com/dmlc/xgboost/blob/master/demo/guide-python/cross_validation.py>`_.

- * ``updater`` [default= ``grow_colmaker,prune``]
+ * ``updater``

  - A comma separated string defining the sequence of tree updaters to run, providing a modular way to construct and to modify the trees. This is an advanced parameter that is usually set automatically, depending on some other parameters. However, it could be also set explicitly by a user. The following updaters exist:

@@ -177,8 +177,6 @@ Parameters for Tree Booster
  - ``refresh``: refreshes tree's statistics and/or leaf values based on the current data. Note that no random subsampling of data rows is performed.
  - ``prune``: prunes the splits where loss < min_split_loss (or gamma) and nodes that have depth greater than ``max_depth``.

- - In a distributed setting, the implicit updater sequence value would be adjusted to ``grow_histmaker,prune`` by default, and you can set ``tree_method`` as ``hist`` to use ``grow_histmaker``.
-
  * ``refresh_leaf`` [default=1]

  - This is a parameter of the ``refresh`` updater. When this flag is 1, tree leafs as well as tree nodes' stats are updated. When it is 0, only node stats are updated.
@@ -194,7 +192,7 @@ Parameters for Tree Booster
  * ``grow_policy`` [default= ``depthwise``]

  - Controls a way new nodes are added to the tree.
- - Currently supported only if ``tree_method`` is set to ``hist`` or ``gpu_hist``.
+ - Currently supported only if ``tree_method`` is set to ``hist``, ``approx`` or ``gpu_hist``.
  - Choices: ``depthwise``, ``lossguide``

  - ``depthwise``: split at nodes closest to the root.
@@ -202,11 +200,11 @@ Parameters for Tree Booster

  * ``max_leaves`` [default=0]

- - Maximum number of nodes to be added. Only relevant when ``grow_policy=lossguide`` is set.
+ - Maximum number of nodes to be added. Not used by ``exact`` tree method.

  * ``max_bin``, [default=256]

- - Only used if ``tree_method`` is set to ``hist`` or ``gpu_hist``.
+ - Only used if ``tree_method`` is set to ``hist``, ``approx`` or ``gpu_hist``.
  - Maximum number of discrete bins to bucket continuous features.
  - Increasing this number improves the optimality of splits at the cost of higher computation time.

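The revised wording above can be summarized as a small set of compatibility rules between ``tree_method`` and the tree-shape parameters. The following is only an illustrative sketch of those rules as the new documentation text states them; ``check_tree_params`` and ``HIST_LIKE`` are hypothetical names invented here, not part of XGBoost, and the library's own validation may differ.

```python
# Hypothetical helper: restates the documented rules, does not call XGBoost.
HIST_LIKE = {"hist", "approx", "gpu_hist"}


def check_tree_params(params):
    """Return a list of notes about parameter combinations that the
    revised documentation describes as unsupported or ignored."""
    notes = []
    tree_method = params.get("tree_method")
    if tree_method is None:
        # Without an explicit tree_method we cannot tell which rules apply.
        return notes

    # "``exact`` tree method requires non-zero value" for max_depth.
    if tree_method == "exact" and params.get("max_depth", 6) == 0:
        notes.append("max_depth=0 (no limit) is not accepted by exact")

    # max_leaves: "Not used by ``exact`` tree method."
    if tree_method == "exact" and params.get("max_leaves", 0) != 0:
        notes.append("max_leaves is not used by exact")

    # grow_policy: supported only for hist, approx or gpu_hist.
    if (params.get("grow_policy", "depthwise") == "lossguide"
            and tree_method not in HIST_LIKE):
        notes.append("grow_policy=lossguide needs hist, approx or gpu_hist")

    # max_bin: only used for hist, approx or gpu_hist.
    if "max_bin" in params and tree_method not in HIST_LIKE:
        notes.append("max_bin is only used by hist, approx or gpu_hist")

    return notes


if __name__ == "__main__":
    # Unlimited depth with a leaf budget under a histogram-based method.
    print(check_tree_params({"tree_method": "hist", "max_depth": 0,
                             "grow_policy": "lossguide", "max_leaves": 32}))
    # The same shape settings under exact produce two notes.
    print(check_tree_params({"tree_method": "exact", "max_depth": 0,
                             "max_leaves": 8}))
```

A dictionary that passes this check is not guaranteed to train successfully; the sketch only mirrors the sentences changed in this commit.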