Before the fix, best_ntree_limit ignored the num_class parameter, which was incorrect. Previously we worked around this in the C++ layer to avoid possible breaking changes in other language bindings, but the Python interpretation stayed incorrect. The PR fixed the Python side to take num_class into account but did not remove the old workaround, so the tree calculation in the predictor is incorrect; see PredictBatch in CPUPredictor.
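
The double counting described above can be illustrated with a small, self-contained sketch. The two `python_limit_*` helpers mirror the assignments shown in the diff of this commit; `predictor_tree_count` is a hypothetical stand-in for the C++ workaround, assuming it scales the limit by the number of output groups. It is not the actual PredictBatch code.

```python
def python_limit_before_commit(best_iteration, num_parallel_tree, num_class):
    """Limit set by the Python layer before this commit (includes num_class)."""
    num_groups = 1 if num_class == 0 else num_class
    return (best_iteration + 1) * num_parallel_tree * num_groups


def python_limit_after_commit(best_iteration, num_parallel_tree):
    """Limit set by the Python layer after this commit (no num_class factor)."""
    return (best_iteration + 1) * num_parallel_tree


def predictor_tree_count(ntree_limit, num_class):
    """Hypothetical stand-in for the C++ workaround (an assumption):
    scale the incoming limit by the number of output groups."""
    num_groups = 1 if num_class == 0 else num_class
    return ntree_limit * num_groups


# Illustrative values only: 5 boosting rounds kept, 2 parallel trees, 3 classes.
best_iteration, num_parallel_tree, num_class = 4, 2, 3

# Trees that should actually be used: rounds * parallel trees * classes.
expected_trees = (best_iteration + 1) * num_parallel_tree * num_class

# num_class applied in both layers: the factor is counted twice.
trees_before = predictor_tree_count(
    python_limit_before_commit(best_iteration, num_parallel_tree, num_class),
    num_class,
)

# num_class applied only in the (assumed) predictor-side scaling.
trees_after = predictor_tree_count(
    python_limit_after_commit(best_iteration, num_parallel_tree),
    num_class,
)
```

Under these assumptions the factor `num_class` is applied twice before the commit and once after it.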
@@ -142,9 +142,7 @@ def _train_internal(params, dtrain,
         )
     else:
         raise ValueError(f'Unknown booster: {booster}')
-    num_groups = int(config['learner']['learner_model_param']['num_class'])
-    num_groups = 1 if num_groups == 0 else num_groups
-    bst.best_ntree_limit = (bst.best_iteration + 1) * num_parallel_tree * num_groups
+    bst.best_ntree_limit = (bst.best_iteration + 1) * num_parallel_tree
 
     # Copy to serialise and unserialise booster to reset state and free
     # training memory
@@ -184,9 +182,10 @@ def train(params, dtrain, num_boost_round=10, evals=(), obj=None, feval=None,
         If there's more than one metric in the **eval_metric** parameter given in
         **params**, the last metric will be used for early stopping.
         If early stopping occurs, the model will have three additional fields:
-        ``bst.best_score``, ``bst.best_iteration`` and ``bst.best_ntree_limit``.
-        (Use ``bst.best_ntree_limit`` to get the correct value if
-        ``num_parallel_tree`` and/or ``num_class`` appears in the parameters)
+        ``bst.best_score``, ``bst.best_iteration`` and ``bst.best_ntree_limit``. Use
+        ``bst.best_ntree_limit`` to get the correct value if ``num_parallel_tree`` and/or
+        ``num_class`` appears in the parameters. ``best_ntree_limit`` is the result of
+        ``num_parallel_tree * best_iteration``.
     evals_result: dict
         This dictionary stores the evaluation results of all the items in watchlist.
 
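
The updated docstring summarises the value as ``num_parallel_tree * best_iteration``, while the assignment in `_train_internal` uses `(bst.best_iteration + 1) * num_parallel_tree`. A minimal sketch of the code's version follows; the helper name `best_ntree_limit` and the sample values are illustrative only, and the `+ 1` is read here as reflecting a zero-based `best_iteration`.

```python
def best_ntree_limit(best_iteration: int, num_parallel_tree: int = 1) -> int:
    """Tree limit as computed by the assignment in _train_internal.

    best_iteration is treated as zero-based, so best_iteration + 1
    boosting rounds are kept; each round adds num_parallel_tree trees.
    """
    return (best_iteration + 1) * num_parallel_tree


# Illustrative values: best round is index 9, i.e. 10 rounds are kept.
limit_single = best_ntree_limit(9)        # one tree per round
limit_forest = best_ntree_limit(9, 4)     # four parallel trees per round
```

With a best iteration of 9, ten rounds are kept, giving a limit of 10 without parallel trees and 40 with four parallel trees per round.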