[jvm-packages]support multiple validation datasets in Spark (#3910)

* add back train method but mark as deprecated

* fix scalastyle error

* wrap iterators

* enable copartition training and validationset

* add parameters

* converge code path and have init unit test

* enable multi evals for ranking

* unit test and doc

* update example

* fix early stopping

* address the offline comments

* update doc

* test eval metrics

* fix compilation issue

* fix example
Nan Zhu
2018-12-17 21:03:57 -08:00
committed by GitHub
parent c8c7b9649c
commit c055a32609
14 changed files with 477 additions and 136 deletions

@@ -200,6 +200,11 @@ In additional to ``num_early_stopping_rounds``, you also need to define ``maximi
After specifying these two parameters, training stops when the evaluation metric moves in the direction opposite to the one specified by ``maximize_evaluation_metrics`` for ``num_early_stopping_rounds`` consecutive iterations.
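
As a minimal sketch of the early-stopping setup described above, the two parameters can be supplied in the parameter map passed to the estimator. The objective, metric, round counts, and worker count below are illustrative assumptions, not recommended values.

.. code-block:: scala

  import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

  // Illustrative values only. "logloss" is a lower-is-better metric,
  // so maximize_evaluation_metrics is set to false.
  val xgbParams: Map[String, Any] = Map(
    "objective" -> "binary:logistic",
    "eval_metric" -> "logloss",
    "num_round" -> 200,
    "num_workers" -> 2,
    "num_early_stopping_rounds" -> 10,
    "maximize_evaluation_metrics" -> false)

  val xgbClassifier = new XGBoostClassifier(xgbParams)
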
Training with Evaluation Sets
-----------------------------
You can also monitor the performance of the model during training with multiple evaluation datasets. By specifying ``eval_sets`` or calling ``setEvalSets`` on an XGBoostClassifier or XGBoostRegressor, you can pass in multiple evaluation datasets as a Map from String to DataFrame.
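
The following is a minimal sketch of passing two named evaluation sets through ``setEvalSets``. The input path, the column names, the split ratios, and the keys ``eval1`` and ``eval2`` are hypothetical and chosen only for illustration.

.. code-block:: scala

  import org.apache.spark.sql.{DataFrame, SparkSession}
  import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

  val spark = SparkSession.builder().appName("eval-sets-sketch").getOrCreate()

  // Hypothetical input: a DataFrame with a "features" vector column
  // and a numeric "label" column.
  val data: DataFrame = spark.read.parquet("/path/to/data.parquet")
  val Array(train, eval1, eval2) =
    data.randomSplit(Array(0.8, 0.1, 0.1), seed = 42L)

  val xgbClassifier = new XGBoostClassifier(Map(
      "objective" -> "binary:logistic",
      "num_round" -> 100,
      "num_workers" -> 2))
    .setFeaturesCol("features")
    .setLabelCol("label")
    // Each key names an evaluation dataset; each value is the DataFrame to evaluate on.
    .setEvalSets(Map("eval1" -> eval1, "eval2" -> eval2))

  val model = xgbClassifier.fit(train)

The evaluation DataFrames here share the schema of the training DataFrame, since they are produced by the same ``randomSplit`` call.
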
Prediction
==========