xgboost

Files

Sergei Lebedev 69c3b78a29 [jvm-packages] Implemented early stopping (#2710)

* Allowed subsampling the test set from the training data frame/RDD

The implementation requires storing a (1 - trainTestRatio) fraction of
the points in memory to make the sampling work.

An alternative approach would be to construct the full DMatrix and then
slice it deterministically into train/test. The peak memory consumption
in that scenario, however, is twice the dataset size.
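
The trade-off above can be illustrated with a minimal sketch. It assumes a single pass over an iterator of points and a seeded random number generator. `TrainTestSplit` and `splitTrainTest` are hypothetical names used only for illustration and are not part of xgboost4j. Only the held-out points are buffered, so the extra memory is roughly a (1 - trainTestRatio) fraction of the data.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.Random

// Hypothetical helper for illustration; not the xgboost4j implementation.
object TrainTestSplit {
  // Streams `points` once. Training points are yielded lazily, while
  // held-out points are appended to the returned buffer as a side effect.
  // The buffer is complete only after the training iterator is exhausted.
  def splitTrainTest[T](
      points: Iterator[T],
      trainTestRatio: Double,
      seed: Long): (Iterator[T], ArrayBuffer[T]) = {
    val rng = new Random(seed) // a fixed seed makes the split reproducible
    val testBuffer = ArrayBuffer.empty[T]
    val train = points.filter { p =>
      val keep = rng.nextDouble() <= trainTestRatio
      if (!keep) testBuffer += p
      keep
    }
    (train, testBuffer)
  }
}
```

Because the generator is seeded, two runs over the same input with the same seed produce the same split.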

* Removed duplication from 'XGBoost.train'

Scala callers can (and should) use names to supply a subset of
parameters. Method overloading is not required.
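
As a sketch of the point about named arguments: one method with default parameter values can stand in for a family of overloads. The signature below is hypothetical. Its parameter names and defaults are assumptions chosen for illustration and are not the actual 'XGBoost.train' signature.

```scala
// Hypothetical signature for illustration; not the real XGBoost.train.
object TrainSketch {
  def train(
      params: Map[String, Any],
      round: Int,
      earlyStoppingRound: Int = 0,   // assumed default: early stopping disabled
      trainTestRatio: Double = 1.0   // assumed default: no held-out test set
  ): String =
    s"round=$round earlyStoppingRound=$earlyStoppingRound trainTestRatio=$trainTestRatio"

  // A caller names only the parameters it wants to override.
  val example: String =
    train(Map("eta" -> 0.1), round = 100, trainTestRatio = 0.8)
}
```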

* Reuse XGBoost seed parameter to stabilize train/test splitting

* Added early stopping support to non-distributed XGBoost

Closes #1544
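
For readers unfamiliar with the technique, here is a minimal sketch of a common patience-based early stopping rule. It assumes a validation error where lower is better. The exact stopping criterion used in this commit may differ, and `EarlyStoppingSketch`, `roundsRun`, and `evalAt` are hypothetical names, not xgboost4j API.

```scala
// Generic patience-based early stopping; not the xgboost4j implementation.
object EarlyStoppingSketch {
  // Runs up to `maxRounds` boosting rounds and returns how many were run.
  // `evalAt(i)` is assumed to return the validation error after round i.
  // Training stops once the error has not improved for
  // `earlyStoppingRound` consecutive rounds; 0 disables early stopping.
  def roundsRun(maxRounds: Int, earlyStoppingRound: Int)(evalAt: Int => Double): Int = {
    var best = Double.PositiveInfinity
    var bestRound = -1
    var i = 0
    var stop = false
    while (i < maxRounds && !stop) {
      val err = evalAt(i)
      if (err < best) {
        best = err
        bestRound = i
      } else if (earlyStoppingRound > 0 && i - bestRound >= earlyStoppingRound) {
        stop = true
      }
      i += 1
    }
    i
  }
}
```

The rule needs a validation metric that is computed on data not used for fitting, which is why a held-out test set is a prerequisite.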

* Added early-stopping to distributed XGBoost

* Moved construction of 'watches' into a separate method

This commit also fixes the handling of 'baseMargin' which previously
was not added to the validation matrix.

* Addressed review comments

2017-09-29 12:06:22 -07:00

src/main/scala/ml/dmlc/xgboost4j/scala/flink

[jvm-packages] Implemented early stopping (#2710)

2017-09-29 12:06:22 -07:00

pom.xml

Removed 'flink.suffix' and added 'flink.version' (#2277)

2017-05-10 08:42:40 -07:00