* add back train method but mark as deprecated
* fix scalastyle error
* add dev script to update version and update versions
* update resource files
* Update SparkParallelismTracker.scala
* remove xgboost-tracker.properties
* [jvm-packages] Train Booster from an existing model
* Align Scala API with Java API
* Existing model should not load rabit checkpoint
* Address minor comments
* Implement saving temporary boosters and loading previous booster
* Add more unit tests for loadPrevBooster
* Add params to XGBoostEstimator
* (1) Move repartition out of the temp model saving loop (2) Address CR comments
* Catch a corner case of training next model with fewer rounds
* Address comments
* Refactor newly added methods into TmpBoosterManager
* Add two files which were missing from the previous commit
* Rename TmpBooster to checkpoint
* [jvm-packages] Fixed test/train persistence
Prior to this patch, both data sets were persisted in the same directory,
i.e. the test data replaced the training data, which led to
* training on less data (since usually test < train) and
* test loss being exactly equal to the training loss.
Closes #2945.
* Cleanup file cache after the training
* Addressed review comments
* [R] fix finding R.exe with cmake on WIN when it is in PATH
* [R] appveyor config for R package
* [R] wrap the lines to make R check happier
* [R] install only binary dep-packages in appveyor
* [R] for MSVC appveyor, also build a binary for R package and keep as an artifact
* [R] fix predict contributions for data with no colnames
* [R] add a render parameter for xgb.plot.multi.trees; fixes #2628
* [R] update Rd's
* [R] remove unnecessary dep-package from R cmake install
* silence type warnings; readability
* [R] silence complaint about incomplete line at the end
* [R] initial version of xgb.plot.shap()
* [R] more work on xgb.plot.shap
* [R] enforce black font in xgb.plot.tree; fixes #2640
* [R] if feature names are available, check in predict that they are the same; fixes #2857
* [R] cran check and lint fixes
* remove tabs
* [R] add references; a test for plot.shap
* Fix #2905
* Fix gpu_exact test failures
* Fix bug in GPU prediction where multiple calls to batch prediction can produce incorrect results
* Fix GPU documentation formatting
* Some minor changes to the code style
Some minor changes to the code style in file basic_walkthrough.py
* coding style changes
* coding style changes according to PEP8
* Update basic_walkthrough.py
* Fix minor typo
* Minor edits to coding style
Minor edits to coding style following the proposals of PEP8.
* [jvm-packages] Exposed train-time evaluation metrics
They are accessible via 'XGBoostModel.summary'. The summary is not
serialized with the model and is only available after the training.
* Addressed review comments
* Extracted model-related tests into 'XGBoostModelSuite'
* Added tests for copying the 'XGBoostModel'
* [jvm-packages] Fixed a subtle bug in train/test split
Iterator.partition (naturally) assumes that the predicate is deterministic,
but this is not the case for
    r.nextDouble() <= trainTestRatio
Therefore the DMatrix(...) call sometimes got a NoSuchElementException
and crashed the JVM due to the lack of exception handling in
XGBoost4jCallbackDataIterNext.
* Make sure train/test objectives are different
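The train/test split bug described above can be illustrated with a small Python analogue. This is only a sketch of the failure mode, not the Scala code from the patch: `naive_split` and `stable_split` are hypothetical names, and the two-pass filter stands in for any partition routine that may evaluate the predicate more than once per row.

```python
import random
from itertools import tee


def naive_split(rows, ratio, rng):
    """Split rows by evaluating a random predicate once per output side.

    Each side draws its own random number for the same row, so a row can
    end up in both outputs or in neither of them.
    """
    left, right = tee(rows)
    train = [r for r in left if rng.random() <= ratio]
    test = [r for r in right if not (rng.random() <= ratio)]
    return train, test


def stable_split(rows, ratio, rng):
    """Draw exactly one random number per row, then route the row."""
    train, test = [], []
    for r in rows:
        (train if rng.random() <= ratio else test).append(r)
    return train, test
```

With the single draw per row, every input row appears in exactly one of the two outputs, which is the property a deterministic predicate would have guaranteed.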
I found installing the Python XGBoost package problematic because the documentation on compiler requirements was unclear, as discussed in #1501, so I improved the README.
- Implement colsampling, subsampling for gpu_hist_experimental
- Optimised multi-GPU implementation for gpu_hist_experimental
- Make nccl optional
- Add Volta architecture flag
- Optimise RegLossObj
- Add timing utilities for debug verbose mode
- Bump required cuda version to 8.0
In the refactor to add base margins, #2532, all of the labels were lost
when creating the dmatrix. This became obvious as metrics like ndcg
always returned 1.0 regardless of the results.
Change-Id: I88be047e1c108afba4784bd3d892bfc9edeabe55
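To see why lost labels could make ndcg report 1.0 every time, here is a minimal sketch of an NDCG computation. It assumes the metric returns 1.0 when the ideal DCG is zero; that convention is an assumption made for illustration and is not stated in the commit message. The function names are hypothetical.

```python
import math


def dcg(labels):
    """Discounted cumulative gain of labels listed in ranked order."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(labels))


def ndcg(labels_in_predicted_order):
    """NDCG for one query group, given labels in predicted rank order."""
    ideal = dcg(sorted(labels_in_predicted_order, reverse=True))
    if ideal == 0.0:
        # Assumed convention: no relevant documents means a "perfect" score.
        return 1.0
    return dcg(labels_in_predicted_order) / ideal
```

Under that assumption, a group whose labels were all dropped to zero scores 1.0 no matter how the documents are ranked, while a group with real labels is penalised for a poor ordering.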
Training a model with the experimental rank:ndcg objective incorrectly
returns a Classification model. Adjust the classification check to
not recognize rank:* objectives as classification.
Writing tests for isClassificationTask also revealed that
obj_type -> regression was incorrectly identified as a classification
task, so the function was slightly adjusted to pass the new tests.
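The adjusted check might look roughly like the following Python sketch. The real function lives in the Scala package; the parameter lookup order and the treatment of missing parameters here are assumptions, and other non-classification objectives would need the same handling.

```python
def is_classification_task(params):
    """Guess whether training parameters describe a classification task.

    Hypothetical sketch: "objective" is consulted first, then "obj_type".
    """
    objective = params.get("objective", params.get("obj_type"))
    if objective is None:
        # Assumed default: without an objective, do not claim classification.
        return False
    name = str(objective)
    # Regression and ranking objectives are not classification.
    return name != "regression" and not name.startswith(("reg:", "rank:"))
```

For example, `{"objective": "rank:ndcg"}` and `{"obj_type": "regression"}` are both rejected, while `{"objective": "binary:logistic"}` is still treated as classification.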
* Some minor changes to the code style
Some minor changes to the code style in file basic_walkthrough.py
* coding style changes
* coding style changes according to PEP8
* Update basic_walkthrough.py
* Fatal error if GPU algorithm selected without GPU support compiled
* Resolve type conversion warnings
* Fix gpu unit test failure
* Fix compressed iterator edge case
* Fix python unit test failures due to flake8 update on pip
Problem:
Fast histogram updater crashes whenever subsampling picks zero rows
Diagnosis:
Row set data structure uses "nullptr" internally to indicate a non-existent
row set. Since you cannot take the address of the first element of an empty
vector, a valid row set ends up getting "nullptr" as well.
Fix:
Use an arbitrary value (not equal to "nullptr") to bypass nullptr check.
* Only set OpenMP_CXX_FLAGS when OpenMP is found
I found this while trying to get the Mac build working without OpenMP. Tips in
issue #2596 helped point me in the right direction.
* Revise check
* Trigger codecov