Jiaming Yuan 7663de956c Run training with empty DMatrix. (#4990 )

This makes GPU Hist robust in a distributed environment, as some workers might
not be associated with any data in either training or evaluation.

* Disable rabit mock test for now: See #5012.

* Disable dask-cudf test at prediction for now: See #5003

* Launch dask job for all workers even though they might not have any data.
* Check 0 rows in elementwise evaluation metrics.

   Using AUC and AUC-PR still throws an error.  See #4663 for a robust fix.

* Add tests for edge cases.
* Add `LaunchKernel` wrapper handling zero sized grid.
* Move some parts of allreducer into a cu file.
* Don't validate feature names when the booster is empty.

* Sync number of columns in DMatrix.

  As num_feature is required to be the same across all workers in data split
  mode.

* Filtering in dask interface now by default syncs all boosters that are not
empty, instead of using rank 0.

* Fix Jenkins' GPU tests.

* Install dask-cuda from source in Jenkins' test.

  Now all tests are actually running.

* Restore GPU Hist tree synchronization test.

* Check UUID of running devices.

  The check is only performed on CUDA version >= 10.x, as 9.x doesn't have UUID field.

* Fix CMake policy and project variables.

  Use xgboost_SOURCE_DIR uniformly, add policy for CMake >= 3.13.

* Fix copying data to CPU

* Fix race condition in cpu predictor.

* Fix duplicated DMatrix construction.

* Don't download extra nccl in CI script.

2019-11-06 16:13:13 +08:00
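
The commit message above mentions checking for 0 rows in elementwise evaluation metrics when some workers hold no data. A minimal sketch of that idea follows, in plain Python. The names `local_squared_error`, `allreduce_sum`, and `global_rmse` are hypothetical and are not XGBoost functions, the allreduce is simulated with an ordinary sum over a list, and returning NaN when no worker has any rows is an assumption of this sketch rather than documented XGBoost behavior.

```python
from typing import List, Sequence, Tuple


def local_squared_error(
    preds: Sequence[float], labels: Sequence[float], weights: Sequence[float]
) -> Tuple[float, float]:
    """Return (weighted squared-error sum, weight sum) for one worker's rows.

    A worker with an empty partition simply contributes (0.0, 0.0).
    """
    residue = sum(w * (p - y) ** 2 for p, y, w in zip(preds, labels, weights))
    return float(residue), float(sum(weights))


def allreduce_sum(partials: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Hypothetical stand-in for a distributed sum-allreduce over workers."""
    return (sum(r for r, _ in partials), sum(w for _, w in partials))


def global_rmse(partials: List[Tuple[float, float]]) -> float:
    """Combine per-worker partial sums into one RMSE value."""
    residue, weight = allreduce_sum(partials)
    if weight == 0.0:
        # No worker had any rows: avoid dividing by zero (assumed behavior).
        return float("nan")
    return (residue / weight) ** 0.5


# Worker 0 holds two rows, worker 1 holds none.
worker0 = local_squared_error([1.0, 3.0], [1.0, 1.0], [1.0, 1.0])
worker1 = local_squared_error([], [], [])
rmse = global_rmse([worker0, worker1])
```

Because every worker reports a pair of partial sums, an empty worker still takes part in the reduction without changing the result.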

.github

Enable auto-locking of issues closed long ago (#3821 )

2018-10-23 19:21:58 -07:00

amalgamation

Write ELLPACK pages to disk (#4879 )

2019-10-22 23:44:32 -04:00

cmake

Run training with empty DMatrix. (#4990 )

2019-11-06 16:13:13 +08:00

cub @ b20808b1b0

Update cub submodule again (fixes GPU build) (#2599 )

2017-08-13 22:14:40 +12:00

demo

Don't set_params at the end of set_state. (#4947 )

2019-10-15 10:08:26 -04:00

dev

[RFC] Version 0.90 release candidate (#4475 )

2019-05-20 01:02:44 -07:00

dmlc-core @ 20676bbc41

Update dmlc-core. (#4924 )

2019-10-09 23:16:45 -04:00

doc

Document minimum version required for gtest [skip ci] (#5001 )

2019-10-31 15:47:50 -07:00

include/xgboost

Use `UpdateAllowUnknown' for non-model related parameter. (#4961 )

2019-10-23 05:50:12 -04:00

jvm-packages

[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4966 )

2019-11-01 14:21:19 -07:00

make

Remove plugin, cuda related code in automake & autoconf files (#4789 )

2019-08-18 16:54:34 -04:00

plugin

Use `UpdateAllowUnknown' for non-model related parameter. (#4961 )

2019-10-23 05:50:12 -04:00

python-package

Run training with empty DMatrix. (#4990 )

2019-11-06 16:13:13 +08:00

R-package

Deprecate set group (#4864 )

2019-09-17 21:26:54 -04:00

rabit @ 2f25347168

[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4966 )

2019-11-01 14:21:19 -07:00

src

Run training with empty DMatrix. (#4990 )

2019-11-06 16:13:13 +08:00

tests

Run training with empty DMatrix. (#4990 )

2019-11-06 16:13:13 +08:00

.clang-tidy

Fix CPU hist init for sparse dataset. (#4625 )

2019-07-04 16:27:03 -07:00

.editorconfig

Added configuration for python into .editorconfig (#3494 )

2018-07-23 00:24:10 -07:00

.gitignore

ignore vscode and clion files (#4866 )

2019-09-17 21:27:40 -04:00

.gitmodules

Upgrading to NCCL2 (#3404 )

2018-07-10 00:42:15 -07:00

.travis.yml

[rabit_bootstrap_cache ] failed xgb worker recover from other workers (#4808 )

2019-09-16 23:31:52 -04:00

appveyor.yml

Remove VC-2013 support. (#4701 )

2019-07-25 01:28:51 -04:00

CITATION

simplify software citation (#2912 )

2017-12-01 02:58:13 -08:00

CMakeLists.txt

Run training with empty DMatrix. (#4990 )

2019-11-06 16:13:13 +08:00

CONTRIBUTORS.md

add os.PathLike support for file paths to DMatrix and Booster Python classes (#4757 )

2019-08-15 04:46:25 -04:00

Jenkinsfile

Run training with empty DMatrix. (#4990 )

2019-11-06 16:13:13 +08:00

Jenkinsfile-win64

[CI] Upload master branch artifacts to S3 root [skip ci] (#4979 )

2019-10-23 22:39:04 -07:00

LICENSE

fixed year to 2019 in conf.py, helpers.h and LICENSE (#4661 )

2019-07-15 12:29:12 -04:00

Makefile

Remove plugin, cuda related code in automake & autoconf files (#4789 )

2019-08-18 16:54:34 -04:00

NEWS.md

[RFC] Version 0.90 release candidate (#4475 )

2019-05-20 01:02:44 -07:00

README.md

Mention dask in readme. [skip ci] (#4942 )

2019-10-14 03:44:08 -04:00

README.md

eXtreme Gradient Boosting

Community | Documentation | Resources | Contributors | Release Notes

XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework. XGBoost provides parallel tree boosting (also known as GBDT, GBM) that solves many data science problems in a fast and accurate way. The same code runs on major distributed environments (Kubernetes, Hadoop, SGE, MPI, Dask) and can solve problems beyond billions of examples.
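
To give a feel for what tree boosting means, here is a toy sketch in plain Python: each round fits a one-split tree (a stump) to the current residuals under squared error and adds a damped copy of it to the model. This is not XGBoost's implementation or API. The names `fit_stump`, `predict`, and `boost` are hypothetical, and the sketch assumes a single feature and a non-empty training set.

```python
from typing import List, Tuple

Stump = Tuple[float, float, float]  # (threshold, left_value, right_value)


def fit_stump(xs: List[float], residuals: List[float]) -> Stump:
    """Pick the single-feature split that minimizes squared error."""
    best_error = float("inf")
    best_stump: Stump = (0.0, 0.0, 0.0)
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        # Each leaf predicts the mean residual of the rows it receives.
        left_value = sum(left) / len(left) if left else 0.0
        right_value = sum(right) / len(right) if right else 0.0
        error = sum((r - left_value) ** 2 for r in left)
        error += sum((r - right_value) ** 2 for r in right)
        if error < best_error:
            best_error, best_stump = error, (threshold, left_value, right_value)
    return best_stump


def predict(base: float, stumps: List[Stump], learning_rate: float, x: float) -> float:
    """Sum the base score and the damped output of every stump."""
    total = base
    for threshold, left_value, right_value in stumps:
        total += learning_rate * (left_value if x <= threshold else right_value)
    return total


def boost(
    xs: List[float], ys: List[float], rounds: int = 20, learning_rate: float = 0.5
) -> Tuple[float, List[Stump]]:
    """Fit `rounds` stumps, each one to the residuals left by the previous ones."""
    base = sum(ys) / len(ys)
    stumps: List[Stump] = []
    for _ in range(rounds):
        residuals = [
            y - predict(base, stumps, learning_rate, x) for x, y in zip(xs, ys)
        ]
        stumps.append(fit_stump(xs, residuals))
    return base, stumps


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
base, stumps = boost(xs, ys)
low = predict(base, stumps, 0.5, 2.0)
high = predict(base, stumps, 0.5, 5.0)
```

On this step-shaped data the predictions approach the two plateau values as rounds accumulate. A real library adds regularization, multiple features, deeper trees, and parallel and distributed split finding on top of this basic loop.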

License

Contribute to XGBoost

XGBoost has been developed and used by a group of active community members. Your help is very valuable to make the package better for everyone. Check out the Community Page.

Reference

Tianqi Chen and Carlos Guestrin. XGBoost: A Scalable Tree Boosting System. In 22nd SIGKDD Conference on Knowledge Discovery and Data Mining, 2016.
XGBoost originates from a research project at the University of Washington.