
Philip Cho 14fba01b5a Improve multi-threaded performance (#2104)

* Add UpdatePredictionCache() option to updaters

Some updaters (e.g. fast_hist) have enough information to quickly compute
the prediction cache for the training data. Each updater may override the
UpdatePredictionCache() method to update the prediction cache. Note: this
trick does not apply to validation data.
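
The override mechanism described above can be sketched as follows. This is a minimal illustration, not the actual XGBoost interface: the class names, the simplified `UpdatePredictionCache` signature, the per-row leaf bookkeeping, and the helper `TryFastCacheUpdate` are all assumptions made for the example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a tree updater interface.
class Updater {
 public:
  virtual ~Updater() = default;
  // Returns true if the updater refreshed `preds` itself.
  // The default implementation declines, so the caller must fall back
  // to a full prediction pass.
  virtual bool UpdatePredictionCache(std::vector<float>* preds) {
    (void)preds;
    return false;
  }
};

// Hypothetical updater that remembers which leaf each training row
// ended up in while it grew the newest tree.
class CachingUpdater : public Updater {
 public:
  CachingUpdater(std::vector<std::size_t> row_to_leaf,
                 std::vector<float> leaf_value)
      : row_to_leaf_(std::move(row_to_leaf)),
        leaf_value_(std::move(leaf_value)) {}

  bool UpdatePredictionCache(std::vector<float>* preds) override {
    if (preds->size() != row_to_leaf_.size()) return false;
    // Add only the newest tree's contribution to each cached prediction.
    for (std::size_t i = 0; i < preds->size(); ++i) {
      (*preds)[i] += leaf_value_[row_to_leaf_[i]];
    }
    return true;
  }

 private:
  std::vector<std::size_t> row_to_leaf_;
  std::vector<float> leaf_value_;
};

// Caller side: try the fast path and report whether it was taken.
bool TryFastCacheUpdate(Updater* updater, std::vector<float>* preds) {
  return updater->UpdatePredictionCache(preds);
}
```

In this sketch the fast path works only for rows the updater saw during training, which is consistent with the note that the trick does not apply to validation data.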

* Respond to code review

* Disable some debug messages by default
* Document UpdatePredictionCache() interface
* Remove base_margin logic from UpdatePredictionCache() implementation
* Do not take pointer to cfg, as reference may get stale

* Improve multi-threaded performance

* Use columnwise accessor to accelerate ApplySplit() step,
  with support for a compressed representation
* Parallel sort for evaluation step
* Inline BuildHist() function
* Cache gradient pairs when building histograms in BuildHist()

* Add missing #if macro

* Respond to code review

* Use wrapper to enable parallel sort on Linux

* Fix C++ compatibility issues

* MSVC doesn't support unsigned in OpenMP loops
* gcc 4.6 doesn't support using keyword
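
The two compatibility notes above can be illustrated with a small sketch. It assumes the usual workarounds: a signed loop index for OpenMP `parallel for` loops and a `typedef` in place of a `using` alias. The names `omp_index_t` and `ScaleInPlace` are hypothetical and do not come from the XGBoost source.

```cpp
#include <vector>

// typedef rather than a 'using' alias, so that older compilers accept it.
// A signed type is chosen because some OpenMP implementations reject
// unsigned loop indices in 'parallel for'.
typedef long omp_index_t;

// Multiply every element by `factor`, in parallel when OpenMP is enabled.
// Without OpenMP the pragma is ignored and the loop runs serially.
void ScaleInPlace(std::vector<float>* values, float factor) {
  const omp_index_t n = static_cast<omp_index_t>(values->size());
  #pragma omp parallel for schedule(static)
  for (omp_index_t i = 0; i < n; ++i) {
    (*values)[i] *= factor;
  }
}
```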

* Fix lint issues

* Respond to code review

* Fix bug in ApplySplitSparseData()

* Attempting to read beyond the end of a sparse column
* Mishandling the case where an entire range of rows has missing values

* Fix training continuation bug

Disable UpdatePredictionCache() in the first iteration. This way, we can
accommodate the scenario where we build off of an existing (nonempty) ensemble.
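
One way to read this fix, shown here as a toy sketch rather than the actual implementation: an incremental cache update adds only the newest tree's contribution, so it is valid only once the cache already reflects every earlier tree. The `ToyEnsemble` type, the single-row cache, and both function names are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model: each "tree" contributes one constant value
// to the prediction for a single row.
struct ToyEnsemble {
  std::vector<float> tree_values;
};

// Slow path: sum the contribution of every tree in the ensemble.
float FullPredict(const ToyEnsemble& model) {
  float sum = 0.0f;
  for (std::size_t i = 0; i < model.tree_values.size(); ++i) {
    sum += model.tree_values[i];
  }
  return sum;
}

// Refresh the cached prediction after a boosting iteration has appended
// one tree. `iteration` counts iterations of the current training run.
void RefreshCache(const ToyEnsemble& model, int iteration, float* cache) {
  if (iteration == 0 || model.tree_values.empty()) {
    // First iteration: the cache may not yet include trees loaded from
    // an existing ensemble, so recompute from scratch.
    *cache = FullPredict(model);
  } else {
    // Later iterations: the cache is up to date except for the newest tree.
    *cache += model.tree_values.back();
  }
}
```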

* Add regression test for fast_hist

* Respond to code review

* Add back old version of ApplySplitSparseData

2017-03-25 10:35:01 -07:00

| Name | Last commit message | Last commit date |
|---|---|---|
| amalgamation | Histogram Optimized Tree Grower (#1940) | 2017-01-13 09:25:55 -08:00 |
| demo | Update md grammar for the README.md (#2141) | 2017-03-23 11:02:06 -07:00 |
| dmlc-core @ 2b75a0ce6f | [UPDATE] Update rabit and threadlocal (#2114) | 2017-03-16 18:48:37 -07:00 |
| doc | Formatting fixed for CLI parameters (#2145) | 2017-03-24 08:54:58 -07:00 |
| include/xgboost | Improve multi-threaded performance (#2104) | 2017-03-25 10:35:01 -07:00 |
| jvm-packages | [jvm-packages] call setGroup for ranking task (#2066) | 2017-03-06 15:45:06 -08:00 |
| make | config.mk: Set TEST_COVER to 0 by default (#1853) | 2016-12-11 19:48:15 +01:00 |
| plugin | Fix cmake build for linux. Update GPU benchmarks. (#1904) | 2016-12-23 09:18:56 +01:00 |
| python-package | bugfix: when metric's name contains - (#2090) | 2017-03-16 10:36:39 -07:00 |
| R-package | Typo Issue (#2100) | 2017-03-16 10:38:25 -07:00 |
| rabit @ a764d45cfb | [UPDATE] Update rabit and threadlocal (#2114) | 2017-03-16 18:48:37 -07:00 |
| src | Improve multi-threaded performance (#2104) | 2017-03-25 10:35:01 -07:00 |
| tests | Improve multi-threaded performance (#2104) | 2017-03-25 10:35:01 -07:00 |
| .gitignore | [jvm-packages] Scala/Java interface for Fast Histogram Algorithm (#1966) | 2017-03-04 15:37:24 -08:00 |
| .gitmodules | [REFACTOR] cleanup structure | 2016-01-16 10:24:00 -08:00 |
| .travis.yml | new thread local requires xcode8 | 2017-03-17 09:40:34 -07:00 |
| appveyor.yml | GPU plug-in improvements + basic Windows continuous integration (#1752) | 2016-11-10 12:34:09 -08:00 |
| build.sh | Minor fix on installation guide and (the probably deprecated) build script | 2016-02-24 12:46:37 +08:00 |
| CMakeLists.txt | Fix cmake build for linux. Update GPU benchmarks. (#1904) | 2016-12-23 09:18:56 +01:00 |
| CONTRIBUTORS.md | Use bst_float consistently throughout (#1824) | 2016-11-30 10:02:10 -08:00 |
| ISSUE_TEMPLATE.md | issue template (#1475) | 2016-08-17 22:50:37 -07:00 |
| LICENSE | update year in LICENSE, conf.py and README.md files | 2016-03-15 16:51:34 +03:00 |
| Makefile | ENH more makefile updates (#2133) | 2017-03-22 16:22:15 -05:00 |
| NEWS.md | [UPDATE] Update rabit and threadlocal (#2114) | 2017-03-16 18:48:37 -07:00 |
| README.md | change contribution link to open issues (#1834) | 2016-12-02 11:03:03 -08:00 |

README.md

eXtreme Gradient Boosting

Documentation | Resources | Installation | Release Notes | RoadMap

XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework. XGBoost provides parallel tree boosting (also known as GBDT or GBM) that solves many data science problems in a fast and accurate way. The same code runs on major distributed environments (Hadoop, SGE, MPI) and can solve problems beyond billions of examples.

What's New

Ask a Question

To report bugs, please use the xgboost/issues page.
For generic questions, or to share your experience using XGBoost, please use the XGBoost User Group.

Help to Make XGBoost Better

XGBoost has been developed and used by a group of active community members. Your help is very valuable to make the package better for everyone.

Check out call for contributions and Roadmap to see what can be improved, or open an issue if you want something.
Contribute to the documents and examples to share your experience with other users.
Add your stories and experience to Awesome XGBoost.
Please add your name to CONTRIBUTORS.md after your patch has been merged.
Please also update NEWS.md on changes and improvements in API and docs.

License

Reference

Tianqi Chen and Carlos Guestrin. XGBoost: A Scalable Tree Boosting System. In 22nd SIGKDD Conference on Knowledge Discovery and Data Mining, 2016.
XGBoost originates from a research project at the University of Washington; see also the Project Page at UW.

Languages

C++ 45.5%

Python 20.3%

Cuda 15.2%

R 6.8%

Scala 6.4%

Other 5.6%