Philip Cho 2715baef64 Fix bugs in multithreaded ApplySplitSparseData() (#2161)

* Bugfix 1: Fix segfault in multithreaded ApplySplitSparseData()

When there are more threads than rows in rowset, some threads end up
with empty ranges, causing them to crash. (iend - 1 needs to be
accessible as part of algorithm)

Fix: run only those threads with nonempty ranges.
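
As a rough illustration of Bugfix 1 (not the C++ code in the patch), the Python sketch below splits a row range evenly across threads and keeps only the nonempty ranges. The function name `nonempty_ranges` and the even-split formula are assumptions made for this example.

```python
def nonempty_ranges(n_rows, n_threads):
    """Split [0, n_rows) across n_threads and drop ranges that hold no rows."""
    ranges = []
    for tid in range(n_threads):
        # Hypothetical even split; the real partitioning may differ.
        ibegin = tid * n_rows // n_threads
        iend = (tid + 1) * n_rows // n_threads
        # A thread with ibegin == iend has no rows, so index iend - 1 would
        # fall outside its range. Skip it instead of running it.
        if ibegin < iend:
            ranges.append((ibegin, iend))
    return ranges


# 3 rows on 8 threads: only 3 threads get work.
print(nonempty_ranges(3, 8))  # [(0, 1), (1, 2), (2, 3)]
```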

* Add regression test for Bugfix 1

* Moving python_omp_test to existing python test group

Turns out you don't need to set "OMP_NUM_THREADS" to enable
multithreading. Just add nthread parameter.

* Bugfix 2: Fix corner case of ApplySplitSparseData() for categorical feature

When split value is less than all cut points, split_cond is set
incorrectly.

Fix: set split_cond = -1 to indicate this scenario
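
One way to picture the corner case in Bugfix 2 is the Python sketch below, which maps a floating-point split value to the index of the last cut point that does not exceed it. The helper name `split_value_to_bin` and the "last cut point not above the value" rule are assumptions for illustration; the patch itself may compute split_cond differently.

```python
import bisect


def split_value_to_bin(cut_points, split_value):
    """Return the index of the last cut point <= split_value, or -1 if none.

    cut_points must be sorted in ascending order (an assumption of this sketch).
    """
    # bisect_right counts how many cut points are <= split_value, so
    # subtracting 1 gives the index of the last one, and -1 when the split
    # value is below every cut point.
    return bisect.bisect_right(cut_points, split_value) - 1


cuts = [0.5, 1.5, 2.5]
print(split_value_to_bin(cuts, 2.0))  # 1
print(split_value_to_bin(cuts, 0.1))  # -1: below all cut points
```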

* Bugfix 3: Initialize data layout indicator before using it

data_layout_ is accessed before being set; this variable determines
whether feature 0 is included in feat_set.

Fix: re-order code in InitData() to initialize data_layout_ first

* Adding regression test for Bugfix 2

Unfortunately, no regression test for Bugfix 3, as there is no
way to deterministically assign value to an uninitialized variable.

2017-04-02 11:37:39 -07:00

amalgamation

Histogram Optimized Tree Grower (#1940)

2017-01-13 09:25:55 -08:00

demo

Nonreproducible sequence of evaluations fixed (#2153)

2017-03-29 10:11:23 -07:00

dmlc-core @ b5bec5481d

Remove xgboost's thread_local and switch to dmlc::ThreadLocalStore (#2121)

2017-03-27 09:09:18 -07:00

doc

Formatting fixed for CLI parameters (#2145)

2017-03-24 08:54:58 -07:00

include/xgboost

Fix bugs in multithreaded ApplySplitSparseData() (#2161)

2017-04-02 11:37:39 -07:00

jvm-packages

[jvm-packages] call setGroup for ranking task (#2066)

2017-03-06 15:45:06 -08:00

make

config.mk: Set TEST_COVER to 0 by default (#1853)

2016-12-11 19:48:15 +01:00

plugin

GPU Plugin: Bug fix #2048 (#2155)

2017-03-29 10:10:57 -07:00

python-package

bugfix: when metric's name contains - (#2090)

2017-03-16 10:36:39 -07:00

R-package

Typo Issue (#2100)

2017-03-16 10:38:25 -07:00

rabit @ a764d45cfb

[UPDATE] Update rabit and threadlocal (#2114)

2017-03-16 18:48:37 -07:00

src

Fix bugs in multithreaded ApplySplitSparseData() (#2161)

2017-04-02 11:37:39 -07:00

tests

Fix bugs in multithreaded ApplySplitSparseData() (#2161)

2017-04-02 11:37:39 -07:00

.gitignore

[jvm-packages] Scala/Java interface for Fast Histogram Algorithm (#1966)

2017-03-04 15:37:24 -08:00

.gitmodules

[REFACTOR] cleanup structure

2016-01-16 10:24:00 -08:00

.travis.yml

new thread local requires xcode8

2017-03-17 09:40:34 -07:00

appveyor.yml

GPU plug-in improvements + basic Windows continuous integration (#1752)

2016-11-10 12:34:09 -08:00

build.sh

Minor fix on installation guide and (the probably deprecated) build script

2016-02-24 12:46:37 +08:00

CMakeLists.txt

Fix cmake build for linux. Update GPU benchmarks. (#1904)

2016-12-23 09:18:56 +01:00

CONTRIBUTORS.md

Use bst_float consistently throughout (#1824)

2016-11-30 10:02:10 -08:00

ISSUE_TEMPLATE.md

issue template (#1475)

2016-08-17 22:50:37 -07:00

LICENSE

update year in LICENSE, conf.py and README.md files

2016-03-15 16:51:34 +03:00

Makefile

ENH more makefile updates (#2133)

2017-03-22 16:22:15 -05:00

NEWS.md

[UPDATE] Update rabit and threadlocal (#2114)

2017-03-16 18:48:37 -07:00

README.md

change contribution link to open issues (#1834)

2016-12-02 11:03:03 -08:00

README.md

eXtreme Gradient Boosting

Documentation | Resources | Installation | Release Notes | RoadMap

XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework. XGBoost provides parallel tree boosting (also known as GBDT, GBM) that solves many data science problems in a fast and accurate way. The same code runs on major distributed environments (Hadoop, SGE, MPI) and can solve problems beyond billions of examples.

What's New

Ask a Question

To report bugs, please use the xgboost/issues page.
For generic questions or to share your experience using XGBoost, please use the XGBoost User Group.

Help to Make XGBoost Better

XGBoost has been developed and used by a group of active community members. Your help is very valuable to make the package better for everyone.

Check out call for contributions and Roadmap to see what can be improved, or open an issue if you want something.
Contribute to the documents and examples to share your experience with other users.
Add your stories and experience to Awesome XGBoost.
Please add your name to CONTRIBUTORS.md after your patch has been merged.
- Please also update NEWS.md on changes and improvements in API and docs.

License

Reference

Tianqi Chen and Carlos Guestrin. XGBoost: A Scalable Tree Boosting System. In 22nd SIGKDD Conference on Knowledge Discovery and Data Mining, 2016.
XGBoost originates from a research project at the University of Washington; see also the Project Page at UW.

Languages

C++ 45.5%

Python 20.3%

Cuda 15.2%

R 6.8%

Scala 6.4%

Other 5.6%