Philip Cho ba820847f9 Patch to improve multithreaded performance scaling (#2493)
* Patch to improve multithreaded performance scaling

Change the parallel strategy for histogram construction.
Rather than partitioning data rows among multiple threads, partition feature
columns. Useful heuristics for assigning partitions have been adopted
from the LightGBM project.
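
To illustrate the column-partitioned strategy described above, here is a minimal Python sketch, not the actual XGBoost implementation. The function names (`build_feature_histogram`, `build_histograms_by_feature`) and the toy data are assumptions made for illustration, and the LightGBM-derived partition heuristics are not reproduced. Each task owns one whole feature column, so no two threads ever write to the same histogram.

```python
from concurrent.futures import ThreadPoolExecutor


def build_feature_histogram(bin_column, gradients, n_bins):
    """Sum the gradients that fall into each bin of a single feature column."""
    hist = [0.0] * n_bins
    for bin_idx, grad in zip(bin_column, gradients):
        hist[bin_idx] += grad
    return hist


def build_histograms_by_feature(binned_columns, gradients, n_bins, n_threads=4):
    """Build one histogram per feature, handing whole columns to worker threads.

    Hypothetical helper: because a column is never split across threads,
    each histogram has exactly one writer and needs no locking or merging.
    """
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(
            pool.map(
                lambda col: build_feature_histogram(col, gradients, n_bins),
                binned_columns,
            )
        )


# Toy data: two features, four rows, bin indices already computed.
binned_columns = [[0, 1, 1, 2], [2, 2, 0, 1]]
gradients = [0.5, -1.0, 2.0, 0.25]
histograms = build_histograms_by_feature(binned_columns, gradients, n_bins=3)
```

With row partitioning, by contrast, every thread would contribute to every feature's histogram, so per-thread copies would have to be merged afterwards. Python threads will not show a real speedup here because of the interpreter lock; the sketch only shows how the work is divided.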

* Add missing header to satisfy MSVC

* Restore max_bin and related parameters to TrainParam

* Fix lint error

* inline functions do not require static keyword

* Feature grouping algorithm accepting FastHistParam

The feature grouping algorithm accepts many parameters (3+), and it gets annoying to
pass them one by one. Instead, simply pass a reference to FastHistParam. The
definition of FastHistParam has been moved to a separate header file to
accommodate this change.
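
The refactoring described above, passing one parameter object instead of several separate arguments, can be sketched in Python as follows. This is an illustration under stated assumptions: `GroupingParams` is a hypothetical stand-in for FastHistParam, its fields are invented for the example, and the chunking logic is a placeholder rather than the real grouping algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupingParams:
    """Hypothetical stand-in for FastHistParam; field names are illustrative."""
    enable_grouping: bool = True
    max_group_size: int = 2


def group_features(feature_ids, params):
    """Group feature ids using settings read from a single params object.

    Placeholder logic: consecutive features are chunked into groups of at
    most params.max_group_size. The real algorithm is not reproduced here.
    """
    if not params.enable_grouping:
        return [[fid] for fid in feature_ids]
    size = params.max_group_size
    return [feature_ids[i:i + size] for i in range(0, len(feature_ids), size)]


groups = group_features([0, 1, 2, 3, 4], GroupingParams(max_group_size=2))
ungrouped = group_features([0, 1, 2], GroupingParams(enable_grouping=False))
```

Adding a new setting then only touches the parameter class and the code that reads it, not every call site along the way.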
2017-07-07 08:25:07 -07:00

eXtreme Gradient Boosting

Gitter chat for developers at https://gitter.im/dmlc/xgboost

Documentation | Resources | Installation | Release Notes | RoadMap

XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework. XGBoost provides parallel tree boosting (also known as GBDT, GBM) that solves many data science problems quickly and accurately. The same code runs on major distributed environments (Hadoop, SGE, MPI) and can solve problems beyond billions of examples.

What's New

Ask a Question

Help to Make XGBoost Better

XGBoost has been developed and used by a group of active community members. Your help is very valuable to make the package better for everyone.

License

© Contributors, 2016. Licensed under an Apache-2 license.

Reference

Description
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
Languages
C++ 45.5%
Python 20.3%
Cuda 15.2%
R 6.8%
Scala 6.4%
Other 5.6%