xgboost: eXtreme Gradient Boosting
An optimized, general-purpose gradient boosting (tree) library.
Contributors: https://github.com/tqchen/xgboost/graphs/contributors
Tutorial and Documentation: https://github.com/tqchen/xgboost/wiki
Questions and Issues: https://github.com/tqchen/xgboost/issues
xgboost-unity
Experimental branch (not yet usable): a refactoring of xgboost for cleaner code and more flexibility.
Build
- Simply type make
- If your compiler does not support OpenMP, it will emit a warning saying the code will be compiled in single-thread mode, and you will get a single-threaded xgboost
- If you get an error that -lgomp is not found, you can remove the -fopenmp flag in the Makefile to get a single-threaded xgboost, or upgrade your compiler to build the multi-threaded version; both paths are sketched below
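A minimal sketch of the two build paths described above. It assumes the -fopenmp flag appears literally in the Makefile; adjust to match the actual file:

```sh
# multi-threaded build (requires a compiler with OpenMP support)
make

# single-threaded fallback: strip the OpenMP flag, then rebuild
# (assumes -fopenmp appears literally in the Makefile)
sed -i 's/-fopenmp//g' Makefile
make
```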
Project Logical Layout
- Dependency order: io->learner->gbm->tree
- All modules depend on data.h
- tree contains the implementations of the tree construction algorithms.
- gbm is the gradient boosting interface, which takes trees and other base learners to do boosting.
- gbm only takes gradients as sufficient statistics; it does not compute the gradient itself.
- learner is the learning module that computes the gradient for a specific objective and passes it to gbm; see the sketch after this list.
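A minimal C++ sketch of this division of labor. The names here (IGradBooster, DoBoost, UpdateOneIter) and the squared-error objective are illustrative assumptions, not the project's actual declarations:

```cpp
#include <cstddef>
#include <vector>

// hypothetical gbm-layer interface: it consumes gradient statistics
// but knows nothing about the objective that produced them
class IGradBooster {
 public:
  // grad/hess: first- and second-order gradient statistics per instance
  virtual void DoBoost(const std::vector<float> &grad,
                       const std::vector<float> &hess) = 0;
  virtual ~IGradBooster() {}
};

// learner-layer responsibility: turn predictions and labels into
// gradients for a specific objective (squared error shown here),
// then hand them to the booster
void UpdateOneIter(const std::vector<float> &preds,
                   const std::vector<float> &labels,
                   IGradBooster *gbm) {
  std::vector<float> grad(preds.size()), hess(preds.size());
  for (size_t i = 0; i < preds.size(); ++i) {
    grad[i] = preds[i] - labels[i]; // d/dpred of 0.5 * (pred - label)^2
    hess[i] = 1.0f;                 // second derivative of squared error
  }
  gbm->DoBoost(grad, hess);
}
```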
File Naming Convention
- The project is templatized to make it easy to adjust the input data structure.
- .h files contain the data structures and interfaces needed to use the functions in that layer.
- -inl.hpp files are implementations of the interfaces, like the .cpp files in most projects.
- You only need to understand the interface file to understand the usage of that layer; the split is sketched below.
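A minimal sketch of the convention, with hypothetical file and type names chosen for illustration rather than taken from the repository:

```cpp
// updater.h -- interface and data structures only; this is all a user
// of the tree layer needs to read
template <typename FMatrix>
class IUpdater {
 public:
  virtual void Update(const FMatrix &fmat) = 0;
  virtual ~IUpdater() {}
};

// updater_colmaker-inl.hpp -- implementation of the interface, playing
// the role a .cpp file plays in an untemplatized project
template <typename FMatrix>
class ColMaker : public IUpdater<FMatrix> {
 public:
  virtual void Update(const FMatrix &fmat) {
    // ... construct trees column-by-column over fmat ...
  }
};
```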