xgboost: eXtreme Gradient Boosting
=======

An optimized general purpose gradient boosting (tree) library.

Contributors:
* Tianqi Chen, project creator
* Kailong Chen, contributed the regression module
* Bing Xu, contributed the Python interface and the Higgs example

Tutorial and Documentation: https://github.com/tqchen/xgboost/wiki

Features
=======
* Sparse feature format:
  - The sparse feature format allows easy handling of missing values and improves computation efficiency.
* Push the limit on single machine:
  - Efficient implementation that optimizes memory and computation.
* Speed: XGBoost is very fast.
  - In [demo/kaggle-higgs/speedtest.py](demo/kaggle-higgs/speedtest.py), on the Kaggle Higgs data it is faster than sklearn.ensemble.GradientBoostingClassifier (about 20 times faster on our machine using 4 threads).
* Layout of the gradient boosting algorithm supports user defined objectives.
* Python interface that works with numpy and scipy.sparse matrices; both points are illustrated in the sketch after this list.
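
The last two bullets can be made concrete with one short sketch. This is a minimal illustration rather than code from this repository: it assumes the Python wrapper exposes `xgboost.DMatrix` and `xgboost.train`, that `DMatrix` accepts a `scipy.sparse` CSR matrix (unstored entries are handled as missing by the sparse feature format), and that a user defined objective can be passed to `train` as a callback returning the gradient and hessian.

```python
import numpy as np
import scipy.sparse as sp
import xgboost as xgb

# Toy binary classification data in a sparse matrix; entries that are not
# stored are treated as missing values by the sparse feature format.
X = sp.csr_matrix(np.array([[1.0, 0.0, 2.0],
                            [0.0, 3.0, 0.0],
                            [4.0, 0.0, 1.0],
                            [0.0, 5.0, 6.0]]))
y = np.array([1, 0, 1, 0])
dtrain = xgb.DMatrix(X, label=y)

def logregobj(preds, dtrain):
    """User defined logistic loss: return per-example gradient and hessian."""
    labels = dtrain.get_label()
    preds = 1.0 / (1.0 + np.exp(-preds))  # raw margin -> probability
    grad = preds - labels
    hess = preds * (1.0 - preds)
    return grad, hess

params = {"max_depth": 3, "eta": 0.1, "nthread": 4}  # nthread uses the OpenMP implementation
bst = xgb.train(params, dtrain, num_boost_round=10, obj=logregobj)

# With a custom objective the booster outputs raw margin scores,
# so apply the sigmoid yourself to obtain probabilities.
prob = 1.0 / (1.0 + np.exp(-bst.predict(dtrain)))
```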

Supported key components
=======
* Gradient boosting models (see the parameter sketch after this list):
  - regression tree (GBRT)
  - linear model/lasso
* Objectives to support tasks:
  - regression
  - classification
* OpenMP implementation
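
Under the same assumptions as the sketch above, switching between these components is purely a matter of parameters. The parameter names used here (`gbtree`, `gblinear`, `reg:linear`, `binary:logistic`, `alpha`) are the conventional xgboost names and may differ slightly in this code base.

```python
import numpy as np
import xgboost as xgb

X = np.random.rand(100, 5)
y_reg = X[:, 0] * 2.0 + np.random.randn(100) * 0.1   # regression target
y_clf = (X[:, 1] > 0.5).astype(int)                  # classification target

# Gradient boosted regression trees (GBRT) for a regression task.
tree_params = {"booster": "gbtree",
               "objective": "reg:linear",  # squared error regression (reg:squarederror in later releases)
               "max_depth": 4, "eta": 0.1}
bst_tree = xgb.train(tree_params, xgb.DMatrix(X, label=y_reg), num_boost_round=20)

# Boosted linear model with L1 (lasso) regularization for a classification task.
linear_params = {"booster": "gblinear",
                 "objective": "binary:logistic",
                 "alpha": 0.1}  # alpha = L1 regularization weight
bst_linear = xgb.train(linear_params, xgb.DMatrix(X, label=y_clf), num_boost_round=20)
```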

Planned components
=======
* More objectives to support tasks:
  - ranking
  - matrix factorization
  - structured prediction

File extension convention
=======
* .h files are interfaces, utils and data structures, with detailed comments
* .cpp files are implementations that will be compiled, with fewer comments
* .hpp files are implementations that will be included by .cpp files, with fewer comments