xgboost: eXtreme Gradient Boosting

An efficient general purpose gradient boosting (tree) library.

Creator: Tianqi Chen

Documentation: https://github.com/tqchen/xgboost/wiki

Features

  • Sparse feature format:
    • The sparse feature format allows easy handling of missing values and improves computation efficiency.
  • Push the limit on a single machine:
    • An efficient implementation that optimizes both memory and computation.
  • The gradient boosting algorithm is laid out to support generic tasks; see the project wiki.
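As an illustration of how a sparse feature format can make missing values easy to handle, here is a small parser for the LibSVM-style text format (`label index:value index:value ...`) commonly used by sparse-aware boosting tools. This is a hedged sketch for explanation only, not xgboost's actual loader; the convention assumed here is that feature indices absent from a line are treated as missing rather than as zero.

```python
def parse_libsvm_line(line):
    """Parse one line of LibSVM-style sparse text: 'label idx:val idx:val ...'.

    Returns (label, features) where features maps feature index -> value.
    Indices not present in the dict are considered missing (illustrative
    assumption), which is what lets a tree learner route them separately
    instead of conflating them with zeros.
    """
    parts = line.split()
    label = float(parts[0])
    features = {}
    for token in parts[1:]:
        idx, val = token.split(":")
        features[int(idx)] = float(val)
    return label, features

# Example: only features 0 and 3 are stored; 1 and 2 are missing, not zero.
label, feats = parse_libsvm_line("1 0:1.5 3:2.0")
```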

Planned key components

  • Gradient boosting models:
    • regression tree (GBRT)
    • linear model/lasso
  • Objectives to support tasks:
    • regression
    • classification
    • ranking
    • matrix factorization
    • structured prediction
  • OpenMP implementation
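To make the core idea behind these components concrete, here is a minimal sketch of gradient boosting with regression trees, using depth-1 trees (stumps) on a single feature and squared-error loss. This is an illustrative toy under those assumptions, not xgboost's implementation: each round fits a stump to the negative gradient of the loss (for squared error, simply the residual) and adds it to the ensemble with a learning rate.

```python
import numpy as np

def fit_stump(x, residual):
    """Fit a depth-1 regression tree on a 1-D feature to the residual.

    Scans candidate thresholds and keeps the split with the lowest
    sum of squared errors; returns a predict function.
    """
    best = None
    order = np.argsort(x)
    xs, rs = x[order], residual[order]
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue
        thr = (xs[i] + xs[i - 1]) / 2.0
        left, right = rs[:i].mean(), rs[i:].mean()
        sse = ((rs[:i] - left) ** 2).sum() + ((rs[i:] - right) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, thr, left, right)
    _, thr, left, right = best
    return lambda q: np.where(q < thr, left, right)

def gbrt(x, y, rounds=100, lr=0.1):
    """Toy gradient-boosted regression: start from the mean prediction,
    then repeatedly fit a stump to the current residual (the negative
    gradient of squared error) and take a shrunken step."""
    pred = np.full(len(y), y.mean())
    for _ in range(rounds):
        stump = fit_stump(x, y - pred)
        pred = pred + lr * stump(x)
    return pred
```

Swapping the squared-error residual for the gradient of another loss (logistic for classification, a pairwise loss for ranking) is what lets the same boosting layout serve the different objectives listed above.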

File extension convention:

  • .h: interfaces, utilities, and data structures, with detailed comments
  • .cpp: implementations that will be compiled, with fewer comments
  • .hpp: implementations included by .cpp files, with fewer comments
