tqchen c3592dc06c Merge branch 'master' of ssh://github.com/tqchen/xgboost
Conflicts:
	regression/xgboost_reg_data.h
2014-04-18 17:46:44 -07:00

xgboost: eXtreme Gradient Boosting

A general-purpose gradient boosting (tree) library.

Authors:

  • Tianqi Chen, project creator
  • Kailong Chen, contributor of the regression module

Tutorial and Documentation: https://github.com/tqchen/xgboost/wiki

Features

  • Sparse feature format:
    • The sparse feature format allows easy handling of missing values and improves computation efficiency.
  • Push the limit on a single machine:
    • Efficient implementation that optimizes memory and computation.
  • The gradient boosting algorithm is laid out to support generic tasks; see the project wiki.
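
To illustrate why a sparse feature format makes missing values easy to handle, here is a minimal sketch. It assumes a row is stored as a list of (feature index, value) pairs; the names SparseEntry, SparseRow, GetFeature and SparseDot are hypothetical and are not taken from the xgboost sources.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sparse entry: one present feature of one instance.
struct SparseEntry {
  unsigned index;  // feature index
  float value;     // feature value
};

// A row stores only the features that are present; anything absent is
// treated as missing, so no special marker value is needed.
using SparseRow = std::vector<SparseEntry>;

// Look up a feature by index.
// Returns true and writes *out when the feature is present,
// returns false (leaving *out untouched) when it is missing.
inline bool GetFeature(const SparseRow &row, unsigned index, float *out) {
  for (const SparseEntry &e : row) {
    if (e.index == index) {
      *out = e.value;
      return true;
    }
  }
  return false;
}

// Dot product with a dense weight vector. Only stored entries are visited,
// so the cost scales with the number of present features, not the dimension.
inline float SparseDot(const SparseRow &row, const std::vector<float> &w) {
  float sum = 0.0f;
  for (const SparseEntry &e : row) {
    assert(e.index < w.size());
    sum += w[e.index] * e.value;
  }
  return sum;
}
```

With a row holding features 0 and 3 only, GetFeature reports feature 1 as missing, and SparseDot skips it without any branching on a sentinel value.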

Supported key components

  • Gradient boosting models:
    • regression tree (GBRT)
    • linear model/lasso
  • Objectives to support tasks:
    • regression
    • classification
  • OpenMP implementation
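
As a rough illustration of how objectives and OpenMP can fit together, the sketch below computes per-instance gradient statistics for a regression (squared error) and a classification (logistic) objective in a parallel loop. It assumes that the booster consumes first- and second-order derivatives of the loss; that assumption, and the names GradPair, Loss and ComputeGradients, are hypothetical and not taken from the xgboost sources.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical pair of first- and second-order loss derivatives.
struct GradPair {
  float grad;
  float hess;
};

// Two example objectives: squared error for regression,
// logistic loss for binary classification.
enum class Loss { kSquared, kLogistic };

// Compute derivatives of the loss with respect to each prediction.
//   squared:  l = (pred - label)^2 / 2   -> grad = pred - label, hess = 1
//   logistic: p = sigmoid(pred)          -> grad = p - label,    hess = p * (1 - p)
inline std::vector<GradPair> ComputeGradients(const std::vector<float> &preds,
                                              const std::vector<float> &labels,
                                              Loss loss) {
  assert(preds.size() == labels.size());
  std::vector<GradPair> out(preds.size());
  const long n = static_cast<long>(preds.size());
  // Instances are independent, so the loop parallelizes directly.
  // Without OpenMP enabled at compile time the pragma is ignored
  // and the loop runs serially.
  #pragma omp parallel for schedule(static)
  for (long i = 0; i < n; ++i) {
    if (loss == Loss::kSquared) {
      out[i].grad = preds[i] - labels[i];
      out[i].hess = 1.0f;
    } else {
      const float p = 1.0f / (1.0f + std::exp(-preds[i]));
      out[i].grad = p - labels[i];
      out[i].hess = p * (1.0f - p);
    }
  }
  return out;
}
```

Keeping the objective behind a single gradient routine is one way a booster can stay task-agnostic: adding a new task then means adding a new loss case rather than changing the tree or linear learner.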

Planned components

  • More objectives to support tasks:
    • ranking
    • matrix factorization
    • structured prediction

File extension convention

  • .h files are interfaces, utilities and data structures, with detailed comments;
  • .cpp files are implementations that will be compiled, with fewer comments;
  • .hpp files are implementations that will be included by .cpp files, with fewer comments.
Description
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow