eXtreme Gradient Boosting
An optimized, general-purpose gradient boosting library. The library is parallelized and also provides an optimized distributed version.
It implements machine learning algorithms under the Gradient Boosting framework, including the Generalized Linear Model (GLM) and Gradient Boosted Decision Trees (GBDT). XGBoost also runs distributed and scales to terascale data.
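To give a feel for the library before the details below, here is a minimal sketch of training a GBDT model with the Python package; the data is synthetic and the parameter values are illustrative assumptions, not recommendations.

```python
# Minimal training sketch with the xgboost Python package. The data is
# synthetic and the parameter values are illustrative, not tuned.
import numpy as np
import xgboost as xgb

# Toy binary classification problem.
X = np.random.rand(100, 10)
y = (X[:, 0] > 0.5).astype(int)

dtrain = xgb.DMatrix(X, label=y)
params = {'objective': 'binary:logistic', 'max_depth': 3, 'eta': 0.3}
bst = xgb.train(params, dtrain, num_boost_round=10)

preds = bst.predict(dtrain)  # predicted probabilities in [0, 1]
```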
XGBoost is part of the Distributed Machine Learning Common (DMLC) projects.
Contents
- What's New
- Version
- Documentation
- Build Instructions
- Features
- Distributed XGBoost
- Use Cases
- Bug Reporting
- Contributing to XGBoost
- Committers and Contributors
- License
- XGBoost in Graphlab Create
What's New
- XGBoost helped Owen Zhang win the Avito Context Ad Click competition. Check out the interview from Kaggle.
- XGBoost helped Chenglong Chen win the Kaggle CrowdFlower competition. Check out the winning solution.
- XGBoost-0.4 release, see CHANGES.md
- XGBoost helped three champion teams win the WWW2015 Microsoft Malware Classification Challenge (BIG 2015). Check out the winning solution.
- External Memory Version
Version
- Current version: xgboost-0.4
- Change log
- This version is compatible with the 0.3x versions
Features
- Easily accessible through the CLI, Python, R, and Julia
- It's fast! See the benchm-ml numbers comparing XGBoost, H2O, Spark, and R
- Memory efficient: handles sparse matrices and supports external memory (see the sketch after this list)
- Accurate predictions, used extensively by data scientists and Kagglers
- The distributed version runs on Hadoop (YARN), MPI, SGE, etc., and scales to billions of examples
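As an illustration of the sparse-matrix and external-memory support mentioned above, the sketch below builds a DMatrix from a SciPy CSR matrix and shows the cache-file suffix used for external-memory input; the file names are illustrative assumptions.

```python
# Sketch of sparse and external-memory input (file names below are
# illustrative assumptions, not files shipped with xgboost).
import numpy as np
import scipy.sparse
import xgboost as xgb

# Sparse input: DMatrix accepts a SciPy CSR matrix directly, so
# zero entries never need to be materialized.
X = scipy.sparse.csr_matrix(np.array([[1.0, 0.0, 2.0],
                                      [0.0, 3.0, 0.0],
                                      [4.0, 0.0, 0.0],
                                      [0.0, 5.0, 6.0]]))
y = np.array([1, 0, 1, 0])
dtrain = xgb.DMatrix(X, label=y)

# External memory: appending '#<cache name>' to a libsvm file path
# makes xgboost stream the data through an on-disk cache instead of
# loading everything into RAM.
# dtrain = xgb.DMatrix('train.libsvm#dtrain.cache')

params = {'objective': 'binary:logistic', 'max_depth': 2, 'eta': 1.0}
bst = xgb.train(params, dtrain, num_boost_round=2)
```

The same DMatrix abstraction backs both paths, so the training code does not change between in-memory and external-memory data.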
Bug Reporting
- To report a bug, please use the xgboost/issues page.
- For general questions, or to share your experience using XGBoost, please use the XGBoost User Group.
Contributing to XGBoost
XGBoost has been developed and is used by a group of active community members. Everyone is more than welcome to contribute; it is a great way to make the project better and more accessible to more users.
- Check out the Feature Wish List to see what can be improved, or open an issue if there is something you want.
- Contribute to the documents and examples to share your experience with other users.
- Please add your name to CONTRIBUTORS.md after your patch has been merged.
License
© Contributors, 2015. Licensed under the Apache-2.0 license.
XGBoost in Graphlab Create
- XGBoost is adopted as part of the boosted trees toolkit in Graphlab Create (GLC). Graphlab Create is a powerful Python toolkit that lets you do data manipulation, graph processing, hyper-parameter search, and visualization of terabyte-scale data in one framework. Try Graphlab Create.
- A nice blog post by Jay Gu on using GLC boosted trees to solve the Kaggle bike sharing challenge.