update note

tqchen 2015-01-19 08:34:35 -08:00
parent e5c609271f
commit f0a412d224
3 changed files with 19 additions and 4 deletions


@ -20,3 +20,9 @@ xgboost-0.3
* Linear booster is now parallelized, using parallel coordinate descent.
* Add [Code Guide](src/README.md) for customizing the objective function and evaluation metric
* Add R module
in-progress version
=====
* Distributed version
* Feature importance visualization in R module, thanks to Michael Benesty
* Predict leaf index


@ -1,6 +1,7 @@
xgboost: eXtreme Gradient Boosting
======
An optimized general purpose gradient boosting library. The library is parallelized using OpenMP. It implements machine learning algorithms under the gradient boosting framework, including the generalized linear model and gradient boosted regression trees.
An optimized general purpose gradient boosting library. The library is parallelized, and also provides an optimized distributed version.
It implements machine learning algorithms under the gradient boosting framework, including the generalized linear model and gradient boosted regression trees.
Contributors: https://github.com/tqchen/xgboost/graphs/contributors
@ -10,6 +11,8 @@ Questions and Issues: [https://github.com/tqchen/xgboost/issues](https://github.
Examples Code: [Learning to use xgboost by examples](demo)
Distributed Version: [Distributed XGBoost](multi-node)
Notes on the Code: [Code Guide](src)
Learning about the model: [Introduction to Boosted Trees](http://homes.cs.washington.edu/~tqchen/pdf/BoostedTree.pdf)
@ -19,11 +22,14 @@ Learning about the model: [Introduction to Boosted Trees](http://homes.cs.washin
What's New
=====
* [Distributed XGBoost](multi-node) is now available to scale to even larger problems
* [Distributed XGBoost](multi-node) is now available!
* New features in the latest changes :)
- Distributed version that scales xgboost to even larger problems on a cluster
- Feature importance visualization in R module, thanks to Michael Benesty
- Predict leaf index, see [demo/guide-python/pred_leaf_indices.py](demo/guide-python/pred_leaf_indices.py); a minimal sketch also follows this list
* XGBoost wins [Tradeshift Text Classification](https://kaggle2.blob.core.windows.net/forum-message-attachments/60041/1813/TradeshiftTextClassification.pdf?sv=2012-02-12&se=2015-01-02T13%3A55%3A16Z&sr=b&sp=r&sig=5MHvyjCLESLexYcvbSRFumGQXCS7MVmfdBIY3y01tMk%3D)
* XGBoost wins [HEP meets ML Award in Higgs Boson Challenge](http://atlas.ch/news/2014/machine-learning-wins-the-higgs-challenge.html)
* Thanks to Bing Xu, [XGBoost.jl](https://github.com/antinucleon/XGBoost.jl) allows you to use xgboost from Julia
* See the updated [demo folder](demo) for feature walkthrough
* Thanks to Tong He, the new [R package](R-package) is available
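
As a rough illustration of the leaf-index prediction mentioned above, here is a minimal Python sketch; the data, parameters, and shapes are illustrative assumptions, and the demo script linked above remains the authoritative example. It relies on the `pred_leaf` option of `Booster.predict`:

```python
# Minimal sketch: predicting the leaf index each sample falls into, per tree.
# Data and parameters are illustrative; see demo/guide-python/pred_leaf_indices.py.
import numpy as np
import xgboost as xgb

# Tiny synthetic binary classification problem, purely for illustration.
X = np.random.rand(100, 10)
y = (X[:, 0] > 0.5).astype(int)
dtrain = xgb.DMatrix(X, label=y)

params = {'objective': 'binary:logistic', 'max_depth': 3, 'eta': 0.3}
bst = xgb.train(params, dtrain, num_boost_round=5)

# With pred_leaf=True, each row gives, for every boosting round,
# the index of the leaf that the sample ends up in.
leaf_indices = bst.predict(dtrain, pred_leaf=True)
print(leaf_indices.shape)  # expected: (100, 5)
```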
Features
@ -35,6 +41,9 @@ Features
* Speed: XGBoost is very fast
- In [demo/kaggle-higgs/speedtest.py](demo/kaggle-higgs/speedtest.py), on the Kaggle Higgs data it is faster (on our machine, 20 times faster using 4 threads) than sklearn.ensemble.GradientBoostingClassifier
* The layout of the gradient boosting algorithm supports user-defined objective functions (a minimal sketch follows this list)
* Distributed and portable
- The distributed version of xgboost is highly portable and can be used on different platforms
- It inherits all the optimizations made in single-machine mode, and maximally utilizes resources through both multi-threading and distributed computing.
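
For the user-defined objective support mentioned above, the following Python sketch shows the general pattern; the function and parameter names are illustrative assumptions, and demo/guide-python/custom_objective.py is the authoritative example. A custom objective is a callback that returns the gradient and hessian of the loss for each prediction:

```python
# Minimal sketch of a user-defined objective (logistic loss), following the
# pattern used by the custom-objective demo; data and parameters are illustrative.
import numpy as np
import xgboost as xgb

def logregobj(preds, dtrain):
    """Return the gradient and hessian of the logistic loss for each sample."""
    labels = dtrain.get_label()
    preds = 1.0 / (1.0 + np.exp(-preds))  # map raw margins to probabilities
    grad = preds - labels                 # first-order gradient
    hess = preds * (1.0 - preds)          # second-order gradient (hessian)
    return grad, hess

# Tiny synthetic binary classification problem, purely for illustration.
X = np.random.rand(100, 10)
y = (X[:, 0] > 0.5).astype(int)
dtrain = xgb.DMatrix(X, label=y)

# The custom objective is passed through the `obj` argument of xgb.train.
bst = xgb.train({'max_depth': 3, 'eta': 0.3}, dtrain, num_boost_round=5, obj=logregobj)
```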
Build
=====


@ -23,7 +23,7 @@ Notes
* The multi-threading nature of xgboost is inherited in distributed mode
- This means xgboost efficiently uses all the threads on one machine, and communicates only between machines
- Remember to run one xgboost process per machine; this will give you maximum speedup
* For more information about rabit and how it works, see the [tutorial](https://github.com/tqchen/rabit/tree/master/guide)
* For more information about rabit and how it works, see [Rabit's Tutorial](https://github.com/tqchen/rabit/tree/master/guide)
Solvers
=====