
Distributed XGBoost Training

This is a tutorial on distributed XGBoost training. Currently, XGBoost supports distributed training via the CLI program with a configuration file. There are also plans to add distributed Python and other language bindings; please open an issue if you are interested in contributing.
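For reference, the CLI program reads its training parameters from a plain-text configuration file of key = value pairs. The following is only a minimal sketch; the file names (train.txt, test.txt, xgb.model) and parameter values are illustrative placeholders, not part of this demo.

```
# minimal sketch of a CLI configuration file (illustrative values)
booster = gbtree
objective = binary:logistic
eta = 0.3
max_depth = 6
num_round = 10

# data paths; distributed runs typically point these at HDFS or S3 URIs
data = "train.txt"
eval[test] = "test.txt"
model_out = "xgb.model"
```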

Build XGBoost with Distributed Filesystem Support

To use distributed XGBoost, you only need to turn on the options to build with distributed filesystems (HDFS or S3) in xgboost/make/config.mk.
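As an illustration, the relevant switches in make/config.mk look roughly like the sketch below, assuming the USE_HDFS and USE_S3 options in that file; enable only the filesystems you actually need.

```
# in xgboost/make/config.mk (sketch; enable only what you need)
USE_HDFS = 1   # build with HDFS support
USE_S3 = 1     # build with Amazon S3 support
```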

Step by Step Tutorial on AWS

Check out this tutorial for running distributed XGBoost on AWS.
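To give a rough sense of what a distributed launch looks like, the sketch below submits the CLI program through the dmlc-submit tracker script from dmlc-core on a YARN cluster. The script path, cluster type, worker count, configuration file name, and S3 bucket are assumptions for illustration; the AWS tutorial above covers the actual setup.

```
# sketch: submit a distributed job via the dmlc-core tracker (illustrative paths and values)
../../dmlc-core/tracker/dmlc-submit --cluster=yarn --num-workers=2 \
    xgboost mushroom.aws.conf \
    data=s3://mybucket/xgb-demo/train \
    eval[test]=s3://mybucket/xgb-demo/test \
    model_dir=s3://mybucket/xgb-demo/model
```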

Model Analysis

XGBoost models are exchangeable across all bindings and platforms. This means you can use Python or R to analyze the learned model and make predictions. For example, you can use plot_model.ipynb to visualize the learned model.
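As a small illustration, the Python sketch below loads a saved model, runs prediction, and visualizes it. The file names (xgb.model, test.txt) are placeholders, and the plotting helpers assume matplotlib and graphviz are installed.

```python
import xgboost as xgb

# load a model trained elsewhere (e.g. by the distributed CLI job); placeholder file name
bst = xgb.Booster(model_file='xgb.model')

# run prediction on a local dataset in LibSVM format (placeholder path)
dtest = xgb.DMatrix('test.txt')
preds = bst.predict(dtest)
print(preds[:5])

# inspect the learned model
xgb.plot_importance(bst)         # feature importance bar chart (needs matplotlib)
xgb.plot_tree(bst, num_trees=0)  # draw the first tree (needs graphviz)
```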