[GPU Plugin] Fast histogram speed improvements. Updated benchmarks. (#2258)

Rory Mitchell
2017-05-09 04:21:38 +12:00
committed by Tianqi Chen
parent 98ea461532
commit 6bf968efe6
12 changed files with 433 additions and 241 deletions


@@ -2,10 +2,21 @@
This plugin adds GPU accelerated tree construction algorithms to XGBoost.
## Usage
Specify the 'updater' parameter as one of the following algorithms.
updater | Description
--- | ---
grow_gpu | The standard XGBoost tree construction algorithm. Performs exact search for splits. Slower and uses considerably more memory than 'grow_gpu_hist'
grow_gpu_hist | Equivalent to the XGBoost fast histogram algorithm. Faster and uses considerably less memory. Splits may be less accurate.
### Algorithms
| updater | Description |
| --- | --- |
| grow_gpu | The standard XGBoost tree construction algorithm. Performs exact search for splits. Slower and uses considerably more memory than 'grow_gpu_hist'. |
| grow_gpu_hist | Equivalent to the XGBoost fast histogram algorithm. Faster and uses considerably less memory. Splits may be less accurate. |
### Supported parameters
| parameter | grow_gpu | grow_gpu_hist |
| --- | --- | --- |
| subsample | ✔ | ✔ |
| colsample_bytree | ✔ | ✔ |
| colsample_bylevel | ✔ | ✔ |
| max_bin | ✖ | ✔ |
| gpu_id | ✔ | ✔ |
All algorithms currently use only a single GPU. The device ordinal can be selected using the 'gpu_id' parameter, which defaults to 0.
@@ -14,13 +25,32 @@ This plugin currently works with the CLI version and python version.
Python example:
```python
param['gpu_id'] = 1
param['updater'] = 'grow_gpu'
param['max_bin'] = 16
param['updater'] = 'grow_gpu_hist'
```
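The parameter table above can be turned into a small guard before training. The following is only a sketch: `make_gpu_params` and `UNSUPPORTED` are hypothetical names that are not part of the plugin, and the only rule encoded is the one the table states (`max_bin` is not supported by `grow_gpu`). Other XGBoost parameters are passed through unchanged.

```python
# Hypothetical helper, not part of the plugin: builds a parameter dict and
# rejects combinations that the "Supported parameters" table marks with a cross.
UNSUPPORTED = {
    "grow_gpu": {"max_bin"},   # per the table: max_bin is hist-only
    "grow_gpu_hist": set(),
}


def make_gpu_params(updater, **options):
    """Return an XGBoost-style param dict for one of the GPU updaters."""
    if updater not in UNSUPPORTED:
        raise ValueError("unknown GPU updater: %r" % updater)
    rejected = sorted(set(options) & UNSUPPORTED[updater])
    if rejected:
        raise ValueError("%s does not support: %s" % (updater, ", ".join(rejected)))
    # gpu_id defaults to 0, matching the documented default device ordinal.
    param = {"updater": updater, "gpu_id": 0}
    param.update(options)
    return param


# Same settings as the example above: second GPU, 16 histogram bins.
param = make_gpu_params("grow_gpu_hist", gpu_id=1, max_bin=16)
```

On a build with the plugin enabled, the resulting dict could then be passed to `xgboost.train` together with a `DMatrix` and the number of boosting rounds.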
## Benchmarks
To run benchmarks on synthetic data for binary classification:
```bash
$ python benchmark/benchmark.py
```
[See here](http://dmlc.ml/2016/12/14/GPU-accelerated-xgboost.html) for performance benchmarks of the 'grow_gpu' updater.
Training time on 1,000,000 rows x 50 columns with 500 boosting iterations on an i7-6700K CPU @ 4.00GHz and a Pascal Titan X:
| Updater | Time (s) |
| --- | --- |
| grow_gpu_hist | 11.09 |
| grow_fast_histmaker (histogram XGBoost - CPU) | 41.75 |
| grow_gpu | 193.90 |
| grow_colmaker (standard XGBoost - CPU) | 720.12 |
[See here](http://dmlc.ml/2016/12/14/GPU-accelerated-xgboost.html) for additional performance benchmarks of the 'grow_gpu' updater.
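To compare like with like, each GPU updater can be set against its CPU counterpart. The snippet below is a small illustration that simply divides the times from the table above; the `times` dict and the `speedup` helper are names introduced here for the example only.

```python
# Times in seconds, copied from the benchmark table above.
times = {
    "grow_gpu_hist": 11.09,
    "grow_fast_histmaker": 41.75,
    "grow_gpu": 193.90,
    "grow_colmaker": 720.12,
}


def speedup(baseline, candidate):
    """How many times faster `candidate` is than `baseline`."""
    return times[baseline] / times[candidate]


# Histogram algorithms: GPU vs CPU.
hist_speedup = speedup("grow_fast_histmaker", "grow_gpu_hist")
# Exact algorithms: GPU vs CPU.
exact_speedup = speedup("grow_colmaker", "grow_gpu")

print("histogram: %.1fx, exact: %.1fx" % (hist_speedup, exact_speedup))
```

On these figures both GPU updaters come out a little under 4x faster than their CPU equivalents, while `grow_gpu_hist` is far faster than either exact algorithm.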
## Test
To run tests:
```bash
$ python -m nose test/
```
## Dependencies
A CUDA-capable GPU with compute capability 3.5 or higher (the algorithm depends on shuffle and vote instructions introduced in Kepler).
@@ -48,13 +78,17 @@ $ cmake .. -G"Visual Studio 12 2013 Win64" -DPLUGIN_UPDATER_GPU=ON -DCUB_DIRECTO
```
You may also be able to use a later version of Visual Studio, depending on whether the CUDA toolkit supports it.
On an linux cmake will generate a Makefile in the build directory. Invoking the command 'make' from this directory will build the project. If the build fails try invoking make again. There can sometimes be problems with the order items are built.
On Linux, cmake will generate a Makefile in the build directory. Invoking the command 'make' from this directory will build the project. If the build fails, try invoking make again, as there can sometimes be problems with the order in which items are built.
On Windows, cmake will generate an xgboost.sln solution file in the build directory. Build this solution in release mode. This is also a good time to check that it is being built as x64. If not, make sure the cmake generator is set correctly.
The build process generates an xgboost library and executable as normal, but with the GPU tree construction algorithms included.
## Changelog
##### 2017/5/5
* Histogram performance improvements
* Fix gcc build issues
##### 2017/4/25
* Add fast histogram algorithm
* Fix Linux build