[GPU-Plugin] Multi-GPU for grow_gpu_hist histogram method using NVIDIA NCCL. (#2395)
committed by Rory Mitchell
parent e24f25e0c6
commit 41efe32aa5
@@ -17,8 +17,11 @@
colsample_bytree | ✔ | ✔ |
colsample_bylevel | ✔ | ✔ |
max_bin | ✖ | ✔ |
gpu_id | ✔ | ✔ |
n_gpus | ✖ | ✔ |

The device ordinal can be selected using the 'gpu_id' parameter, which defaults to 0.
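As a sketch, assuming the usual Python-package parameter-dictionary interface (the 'objective' and 'updater' keys and the commented-out training call are assumptions for illustration, not taken from this README), selecting the device could look like:

```python
# Hypothetical parameter dictionary for the XGBoost Python package.
# Only 'gpu_id' and 'n_gpus' come from the table above; the other keys
# are illustrative assumptions.
params = {
    "objective": "binary:logistic",
    "updater": "grow_gpu_hist",  # assumed name of the GPU histogram updater
    "gpu_id": 1,                 # device ordinal, defaults to 0
    "n_gpus": -1,                # -1 = use all visible GPUs
}

# Training would then be e.g.:
# bst = xgboost.train(params, dtrain, num_boost_round=100)
print(params["gpu_id"])  # -> 1
```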

Multiple GPUs can be used with the grow_gpu_hist method via the 'n_gpus' parameter, which defaults to -1 (use all visible GPUs). If 'gpu_id' is non-zero, the device order is (gpu_id + i) mod n_visible_devices for i = 0 to n_gpus - 1. As with GPU vs. CPU, multiple GPUs will not always be faster than a single GPU, because PCIe bus bandwidth can limit performance. For example, when n_features * n_bins * 2^depth divided by the time of each round/iteration becomes comparable to the real PCIe 16x bus bandwidth (on the order of 4 GB/s to 10 GB/s), AllReduce dominates the runtime and additional GPUs no longer improve performance. CPU overhead between GPU calls can also limit the usefulness of multiple GPUs.
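The wrap-around device ordering and the rough AllReduce payload arithmetic described above can be sketched as follows (an illustrative helper, not plugin source; the 8-byte histogram entry size is an assumption):

```python
def device_order(gpu_id, n_gpus, n_visible_devices):
    """Device ordinals used, per the rule above:
    (gpu_id + i) mod n_visible_devices for i = 0 .. n_gpus - 1."""
    if n_gpus == -1:  # -1 means "use all visible GPUs"
        n_gpus = n_visible_devices
    return [(gpu_id + i) % n_visible_devices for i in range(n_gpus)]

# With 4 visible devices and gpu_id=2, all four GPUs are used in the order:
print(device_order(2, -1, 4))  # -> [2, 3, 0, 1]

# Rough per-round AllReduce payload from the rule of thumb above
# (8 bytes per histogram entry is an assumption for the example):
n_features, n_bins, depth = 100, 256, 6
payload_bytes = n_features * n_bins * 2 ** depth * 8
print(payload_bytes)  # -> 13107200, i.e. ~13 MB moved over PCIe each round
```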

This plugin currently works with the CLI version and the Python package.
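For the CLI version, the GPU parameters would go in the usual configuration file. A hedged sketch (the file layout and the other keys follow the standard XGBoost CLI demos and are assumptions, not taken from this README):

```
# hypothetical train.conf fragment
booster = gbtree
objective = binary:logistic
updater = grow_gpu_hist
gpu_id = 0
n_gpus = -1
```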
@@ -54,29 +57,38 @@ $ python -m nose test/python/
## Dependencies
A CUDA-capable GPU with compute capability 3.5 or higher (the algorithm depends on shuffle and vote instructions introduced in Kepler).
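The requirement can be expressed as a simple tuple comparison (an illustrative check, not part of the plugin; the example device names and their compute capabilities are stated to the best of our knowledge, not taken from this README):

```python
# Minimum compute capability required by this plugin (Kepler-class, per above).
REQUIRED_CC = (3, 5)

def meets_requirement(major, minor):
    """Tuple comparison handles cases like (3, 7) >= (3, 5) correctly."""
    return (major, minor) >= REQUIRED_CC

print(meets_requirement(3, 5))  # e.g. Tesla K40 (CC 3.5) -> True
print(meets_requirement(3, 0))  # early Kepler GK104 (CC 3.0) -> False
```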

Building the plug-in requires CUDA Toolkit 7.5 or later (https://developer.nvidia.com/cuda-downloads).

submodule: The plugin also depends on CUB 1.6.4 (https://nvlabs.github.io/cub/), a header-only CUDA library that provides sort/reduce/scan primitives.

submodule: NVIDIA NCCL from https://github.com/NVIDIA/nccl, with a Windows port available at git@github.com:h2oai/nccl.git
## Build
### Using cmake
To use the plugin, xgboost must be built with the option PLUGIN_UPDATER_GPU=ON. CMake will prepare a build system depending on which platform you are on.

On Linux, from the xgboost directory:
```bash
$ mkdir build
$ cd build
$ cmake .. -DPLUGIN_UPDATER_GPU=ON
$ make -j
```
If 'make' fails, try invoking it again; there can occasionally be problems with the order in which items are built.
On Windows, list the generator options available to cmake and choose one with [arch] replaced by Win64:
```bash
cmake -help
```
Then run cmake as:
```bash
$ mkdir build
$ cd build
$ cmake .. -G"Visual Studio 14 2015 Win64" -DPLUGIN_UPDATER_GPU=ON
```
Cmake will generate an xgboost.sln solution file in the build directory. Build this solution in release mode as an x64 build.

Visual Studio Community 2015, which is supported by the CUDA toolkit (http://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/#axzz4isREr2nS), can be downloaded from https://my.visualstudio.com/Downloads?q=Visual%20Studio%20Community%202015 . You may also be able to use a later version of Visual Studio, depending on whether the CUDA toolkit supports it. Note that MinGW cannot be used with CUDA.
### Using make
The usual 'make' flow is now also supported for building the GPU-enabled tree construction plugin. It is currently only tested on Linux. From the xgboost directory:
@@ -84,9 +96,6 @@
```bash
# make sure CUDA SDK bin directory is in the 'PATH' env variable
$ make PLUGIN_UPDATER_GPU=ON
```
### For Developers!
Some of the code base inside the GPU plugin now has googletest unit tests under 'tests/'. They can be enabled and run along with the other unit tests under '<xgboostRoot>/tests/cpp' using:
@@ -98,10 +107,17 @@
```bash
$ make PLUGIN_UPDATER_GPU=ON GTEST_PATH=${CACHE_PREFIX} test
```
## Changelog

##### 2017/6/5

* Multi-GPU support for histogram method using NVIDIA NCCL.

##### 2017/5/31

* Faster version of the grow_gpu plugin
* Added support for building gpu plugin through 'make' flow too

##### 2017/5/19

* Further performance enhancements for histogram method.

##### 2017/5/5

* Histogram performance improvements
* Fix gcc build issues
@@ -115,10 +131,19 @@ $ make PLUGIN_UPDATER_GPU=ON GTEST_PATH=${CACHE_PREFIX} test
[Mitchell, Rory, and Eibe Frank. Accelerating the XGBoost algorithm using GPU computing. No. e2911v1. PeerJ Preprints, 2017.](https://peerj.com/preprints/2911/)
## Author
Rory Mitchell,
Jonathan C. McKinney,
Shankara Rao Thejaswi Nanditale,
Vinay Deshpande,
and the rest of the H2O.ai and NVIDIA team.

Please report bugs to the xgboost/issues page.