Clarify multi-GPU training, binary wheels, Pandas integration (#3581)
* Clarify multi-GPU training, binary wheels, Pandas integration * Add a note about multi-GPU on gpu/index.rst
This commit is contained in:
parent 7300002516
commit 4202332783
@@ -4,15 +4,17 @@ Installation Guide

.. note:: Pre-built binary wheel for Python

- If you are planning to use Python on a Linux system, consider installing XGBoost from a pre-built binary wheel. The wheel is available from the Python Package Index (PyPI). You may download and install it by running
+ If you are planning to use Python, consider installing XGBoost from a pre-built binary wheel, available from the Python Package Index (PyPI). You may download and install it by running

.. code-block:: bash

- # Ensure that you are downloading xgboost-{version}-py2.py3-none-manylinux1_x86_64.whl
+ # Ensure that you are downloading one of the following:
+ # * xgboost-{version}-py2.py3-none-manylinux1_x86_64.whl
+ # * xgboost-{version}-py2.py3-none-win_amd64.whl
  pip3 install xgboost

- * This package will support GPU algorithms (`gpu_exact`, `gpu_hist`) on machines with NVIDIA GPUs.
- * Currently, PyPI has a binary wheel only for 64-bit Linux.
+ * The binary wheel will support GPU algorithms (`gpu_exact`, `gpu_hist`) on machines with NVIDIA GPUs. **However, it will not support multi-GPU training; only a single GPU will be used.** To enable multi-GPU training, download and install the binary wheel from `this page <https://s3-us-west-2.amazonaws.com/xgboost-wheels/list.html>`_.
+ * Currently, we provide binary wheels for 64-bit Linux and Windows.
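To sanity-check which wheel you downloaded, the platform tag can be read straight off the filename, since wheel filenames follow the ``name-version-pythontag-abitag-platformtag.whl`` convention. A minimal pure-Python sketch (the helper name and the ``0.80`` version string are illustrative only, not part of XGBoost):

```python
def platform_tag(wheel_filename):
    # Wheel filenames follow: name-version-pythontag-abitag-platformtag.whl,
    # so the platform tag is the last dash-separated field of the stem.
    stem = wheel_filename[: -len(".whl")]
    return stem.split("-")[-1]

# The two wheels listed above (version string is a placeholder):
print(platform_tag("xgboost-0.80-py2.py3-none-manylinux1_x86_64.whl"))  # manylinux1_x86_64
print(platform_tag("xgboost-0.80-py2.py3-none-win_amd64.whl"))          # win_amd64
```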

****************************
Building XGBoost from source
@@ -187,13 +189,15 @@ After the build process successfully ends, you will find a ``xgboost.dll`` library

Unofficial Windows binaries and instructions on how to use them are hosted on `Guido Tapia's blog <http://www.picnet.com.au/blogs/guido/post/2016/09/22/xgboost-windows-x64-binaries-for-download/>`_.

.. _build_gpu_support:

Building with GPU support
=========================
XGBoost can be built with GPU support for both Linux and Windows using CMake. GPU support works with the Python package as well as the CLI version. See `Installing R package with GPU support`_ for special instructions for R.

An up-to-date version of the CUDA toolkit is required.
- From the command line on Linux starting from the xgboost directory:
+ From the command line on Linux starting from the XGBoost directory:

.. code-block:: bash
@@ -202,9 +206,16 @@ From the command line on Linux starting from the xgboost directory:

  cmake .. -DUSE_CUDA=ON
  make -j

- .. note:: Windows requirements for GPU build
+ .. note:: Enabling multi-GPU training

- Only Visual C++ 2015 or 2013 with CUDA v8.0 were fully tested. Either install Visual C++ 2015 Build Tools separately, or as a part of Visual Studio 2015. If you already have Visual Studio 2017, the Visual C++ 2015 Toolchain component has to be installed using the VS 2017 Installer. Likely, you would need to use the VS2015 x64 Native Tools command prompt to run the cmake commands given below. In some situations, however, things run just fine from the MSYS2 bash command line.
+ By default, multi-GPU training is disabled and only a single GPU will be used. To enable multi-GPU training, set the option ``USE_NCCL=ON``. Multi-GPU training depends on NCCL2, available at `this link <https://developer.nvidia.com/nccl>`_. Since NCCL2 is only available for Linux machines, **multi-GPU training is available only for Linux**.

+ .. code-block:: bash
+
+   mkdir build
+   cd build
+   cmake .. -DUSE_CUDA=ON -DUSE_NCCL=ON
+   make -j

On Windows, see what options for generators you have for CMake, and choose one with ``[arch]`` replaced with Win64:
@@ -8,7 +8,7 @@ To install GPU support, check out the :doc:`/build`.

*********************************************
CUDA Accelerated Tree Construction Algorithms
*********************************************
- This plugin adds GPU accelerated tree construction and prediction algorithms to XGBoost.
+ Tree construction (training) and prediction can be accelerated with CUDA-capable GPUs.

Usage
=====
@@ -59,7 +59,11 @@ The device ordinal can be selected using the ``gpu_id`` parameter, which defaults to 0.

Multiple GPUs can be used with the ``gpu_hist`` tree method using the ``n_gpus`` parameter, which defaults to 1. If this is set to -1, all available GPUs will be used. If ``gpu_id`` is specified as non-zero, the GPU device order is ``(gpu_id + i) % n_visible_devices`` for ``i=0`` to ``n_gpus-1``. As with GPU vs. CPU, multi-GPU will not always be faster than a single GPU, because PCI bus bandwidth can limit performance.
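The device-selection rule above can be sketched in plain Python (the helper is hypothetical, for illustration only; no GPU is needed to follow the arithmetic):

```python
def devices_used(gpu_id, n_gpus, n_visible_devices):
    # Mirror the documented rule: device ordinals are
    # (gpu_id + i) % n_visible_devices for i = 0 .. n_gpus-1.
    if n_gpus == -1:  # -1 means "use all available GPUs"
        n_gpus = n_visible_devices
    return [(gpu_id + i) % n_visible_devices for i in range(n_gpus)]

# On a 4-GPU machine, gpu_id=1 with n_gpus=3 selects devices 1, 2, 3:
print(devices_used(1, 3, 4))   # [1, 2, 3]
# gpu_id=3 wraps around to device 0:
print(devices_used(3, 2, 4))   # [3, 0]
```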
- This plugin currently works with the CLI, python and R - see :doc:`/build` for details.
+ .. note:: Enabling multi-GPU training
+
+   Default installation may not enable multi-GPU training. To use multiple GPUs, make sure to read :ref:`build_gpu_support`.
+
+ The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/build` for details.

.. code-block:: python
   :caption: Python example
@@ -25,7 +25,8 @@ The XGBoost python module is able to load data from:

- LibSVM text format file
- Comma-separated values (CSV) file
- NumPy 2D array
- - SciPy 2D sparse array, and
+ - SciPy 2D sparse array
+ - Pandas data frame, and
- XGBoost binary buffer file.

(See :doc:`/tutorials/input_format` for a detailed description of the text input format.)
@@ -66,6 +67,14 @@ The data is stored in a :py:class:`DMatrix <xgboost.DMatrix>` object.

   csr = scipy.sparse.csr_matrix((dat, (row, col)))
   dtrain = xgb.DMatrix(csr)

+ * To load a Pandas data frame into :py:class:`DMatrix <xgboost.DMatrix>`:
+
+   .. code-block:: python
+
+     data = pandas.DataFrame(np.arange(12).reshape((4,3)), columns=['a', 'b', 'c'])
+     label = pandas.DataFrame(np.random.randint(2, size=4))
+     dtrain = xgb.DMatrix(data, label=label)

* Saving :py:class:`DMatrix <xgboost.DMatrix>` into an XGBoost binary file will make loading faster:

  .. code-block:: python