[coll] Move the rabit poll helper. (#10349)
@@ -138,7 +138,7 @@ From the command line on Linux starting from the XGBoost directory:
.. note:: Faster distributed GPU training with NCCL
-  By default, distributed GPU training is enabled and uses Rabit for communication. For faster training, set the option ``USE_NCCL=ON``. Faster distributed GPU training depends on NCCL2, available at `this link <https://developer.nvidia.com/nccl>`_. Since NCCL2 is only available for Linux machines, **faster distributed GPU training is available only for Linux**.
+  By default, distributed GPU training is enabled with the option ``USE_NCCL=ON``. Distributed GPU training depends on NCCL2, available at `this link <https://developer.nvidia.com/nccl>`_. Since NCCL2 is only available for Linux machines, **distributed GPU training is available only for Linux**.
.. code-block:: bash
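The body of this bash block is truncated in the diff view above. As a minimal sketch only, an NCCL-enabled build under the usual CMake workflow might look as follows; the ``USE_CUDA`` flag and the exact commands here are assumptions, not part of this hunk:

.. code-block:: bash

   # Sketch: configure and build XGBoost with NCCL support.
   # USE_NCCL requires a CUDA build, so USE_CUDA is assumed as well.
   mkdir build && cd build
   cmake .. -DUSE_CUDA=ON -DUSE_NCCL=ON
   make -j$(nproc)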
@@ -37,7 +37,7 @@ The ultimate question will still come back to how to push the limit of each comp
and use fewer resources to complete the task (thus with less communication and a lower chance of failure).
To achieve this, we decided to reuse the optimizations in single-node XGBoost and build the distributed version on top of it.
-The demand for communication in machine learning is rather simple, in the sense that we can depend on a limited set of APIs (in our case rabit).
+The demand for communication in machine learning is rather simple, in the sense that we can depend on a limited set of APIs.
Such a design allows us to reuse most of the code while remaining portable to major platforms such as Hadoop/YARN, MPI, and SGE.
Most importantly, it pushes the limit of the computation resources we can use.
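To make the "limited set of APIs" and portability claims concrete, here is a minimal sketch of launching the same worker program on different platforms through a tracker script. It assumes dmlc-core's ``dmlc-submit`` helper; the flags, worker count, and ``train.conf`` file are illustrative, not taken from this commit:

.. code-block:: bash

   # Sketch, assuming the dmlc-core tracker script. The worker program
   # stays unchanged across platforms; only the --cluster backend differs.
   python dmlc-core/tracker/dmlc-submit --cluster=yarn --num-workers=4 \
       xgboost train.conf nthread=8
   # e.g. --cluster=mpi or --cluster=sge targets the other platforms.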