3 Commits

Author SHA1 Message Date
Philip Hyunsu Cho
ef8bdaa047
[CI] Update machine images (#9932) 2023-12-29 11:15:38 -08:00
Jiaming Yuan
0715ab3c10
Use dlopen to load NCCL. (#9796)
This PR adds optional support for loading nccl with `dlopen` as an alternative of compile time linking. This is to address the size bloat issue with the PyPI binary release.
- Add CMake option to load `nccl` at runtime.
- Add an NCCL stub.

After this, `nccl` will be fetched from PyPI when using pip to install XGBoost, either by a user or by `pyproject.toml`. Others who want to link the nccl at compile time can continue to do so without any change.

At the moment, this is Linux only since we only support MNMG on Linux.
2023-11-22 19:27:31 +08:00
Rong Ou
ed91e775ec
Fix quantile tests running on multi-gpus (#8775)
* Fix quantile tests running on multi-gpus

* Run some gtests with multiple GPUs

* fix mgpu test naming

* Instruct NCCL to print extra logs

* Allocate extra space in /dev/shm to enable NCCL

* use gtest_skip to skip mgpu tests

---------

Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2023-02-12 17:00:26 -08:00