48 Commits

Author SHA1 Message Date
Jiaming Yuan
e6eefea5e2
[coll] Move the rabit poll helper. (#10349) 2024-05-31 08:02:21 +08:00
Jiaming Yuan
a5a58102e5
Revamp the rabit implementation. (#10112)
This PR replaces the original RABIT implementation with a new one, which has already been partially merged into XGBoost. The new one features:
- Federated learning for both CPU and GPU.
- NCCL.
- More data types.
- A unified interface for all the underlying implementations.
- Improved timeout handling for both tracker and workers.
- Exhausted tests with metrics (fixed a couple of bugs along the way).
- A reusable tracker for Python and JVM packages.
2024-05-20 11:56:23 +08:00
Jiaming Yuan
5e64276a9b
Update nvtx. (#10227) 2024-04-29 06:33:46 +08:00
Jiaming Yuan
0715ab3c10
Use dlopen to load NCCL. (#9796)
This PR adds optional support for loading nccl with `dlopen` as an alternative of compile time linking. This is to address the size bloat issue with the PyPI binary release.
- Add CMake option to load `nccl` at runtime.
- Add an NCCL stub.

After this, `nccl` will be fetched from PyPI when using pip to install XGBoost, either by a user or by `pyproject.toml`. Others who want to link the nccl at compile time can continue to do so without any change.

At the moment, this is Linux only since we only support MNMG on Linux.
2023-11-22 19:27:31 +08:00
Chuck Atkins
83cdf14b2c
CMake LTO and CUDA arch (#9677) 2023-10-20 13:01:37 +08:00
James Lamb
eb562d3829
[CI] address cmakelint warnings about whitespace (#9674) 2023-10-14 12:46:07 +08:00
James Lamb
2e42f33fc1
[CI] standardize else() and enfunction() calls in CMake scripts (#9653) 2023-10-12 11:14:19 +08:00
James Lamb
db8d117f7e
[CI] standardize endif() calls in CMake scripts (#9637) 2023-10-08 11:45:20 +08:00
Jiaming Yuan
90ef250ea1
[rabit] Drop support for MPI backend. (#9525)
- Add checks in cmake.
- Remove mpi related code.
2023-08-28 21:01:22 +08:00
Philip Hyunsu Cho
5bd163aa25
Explicitly specify libcudart_static in CMake config (#9436) 2023-08-05 14:15:44 -07:00
Rong Ou
7579905e18
Retry switching to per-thread default stream (#9416) 2023-07-26 07:09:12 +08:00
Jiaming Yuan
3a9996173e
Revert "Switch to per-thread default stream (#9396)" (#9413)
This reverts commit f7f673b00c15458fb4dd74a2a0d2ba80369c5faf.
2023-07-24 12:03:28 -07:00
Rong Ou
f7f673b00c
Switch to per-thread default stream (#9396) 2023-07-20 08:21:00 +08:00
Jiaming Yuan
7a0ccfbb49
Add compute 90. (#9397) 2023-07-19 13:42:38 +08:00
Jiaming Yuan
097f11b6e0
Support CUDA f16 without transformation. (#9207)
- Support f16 from cupy.
- Include CUDA header explicitly.
- Cleanup cmake nvtx support.
2023-05-30 20:54:31 +08:00
Jiaming Yuan
c5c8f643f2
Remove the cub submodule. (#8888)
XGBoost now uses CTK-11.8 for binary packages, there's no need to maintain a cub
submodule anymore.
2023-03-09 19:43:02 -08:00
Philip Hyunsu Cho
6d8afb2218
[CI] Require C++17 + CMake 3.18; Use CUDA 11.8 in CI (#8853)
* Update to C++17

* Turn off unity build

* Update CMake to 3.18

* Use MSVC 2022 + CUDA 11.8

* Re-create stack for worker images

* Allocate more disk space for Windows

* Tempiorarily disable clang-tidy

* RAPIDS now requires Python 3.10+

* Unpin cuda-python

* Use latest NCCL

* Use Ubuntu 20.04 in RMM image

* Mark failing mgpu test as xfail
2023-03-01 09:22:24 -08:00
Jiaming Yuan
bc818316f2
Prepare for improving Windows networking compatibility. (#8234)
* Prepare for improving Windows networking compatibility.

* Include dmlc filesystem indirectly as dmlc/filesystem.h includes windows.h, which
  conflicts with winsock2.h
* Define `NOMINMAX` conditionally.
* Link the winsock library when mysys32 is used.
* Add config file for read the doc.
2022-09-10 15:16:49 +08:00
Rory Mitchell
f421c26d35
Tune cuda architectures (#8152) 2022-08-11 13:36:47 -07:00
Jiaming Yuan
142a208a90
Fix compiler warnings. (#8022)
- Remove/fix unused parameters
- Remove deprecated code in rabit.
- Update dmlc-core.
2022-06-22 21:29:10 +08:00
Jiaming Yuan
1a33b50a0d
Fix compiler warnings. (#7974)
- Remove unused parameters. There are still many warnings that are not yet
addressed. Currently, the warnings in dmlc-core dominate the error log.
- Remove `distributed` parameter from metric.
- Fixes some warnings about signed comparison.
2022-06-06 22:56:25 +08:00
Jiaming Yuan
d48123d23b
Fix rmm build (#7973)
- Optionally switch to c++17
- Use rmm CMake target.
- Workaround compiler errors.
- Fix GPUMetric inheritance.
- Run death tests even if it's built with RMM support.

Co-authored-by: jakirkham <jakirkham@gmail.com>
2022-06-06 20:18:32 +08:00
Jiaming Yuan
fd78af404b
Drop support for deprecated CUDA architectures. (#7774)
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2022-03-31 21:42:23 +08:00
AJ Schmidt
511805c981
Compress fatbins (#7601)
* compress CUDA device code

Co-authored-by: ptaylor <paul.e.taylor@me.com>
2022-01-25 18:30:59 +08:00
Philip Hyunsu Cho
2a0368b7ca
Add CMake option to use /MD runtime (#7277) 2021-09-30 13:13:57 +08:00
Jiaming Yuan
c311a8c1d8
Enable compiling with system cub. (#7232)
- Tested with all CUDA 11.x.
- Workaround cub scan by using discard iterator in AUC.
- Limit the size of Argsort when compiled with CUDA cub.
2021-09-17 14:28:18 +08:00
Jiaming Yuan
cdfaa705f3
Fix building on CUDA 11.0. (#7187) 2021-08-25 02:57:53 -07:00
Jiaming Yuan
9c64618cb6
[breaking] Remove CUDA sm_35, add sm_86 (#7182) 2021-08-25 16:04:23 +08:00
Jiaming Yuan
345796825f
Optional find dependency in installed cmake config. (#7099)
* Find dependency only when xgboost is built as static library.
* Resolve msvc warning.
* Add test for linking shared library.
2021-07-11 17:20:55 +08:00
Rory Mitchell
29745c6df2
Fix inclusive scan for large sizes (#6234) 2020-11-03 17:01:43 +13:00
Jiaming Yuan
07355599c2
Option for generating device debug info. (#6168)
* Supply `-G;-src-in-ptx` when `USE_DEVICE_DEBUG` is set and debug mode is selected.
* Refactor CMake script to gather all CUDA configuration.
* Use CMAKE_CUDA_ARCHITECTURES.  Close #6029.
* Add compute 80.  Close #5999
2020-09-27 03:26:56 +08:00
James Lamb
d39da42e69
[R] Remove dependency on gendef for Visual Studio builds (fixes #5608) (#5764)
* [R-package] Remove dependency on gendef for Visual Studio builds (fixes #5608)

* clarify docs

* removed debugging print statement

* Make R CMake install more robust

* Fix doc format; add ToC

* Update build.rst

* Fix AppVeyor

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-06-15 00:20:44 +00:00
Jiaming Yuan
eaf2a00b5c
Enhance nvtx support. (#5636) 2020-05-06 22:54:24 +08:00
Darby Payne
7a99f8f27f
Adding static library option (#5397) 2020-03-10 18:22:15 +08:00
Rory Mitchell
5aa007d7b2
Fix visual studio output library directories (#5119) 2019-12-15 15:08:20 +13:00
Jiaming Yuan
7663de956c
Run training with empty DMatrix. (#4990)
This makes GPU Hist robust in distributed environment as some workers might not
be associated with any data in either training or evaluation.

* Disable rabit mock test for now: See #5012 .

* Disable dask-cudf test at prediction for now: See #5003

* Launch dask job for all workers despite they might not have any data.
* Check 0 rows in elementwise evaluation metrics.

   Using AUC and AUC-PR still throws an error.  See #4663 for a robust fix.

* Add tests for edge cases.
* Add `LaunchKernel` wrapper handling zero sized grid.
* Move some parts of allreducer into a cu file.
* Don't validate feature names when the booster is empty.

* Sync number of columns in DMatrix.

  As num_feature is required to be the same across all workers in data split
  mode.

* Filtering in dask interface now by default syncs all booster that's not
empty, instead of using rank 0.

* Fix Jenkins' GPU tests.

* Install dask-cuda from source in Jenkins' test.

  Now all tests are actually running.

* Restore GPU Hist tree synchronization test.

* Check UUID of running devices.

  The check is only performed on CUDA version >= 10.x, as 9.x doesn't have UUID field.

* Fix CMake policy and project variables.

  Use xgboost_SOURCE_DIR uniformly, add policy for CMake >= 3.13.

* Fix copying data to CPU

* Fix race condition in cpu predictor.

* Fix duplicated DMatrix construction.

* Don't download extra nccl in CI script.
2019-11-06 16:13:13 +08:00
Philip Hyunsu Cho
c6f2a7e186
[CI] Add Windows GPU to Jenkins CI pipeline (#4463)
* Fix #4462: Use /MT flag consistently for MSVC target

* First attempt at Windows CI

* Distinguish stages in Linux and Windows pipelines

* Try running CMake in Windows pipeline

* Add build step
2019-05-14 04:45:06 +00:00
Rory Mitchell
d16d9a9988
Correctly determine cuda version (#4453) 2019-05-10 19:46:57 +12:00
Jiaming Yuan
207f058711 Refactor CMake scripts. (#4323)
* Refactor CMake scripts.

* Remove CMake CUDA wrapper.
* Bump CMake version for CUDA.
* Use CMake to handle Doxygen.
* Split up CMakeList.
* Export install target.
* Use modern CMake.
* Remove build.sh
* Workaround for gpu_hist test.
* Use cmake 3.12.

* Revert machine.conf.

* Move CLI test to gpu.

* Small cleanup.

* Support using XGBoost as submodule.

* Fix windows

* Fix cpp tests on Windows

* Remove duplicated find_package.
2019-04-15 10:08:12 -07:00
Rong Ou
9837b09b20 support cuda 10.1 (#4223)
* support cuda 10.1

* add cuda 10.1 to jenkins build matrix
2019-03-08 12:22:12 +13:00
Rory Mitchell
3ee725e3bb
Add cuda forwards compatibility (#3316) 2018-05-17 10:59:22 +12:00
Vadim Khotilovich
76f8f51438
[R] AppVeyor CI for R package (#2954)
* [R] fix finding R.exe with cmake on WIN when it is in PATH

* [R] appveyor config for R package

* [R] wrap the lines to make R check happier

* [R] install only binary dep-packages in appveyor

* [R] for MSVC appveyor, also build a binary for R package and keep as an artifact
2017-12-17 16:37:45 -06:00
Vadim Khotilovich
e8a6597957 [R] maintenance Nov 2017; SHAP plots (#2888)
* [R] fix predict contributions for data with no colnames

* [R] add a render parameter for xgb.plot.multi.trees; fixes #2628

* [R] update Rd's

* [R] remove unnecessary dep-package from R cmake install

* silence type warnings; readability

* [R] silence complaint about incomplete line at the end

* [R] initial version of xgb.plot.shap()

* [R] more work on xgb.plot.shap

* [R] enforce black font in xgb.plot.tree; fixes #2640

* [R] if feature names are available, check in predict that they are the same; fixes #2857

* [R] cran check and lint fixes

* remove tabs

* [R] add references; a test for plot.shap
2017-12-05 09:45:34 -08:00
Vadim Khotilovich
74db9757b3 [R package] GPU support (#2732)
* [R] MSVC compatibility

* [GPU] allow seed in BernoulliRng up to size_t and scale to uint32_t

* R package build with cmake and CUDA

* R package CUDA build fixes and cleanups

* always export the R package native initialization routine on windows

* update the install instructions doc

* fix lint

* use static_cast directly to set BernoulliRng seed

* [R] demo for GPU accelerated algorithm

* tidy up the R package cmake stuff

* R pack cmake: installs main dependency packages if needed

* [R] version bump in DESCRIPTION

* update NEWS

* added short missing/sparse values explanations to FAQ
2017-09-28 18:15:28 -05:00
Rory Mitchell
530f01e21c [GPU-Plugin] Add load balancing search to gpu_hist. Add compressed iterator. (#2504) 2017-07-11 22:36:39 +12:00
Rory Mitchell
e939192978 Cmake improvements (#2487)
* Cmake improvements
* Add google test to cmake
2017-07-06 18:05:11 +12:00
Rory Mitchell
48f3003302 [GPU-Plugin] Change GPU plugin to use tree_method parameter, bump cmake version to 3.5 for GPU plugin, add compute architecture 3.5, remove unused cmake files (#2455) 2017-06-29 16:19:45 +12:00
PSEUDOTENSOR / Jonathan McKinney
41efe32aa5 [GPU-Plugin] Multi-GPU for grow_gpu_hist histogram method using NVIDIA NCCL. (#2395) 2017-06-12 05:06:08 +12:00