28 Commits

Author SHA1 Message Date
Philip Hyunsu Cho
ef8bdaa047
[CI] Update machine images (#9932) 2023-12-29 11:15:38 -08:00
Jiaming Yuan
0715ab3c10
Use dlopen to load NCCL. (#9796)
This PR adds optional support for loading nccl with `dlopen` as an alternative of compile time linking. This is to address the size bloat issue with the PyPI binary release.
- Add CMake option to load `nccl` at runtime.
- Add an NCCL stub.

After this, `nccl` will be fetched from PyPI when using pip to install XGBoost, either by a user or by `pyproject.toml`. Others who want to link the nccl at compile time can continue to do so without any change.

At the moment, this is Linux only since we only support MNMG on Linux.
2023-11-22 19:27:31 +08:00
James Lamb
eb562d3829
[CI] address cmakelint warnings about whitespace (#9674) 2023-10-14 12:46:07 +08:00
James Lamb
2e42f33fc1
[CI] standardize else() and enfunction() calls in CMake scripts (#9653) 2023-10-12 11:14:19 +08:00
James Lamb
db8d117f7e
[CI] standardize endif() calls in CMake scripts (#9637) 2023-10-08 11:45:20 +08:00
Jiaming Yuan
f380c10a93
Use hint for find nccl. (#9490) 2023-08-16 16:08:41 +08:00
Jiaming Yuan
097f11b6e0
Support CUDA f16 without transformation. (#9207)
- Support f16 from cupy.
- Include CUDA header explicitly.
- Cleanup cmake nvtx support.
2023-05-30 20:54:31 +08:00
Jiaming Yuan
173096a6a7
Discover libasan.so.6. (#8864) 2023-03-06 18:56:54 +08:00
Jiaming Yuan
345796825f
Optional find dependency in installed cmake config. (#7099)
* Find dependency only when xgboost is built as static library.
* Resolve msvc warning.
* Add test for linking shared library.
2021-07-11 17:20:55 +08:00
Tanuja Kirthi Doddapaneni
d261ba029a
Added USE_NCCL_LIB_PATH option to enable user to set NCCL_LIBRARY during build (#6310)
Description: To enable user to set NCCL_LIBRARY during build
2020-10-28 14:36:31 -07:00
Jiaming Yuan
9b688aca3b
Fix mingw build with R. (#5918) 2020-07-22 02:56:49 +08:00
James Lamb
d39da42e69
[R] Remove dependency on gendef for Visual Studio builds (fixes #5608) (#5764)
* [R-package] Remove dependency on gendef for Visual Studio builds (fixes #5608)

* clarify docs

* removed debugging print statement

* Make R CMake install more robust

* Fix doc format; add ToC

* Update build.rst

* Fix AppVeyor

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-06-15 00:20:44 +00:00
Jiaming Yuan
eaf2a00b5c
Enhance nvtx support. (#5636) 2020-05-06 22:54:24 +08:00
James Lamb
7f980e9f83
[R-package] fixed inconsistency in R -e calls in FindLibR.cmake (#5438) 2020-03-28 19:24:21 +08:00
James Lamb
3cf665d3ec
[R-package] changed FindLibR to take advantage of CMake cache (#5427) 2020-03-20 03:32:15 +08:00
Philip Hyunsu Cho
2a071cebc5 Add CMake option to run Undefined Behavior Sanitizer (UBSan) (#5211)
* Fix related errors.
2020-01-20 16:57:44 +08:00
Jiaming Yuan
7663de956c
Run training with empty DMatrix. (#4990)
This makes GPU Hist robust in distributed environment as some workers might not
be associated with any data in either training or evaluation.

* Disable rabit mock test for now: See #5012 .

* Disable dask-cudf test at prediction for now: See #5003

* Launch dask job for all workers despite they might not have any data.
* Check 0 rows in elementwise evaluation metrics.

   Using AUC and AUC-PR still throws an error.  See #4663 for a robust fix.

* Add tests for edge cases.
* Add `LaunchKernel` wrapper handling zero sized grid.
* Move some parts of allreducer into a cu file.
* Don't validate feature names when the booster is empty.

* Sync number of columns in DMatrix.

  As num_feature is required to be the same across all workers in data split
  mode.

* Filtering in dask interface now by default syncs all booster that's not
empty, instead of using rank 0.

* Fix Jenkins' GPU tests.

* Install dask-cuda from source in Jenkins' test.

  Now all tests are actually running.

* Restore GPU Hist tree synchronization test.

* Check UUID of running devices.

  The check is only performed on CUDA version >= 10.x, as 9.x doesn't have UUID field.

* Fix CMake policy and project variables.

  Use xgboost_SOURCE_DIR uniformly, add policy for CMake >= 3.13.

* Fix copying data to CPU

* Fix race condition in cpu predictor.

* Fix duplicated DMatrix construction.

* Don't download extra nccl in CI script.
2019-11-06 16:13:13 +08:00
Jiaming Yuan
6fac40cfb4
Add asan.so.5 to cmake script. (#4999) 2019-10-30 16:03:23 -04:00
Jiaming Yuan
8da4907e89
Enable building with shared NCCL. (#4447)
* Add `BUILD_WITH_SHARED_NCCL` to CMake.
2019-05-09 23:19:58 +08:00
Jiaming Yuan
207f058711 Refactor CMake scripts. (#4323)
* Refactor CMake scripts.

* Remove CMake CUDA wrapper.
* Bump CMake version for CUDA.
* Use CMake to handle Doxygen.
* Split up CMakeList.
* Export install target.
* Use modern CMake.
* Remove build.sh
* Workaround for gpu_hist test.
* Use cmake 3.12.

* Revert machine.conf.

* Move CLI test to gpu.

* Small cleanup.

* Support using XGBoost as submodule.

* Fix windows

* Fix cpp tests on Windows

* Remove duplicated find_package.
2019-04-15 10:08:12 -07:00
trivialfis
cf2d86a4f6 Add travis sanitizers tests. (#3557)
* Add travis sanitizers tests.

* Add gcc-7 in Travis.
* Add SANITIZER_PATH for CMake.
* Enable sanitizer tests in Travis.

* Fix memory leaks in tests.

* Fix all memory leaks reported by Address Sanitizer.
* tests/cpp/helpers.h/CreateDMatrix now returns raw pointer.
2018-08-19 16:40:30 +12:00
trivialfis
55caad6e49 Remove redundant FindGTest.cmake. (#3533)
During removal of FindGTest.cmake, also

* Fix gtest include dirs.
* Remove some blanks and use PWD for gtest dir.
2018-08-07 10:08:08 +12:00
trivialfis
860263f814 Enable building with sanitizers. (#3525) 2018-07-31 17:25:47 +12:00
Thejaswi
2200939416 Upgrading to NCCL2 (#3404)
* Upgrading to NCCL2

* Part - II of NCCL2 upgradation

 - Doc updates to build with nccl2
 - Dockerfile.gpu update for a correct CI build with nccl2
 - Updated FindNccl package to have env-var NCCL_ROOT to take precedence

* Upgrading to v9.2 for CI workflow, since it has the nccl2 binaries available

* Added NCCL2 license + copy the nccl binaries into /usr location for the FindNccl module to find

* Set LD_LIBRARY_PATH variable to pick nccl2 binary at runtime

* Need the nccl2 library download instructions inside Dockerfile.release as well

* Use NCCL2 as a static library
2018-07-10 00:42:15 -07:00
Vadim Khotilovich
9ffe8596f2
[core] fix slow predict-caching with many classes (#3109)
* fix prediction caching inefficiency for multiclass

* silence some warnings

* redundant if

* workaround for R v3.4.3 bug; fixes #3081
2018-02-15 18:31:42 -06:00
Vadim Khotilovich
76f8f51438
[R] AppVeyor CI for R package (#2954)
* [R] fix finding R.exe with cmake on WIN when it is in PATH

* [R] appveyor config for R package

* [R] wrap the lines to make R check happier

* [R] install only binary dep-packages in appveyor

* [R] for MSVC appveyor, also build a binary for R package and keep as an artifact
2017-12-17 16:37:45 -06:00
Rory Mitchell
24f527a1c0
AVX gradients (#2878)
* AVX gradients

* Add google test for AVX

* Create fallback implementation, remove fma instruction

* Improved accuracy of AVX exp function
2017-11-27 08:56:01 +13:00
Vadim Khotilovich
74db9757b3 [R package] GPU support (#2732)
* [R] MSVC compatibility

* [GPU] allow seed in BernoulliRng up to size_t and scale to uint32_t

* R package build with cmake and CUDA

* R package CUDA build fixes and cleanups

* always export the R package native initialization routine on windows

* update the install instructions doc

* fix lint

* use static_cast directly to set BernoulliRng seed

* [R] demo for GPU accelerated algorithm

* tidy up the R package cmake stuff

* R pack cmake: installs main dependency packages if needed

* [R] version bump in DESCRIPTION

* update NEWS

* added short missing/sparse values explanations to FAQ
2017-09-28 18:15:28 -05:00