743 Commits

Author SHA1 Message Date
Philip Hyunsu Cho
09d32f1f2b
Fix build and C++ tests for FreeBSD (#10480) 2024-06-28 01:47:55 -07:00
Jiaming Yuan
e8a962575a
[EM] Allow staging ellpack on host for GPU external memory. (#10488)
- New parameter `on_host`.
- Abstract format creation and stream creation into policy classes.
2024-06-28 04:42:18 +08:00
Jiaming Yuan
26eb68859f
Consistently report error in tests. (#10453) 2024-06-21 14:35:22 +08:00
Jiaming Yuan
e5f1720656
[EM] Avoid writing cut matrix to cache. (#10444) 2024-06-19 18:03:38 +08:00
Jiaming Yuan
b9e5229ff2
Update rapids (#10435)
* [CI] Update RAPIDS to latest stable

* RMM.

---------

Co-authored-by: hcho3 <2532981+hcho3@users.noreply.github.com>
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2024-06-18 05:01:57 +08:00
Jiaming Yuan
c9f5fcaf21
[col] Small cleanup to federated comm. (#10397) 2024-06-07 21:19:04 +08:00
Jiaming Yuan
d2d01d977a
Remove unnecessary fetch operations in external memory. (#10342) 2024-05-31 13:16:40 +08:00
Jiaming Yuan
e6eefea5e2
[coll] Move the rabit poll helper. (#10349) 2024-05-31 08:02:21 +08:00
Jiaming Yuan
d5fcbee44b
Add timeout for distributed tests. (#10315) 2024-05-23 11:11:49 +08:00
Jiaming Yuan
1b25d23583
[JVM-packages] Prevent memory leak. (#10307) 2024-05-22 13:47:59 +08:00
Dmitry Razdoburdin
c7e7ce7569
[SYCL] Add nodes initialisation (#10269)
---------

Co-authored-by: Dmitry Razdoburdin <>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2024-05-21 23:38:52 +08:00
Jiaming Yuan
a5a58102e5
Revamp the rabit implementation. (#10112)
This PR replaces the original RABIT implementation with a new one, which has already been partially merged into XGBoost. The new one features:
- Federated learning for both CPU and GPU.
- NCCL.
- More data types.
- A unified interface for all the underlying implementations.
- Improved timeout handling for both tracker and workers.
- Exhausted tests with metrics (fixed a couple of bugs along the way).
- A reusable tracker for Python and JVM packages.
2024-05-20 11:56:23 +08:00
Jiaming Yuan
835e59e538
Use a thread pool for external memory. (#10288) 2024-05-16 19:32:12 +08:00
Dmitry Razdoburdin
f588252481
[sycl] add loss guided hist building (#10251)
Co-authored-by: Dmitry Razdoburdin <>
2024-05-10 22:35:13 +08:00
Dmitry Razdoburdin
dcc9639b91
[sycl] add data initialisation for training (#10222)
Co-authored-by: Dmitry Razdoburdin <>
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2024-05-05 12:07:10 +08:00
Jiaming Yuan
5e64276a9b
Update nvtx. (#10227) 2024-04-29 06:33:46 +08:00
Dmitry Razdoburdin
58513dc288
[SYCL] Add sampling initialization (#10216)
---------

Co-authored-by: Dmitry Razdoburdin <>
2024-04-25 04:35:52 +08:00
Jiaming Yuan
3fbb221fec
[coll] Implement shutdown for tracker and comm. (#10208)
- Force shutdown the tracker.
- Implement shutdown notice for error handling thread in comm.
2024-04-20 04:08:17 +08:00
Jiaming Yuan
3f64b4fde3
[coll] Add global functions. (#10203) 2024-04-19 03:17:23 +08:00
Jiaming Yuan
4b10200456
[coll] Improve event loop. (#10199)
- Add a test for blocking calls.
- Do not require the queue to be empty after waking up; this frees up the thread to answer blocking calls.
- Handle EOF in read.
- Improve the error message in the result. Allow concatenation of multiple results.
2024-04-18 03:29:52 +08:00
Dmitry Razdoburdin
6e5c335cea
[SYCL] Add basic features for QuantileHistMaker (#10174)
---------

Co-authored-by: Dmitry Razdoburdin <>
2024-04-15 21:24:46 +08:00
Jiaming Yuan
8bad677c2f
Update collective implementation. (#10152)
* Update collective implementation.

- Cleanup resource during `Finalize` to avoid handling threads in destructor.
- Calculate the size for allgather automatically.
- Use simple allgather for small (smaller than the number of worker) allreduce.
2024-03-30 18:57:31 +08:00
Jiaming Yuan
230010d9a0
Cleanup set info. (#10139)
- Use the array interface internally.
- Deprecate `XGDMatrixSetDenseInfo`.
- Deprecate `XGDMatrixSetUIntInfo`.
- Move the handling of `DataType` into the deprecated C function.

---------

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2024-03-26 23:26:24 +08:00
Dmitry Razdoburdin
6a7c6a8ae6
add sycl reaslisation of ghist builder (#10138)
Co-authored-by: Dmitry Razdoburdin <>
2024-03-23 12:55:25 +08:00
Jiaming Yuan
53fc17578f
Use std::uint64_t for row index. (#10120)
- Use std::uint64_t instead of size_t to avoid implementation-defined type.
- Rename to bst_idx_t, to account for other types of indexing.
- Small cleanup to the base header.
2024-03-15 18:43:49 +08:00
Dmitry Razdoburdin
617970a0c2
[SYCL] Add split evaluation (#10119)
---------

Co-authored-by: Dmitry Razdoburdin <>
2024-03-15 01:46:46 +08:00
Jiaming Yuan
1450aebb74
Fix pairwise objective with NDCG metric along with custom gain. (#10100)
* Fix pairwise objective with NDCG metric.

- Allow setting `ndcg_exp_gain` for `rank:pairwise`.

This is useful when using pairwise for objective but ndcg for metric.
2024-03-11 14:54:10 +08:00
Jiaming Yuan
2c13f90384
Support graphviz plot for multi-target tree. (#10093) 2024-03-09 05:35:25 +08:00
Jiaming Yuan
d07b7fe8c8
Small cleanup for mock tests. (#10085) 2024-03-04 23:32:11 +08:00
Dmitry Razdoburdin
7a61216690
[sycl] add partitioning and related tests (#10080)
Co-authored-by: Dmitry Razdoburdin <>
2024-03-02 01:49:27 +08:00
Jiaming Yuan
8189126d51
Add CUDA iterator to tensor view. (#10074) 2024-03-01 14:15:31 +08:00
Dmitry Razdoburdin
761845f594
[SYCL] Implement row set collection. (#10057)
Co-authored-by: Dmitry Razdoburdin <>
2024-02-26 21:07:36 +08:00
Jiaming Yuan
0ce4372bd4
Use UBJSON for serializing splits for vertical data split. (#10059) 2024-02-25 00:18:23 +08:00
Jiaming Yuan
2e4ea5ecc0
Support f64 for ubjson. (#10055) 2024-02-21 02:18:42 +08:00
Jiaming Yuan
d37b83e8d9
Fix UBJSON with boolean value. (#10054) 2024-02-20 22:13:51 +08:00
Louis Desreumaux
edf501d227
Implement contribution prediction with QuantileDMatrix (#10043)
---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2024-02-19 21:03:29 +08:00
Dmitry Razdoburdin
057f03cacc
[SYCL] Initial implementation of GHistIndexMatrix (#10045)
Co-authored-by: Dmitry Razdoburdin <>
2024-02-19 04:27:15 +08:00
Philip Hyunsu Cho
4dfbe2a893
[CI] Test building for 32-bit arch (#10021)
* [CI] Test building for 32-bit arch

* Update CMakeLists.txt

* Fix yaml

* Use Debian container

* Remove -Werror for 32-bit

* Revert "Remove -Werror for 32-bit"

This reverts commit c652bc6a037361bcceaf56fb01863210b462793d.

* Don't error for overloaded-virtual warning

* Ignore some warnings from dmlc-core

* Fix compiler warnings

* Fix formatting

* Apply suggestions from code review

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>

* Add more cast

---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2024-01-31 13:20:51 -08:00
Dmitry Razdoburdin
234674a0a6
[sync]. Add partition builder. (#10011)
---------

Co-authored-by: Dmitry Razdoburdin <>
2024-01-31 17:39:48 +08:00
Jiaming Yuan
cacb4b1fdd
Fix gain calculation in multi-target tree. (#9978) 2024-01-17 13:18:44 +08:00
Philip Hyunsu Cho
ef8bdaa047
[CI] Update machine images (#9932) 2023-12-29 11:15:38 -08:00
Jiaming Yuan
a7226c0222
Fix feature names with special characters. (#9923) 2023-12-28 22:45:13 +08:00
Dmitry Razdoburdin
43897b8296
Sycl implementation for objective functions (#9846)
---------

Co-authored-by: Dmitry Razdoburdin <>
2023-12-12 14:41:50 +08:00
Jiaming Yuan
42de9206fc
Support multi-target, fit intercept for hinge. (#9850) 2023-12-08 05:50:41 +08:00
Dmitry Razdoburdin
381f1d3dc9
Add support inference on SYCL devices (#9800)
---------

Co-authored-by: Dmitry Razdoburdin <>
Co-authored-by: Nikolay Petrov <nikolay.a.petrov@intel.com>
Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
2023-12-04 16:15:57 +08:00
Jiaming Yuan
8fe1a2213c
Cleanup code for distributed training. (#9805)
* Cleanup code for distributed training.

- Merge `GetNcclResult` into nccl stub.
- Split up utilities from the main dask module.
- Let Channel return `Result` to accommodate nccl channel.
- Remove old `use_label_encoder` parameter.
2023-11-25 09:10:56 +08:00
Jiaming Yuan
0715ab3c10
Use dlopen to load NCCL. (#9796)
This PR adds optional support for loading nccl with `dlopen` as an alternative of compile time linking. This is to address the size bloat issue with the PyPI binary release.
- Add CMake option to load `nccl` at runtime.
- Add an NCCL stub.

After this, `nccl` will be fetched from PyPI when using pip to install XGBoost, either by a user or by `pyproject.toml`. Others who want to link the nccl at compile time can continue to do so without any change.

At the moment, this is Linux only since we only support MNMG on Linux.
2023-11-22 19:27:31 +08:00
Jiaming Yuan
fedd9674c8
Implement column sampler in CUDA. (#9785)
- CUDA implementation.
- Extract the broadcasting logic, we will need the context parameter after revamping the collective implementation.
- Some changes to the event loop for fixing a deadlock in CI.
- Move argsort into algorithms.cuh, add support for cuda stream.
2023-11-17 04:29:08 +08:00
Jiaming Yuan
ada377c57e
[coll] Reduce the scope of lock in the event loop. (#9784) 2023-11-15 14:16:19 +08:00
Jiaming Yuan
6fd4a30667
[coll] Increase timeout for allgather test. (#9777) 2023-11-09 05:26:40 +08:00