1814 Commits

Author SHA1 Message Date
Jiaming Yuan
4da4e092b5
[coll] Improvements and fixes for tracker and allreduce. (#9745)
- Allow the tracker to wait.
- Fix allreduce type cast.
- Return args from the federated tracker.
2023-11-02 04:06:46 +08:00
Hui Liu
129bb76941 enable federated 2023-10-31 16:31:56 -07:00
Hui Liu
123af45327 Merge branch 'master' 2023-10-31 15:59:31 -07:00
Jiaming Yuan
bc995a4865
[coll] Add federated coll. (#9738)
- Define a new data type, the proto file is copied for now.
- Merge client and communicator into `FederatedColl`.
- Define CUDA variant.
- Migrate tests for CPU, add tests for CUDA.
2023-11-01 04:06:46 +08:00
Hui Liu
8fab17ae8f rm hip.h files 2023-10-30 21:20:28 -07:00
Hui Liu
9b7aa1a7cd unify cuda to hip 2023-10-30 17:12:06 -07:00
Hui Liu
4eb371b3f0 unify cuda to hip 2023-10-30 17:10:06 -07:00
Hui Liu
6df27eadc9 rm hip_category from source 2023-10-30 16:34:49 -07:00
Hui Liu
02f5464fa6 enable coll and comm 2023-10-30 15:15:05 -07:00
Hui Liu
b6b5218245 enable RCCL 2023-10-30 14:05:04 -07:00
Hui Liu
d7f1235b7d Merge branch 'master' into sync-condition-2023Oct11 2023-10-30 13:19:33 -07:00
Hui Liu
1bedd76e94 rm un-necessary code 2023-10-30 13:14:45 -07:00
Jiaming Yuan
80390e6cb6
[coll] Federated comm. (#9732) 2023-10-31 02:39:55 +08:00
Jiaming Yuan
6755179e77
[coll] Add nccl. (#9726) 2023-10-28 16:33:58 +08:00
Hui Liu
32ae49ab92 temp hack for multi GPUs 2023-10-27 13:00:49 -07:00
Hui Liu
6bbca9a8b7 restore learner 2023-10-27 11:15:06 -07:00
Hui Liu
6762230d9a namespace to reduce code 2023-10-27 10:51:32 -07:00
Hui Liu
4302200a33 Merge branch 'master' into sync-condition-2023Oct11 2023-10-27 10:09:37 -07:00
Hui Liu
4a4b528d54 add namespace aliases to reduce code 2023-10-27 09:11:55 -07:00
Dmitry Razdoburdin
9c22df9342
Fix mingw hanging on regex in context (#9729)
Co-authored-by: Dmitry Razdoburdin <>
2023-10-27 20:01:35 +08:00
Dmitry Razdoburdin
f41a08fda8
Add 'sycl' devices to the context (#9691)
Co-authored-by: Dmitry Razdoburdin <>
2023-10-26 22:17:56 +08:00
Hui Liu
cd28b9f997 add back per-thread 2023-10-24 15:17:19 -07:00
Hui Liu
3752b06550 Merge branch 'master' into sync-condition-2023Oct11 2023-10-24 10:46:38 -07:00
Jiaming Yuan
7a02facc9d
Serialize expand entry for allgather. (#9702) 2023-10-24 14:33:28 +08:00
Hui Liu
79319dfd4d format 2023-10-23 22:29:48 -07:00
Hui Liu
558352afc9 fix stream 2023-10-23 21:51:20 -07:00
Hui Liu
643b334919 add nccl_device_communicator.hip 2023-10-23 16:43:03 -07:00
Hui Liu
6ba66463b6 fix uuid and Clear/SetValid 2023-10-23 16:32:26 -07:00
Hui Liu
55994b1ac7 enable ROCm on latest XGBoost 2023-10-23 11:15:04 -07:00
Hui Liu
15421e40d9 enable ROCm on latest XGBoost 2023-10-23 11:07:08 -07:00
Philip Hyunsu Cho
5e6cb63a56
[CI] Set up CI for Mac M1 (#9699) 2023-10-22 23:33:19 -07:00
Jiaming Yuan
b771f58453
[coll] Define interface for bridging. (#9695)
* Define the basic interface that will be shared by nccl, federated and native.
2023-10-20 16:20:48 +08:00
Philip Hyunsu Cho
3b86260b50
Fix build for AppleClang 11 (#9684) (#9693) 2023-10-18 12:27:21 -07:00
Jiaming Yuan
5d1bcde719
[coll] allgatherv. (#9688) 2023-10-19 03:13:50 +08:00
Dmitry Razdoburdin
ea9f09716b
Reorder if-else statements to allow the use of cpu branches for sycl-devices (#9682) 2023-10-18 10:55:33 +08:00
Jiaming Yuan
4c0e4422d0
[coll] allgather. (#9681) 2023-10-18 10:22:18 +08:00
Your Name
ffbbc9c968 add cuda to hip wrapper 2023-10-17 12:42:37 -07:00
Jiaming Yuan
48ac9b6cbe
[coll] Allreduce. (#9679) 2023-10-17 13:57:14 +08:00
Rong Ou
da6803b75b
Support column-wise data split with in-memory inputs (#9628)
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-10-17 12:16:39 +08:00
James Lamb
eb562d3829
[CI] address cmakelint warnings about whitespace (#9674) 2023-10-14 12:46:07 +08:00
Jiaming Yuan
53049b16b8
[coll] Broadcast. (#9659) 2023-10-14 09:34:37 +08:00
Your Name
ea19555474 temp merge, disable 1 line, SetValid 2023-10-12 16:16:44 -07:00
Rong Ou
e164d51c43
Improve allgather functions (#9649) 2023-10-12 23:31:43 +08:00
Jiaming Yuan
946ae1c440
[coll] Implement a new tracker and a communicator. (#9650)
The new tracker and communicators communicate through JSON documents. In addition,
communicators are aware of each other.
2023-10-12 12:49:16 +08:00
Jiaming Yuan
084d89216c
Add support for cgroupv2. (#9651) 2023-10-12 09:36:36 +08:00
Rong Ou
0ecb4de963
[breaking] Change DMatrix construction to be distributed (#9623)
* Change column-split DMatrix construction to be distributed

* remove splitting code for row split
2023-10-10 23:35:57 +08:00
Jiaming Yuan
b14e535e78
[Coll] Implement get host address in libxgboost. (#9644)
- Port `xgboost.tracker.get_host_ip` in C++.
2023-10-10 10:01:14 +08:00
Jiaming Yuan
680d53db43
Extract JSON utils. (#9645) 2023-10-10 07:15:14 +08:00
James Lamb
db8d117f7e
[CI] standardize endif() calls in CMake scripts (#9637) 2023-10-08 11:45:20 +08:00
Jiaming Yuan
4d7a187cb0
Remove XGBoosterGetModelRaw. (#9617)
Deprecated in 1.6.
2023-09-29 02:29:33 +08:00