Jiaming Yuan
ada377c57e
[coll] Reduce the scope of lock in the event loop. ( #9784 )
2023-11-15 14:16:19 +08:00
Jiaming Yuan
44099f585d
[coll] Add C API for the tracker. ( #9773 )
2023-11-08 18:17:14 +08:00
Jiaming Yuan
06bdc15e9b
[coll] Pass context to various functions. ( #9772 )
* [coll] Pass context to various functions.
In the future, the `Context` object will be required for collective operations. This PR
passes the context object to some of the functions that need it, in preparation for
swapping out the implementation.
2023-11-08 09:54:05 +08:00
Jiaming Yuan
6c0a190f6d
[coll] Add comm group. ( #9759 )
- Implement `CommGroup` for double dispatching.
- Small cleanup to tracker for handling abort.
2023-11-07 11:12:31 +08:00
Hui Liu
c81731308c
fix RCCL
2023-11-02 16:39:24 -07:00
Hui Liu
51efb7442e
support HIP for half in coll
2023-11-02 10:53:12 -07:00
Hui Liu
3af5dfd546
Merge branch 'master'
2023-11-02 09:05:31 -07:00
Jiaming Yuan
4da4e092b5
[coll] Improvements and fixes for tracker and allreduce. ( #9745 )
- Allow the tracker to wait.
- Fix allreduce type cast.
- Return args from the federated tracker.
2023-11-02 04:06:46 +08:00
Hui Liu
129bb76941
enable federated
2023-10-31 16:31:56 -07:00
Hui Liu
123af45327
Merge branch 'master'
2023-10-31 15:59:31 -07:00
Jiaming Yuan
bc995a4865
[coll] Add federated coll. ( #9738 )
- Define a new data type, the proto file is copied for now.
- Merge client and communicator into `FederatedColl`.
- Define CUDA variant.
- Migrate tests for CPU, add tests for CUDA.
2023-11-01 04:06:46 +08:00
Hui Liu
8fab17ae8f
rm hip.h files
2023-10-30 21:20:28 -07:00
Hui Liu
9b7aa1a7cd
unify cuda to hip
2023-10-30 17:12:06 -07:00
Hui Liu
4eb371b3f0
unify cuda to hip
2023-10-30 17:10:06 -07:00
Hui Liu
6df27eadc9
rm hip_category from source
2023-10-30 16:34:49 -07:00
Hui Liu
02f5464fa6
enable coll and comm
2023-10-30 15:15:05 -07:00
Hui Liu
b6b5218245
enable RCCL
2023-10-30 14:05:04 -07:00
Hui Liu
d7f1235b7d
Merge branch 'master' into sync-condition-2023Oct11
2023-10-30 13:19:33 -07:00
Hui Liu
1bedd76e94
rm unnecessary code
2023-10-30 13:14:45 -07:00
Jiaming Yuan
80390e6cb6
[coll] Federated comm. ( #9732 )
2023-10-31 02:39:55 +08:00
Jiaming Yuan
6755179e77
[coll] Add nccl. ( #9726 )
2023-10-28 16:33:58 +08:00
Hui Liu
32ae49ab92
temp hack for multi GPUs
2023-10-27 13:00:49 -07:00
Hui Liu
6bbca9a8b7
restore learner
2023-10-27 11:15:06 -07:00
Hui Liu
6762230d9a
namespace to reduce code
2023-10-27 10:51:32 -07:00
Hui Liu
4302200a33
Merge branch 'master' into sync-condition-2023Oct11
2023-10-27 10:09:37 -07:00
Hui Liu
4a4b528d54
add namespace aliases to reduce code
2023-10-27 09:11:55 -07:00
Dmitry Razdoburdin
9c22df9342
Fix mingw hanging on regex in context ( #9729 )
---------
Co-authored-by: Dmitry Razdoburdin <>
2023-10-27 20:01:35 +08:00
Dmitry Razdoburdin
f41a08fda8
Add 'sycl' devices to the context ( #9691 )
Co-authored-by: Dmitry Razdoburdin <>
2023-10-26 22:17:56 +08:00
Hui Liu
cd28b9f997
add back per-thread
2023-10-24 15:17:19 -07:00
Hui Liu
3752b06550
Merge branch 'master' into sync-condition-2023Oct11
2023-10-24 10:46:38 -07:00
Jiaming Yuan
7a02facc9d
Serialize expand entry for allgather. ( #9702 )
2023-10-24 14:33:28 +08:00
Hui Liu
79319dfd4d
format
2023-10-23 22:29:48 -07:00
Hui Liu
558352afc9
fix stream
2023-10-23 21:51:20 -07:00
Hui Liu
643b334919
add nccl_device_communicator.hip
2023-10-23 16:43:03 -07:00
Hui Liu
6ba66463b6
fix uuid and Clear/SetValid
2023-10-23 16:32:26 -07:00
Hui Liu
55994b1ac7
enable ROCm on latest XGBoost
2023-10-23 11:15:04 -07:00
Hui Liu
15421e40d9
enable ROCm on latest XGBoost
2023-10-23 11:07:08 -07:00
Philip Hyunsu Cho
5e6cb63a56
[CI] Set up CI for Mac M1 ( #9699 )
2023-10-22 23:33:19 -07:00
Jiaming Yuan
b771f58453
[coll] Define interface for bridging. ( #9695 )
* Define the basic interface that will be shared by nccl, federated and native.
2023-10-20 16:20:48 +08:00
Philip Hyunsu Cho
3b86260b50
Fix build for AppleClang 11 ( #9684 ) ( #9693 )
2023-10-18 12:27:21 -07:00
Jiaming Yuan
5d1bcde719
[coll] allgatherv. ( #9688 )
2023-10-19 03:13:50 +08:00
Dmitry Razdoburdin
ea9f09716b
Reorder if-else statements to allow use of CPU branches for SYCL devices ( #9682 )
2023-10-18 10:55:33 +08:00
Jiaming Yuan
4c0e4422d0
[coll] allgather. ( #9681 )
2023-10-18 10:22:18 +08:00
Your Name
ffbbc9c968
add cuda to hip wrapper
2023-10-17 12:42:37 -07:00
Jiaming Yuan
48ac9b6cbe
[coll] Allreduce. ( #9679 )
2023-10-17 13:57:14 +08:00
Rong Ou
da6803b75b
Support column-wise data split with in-memory inputs ( #9628 )
---------
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-10-17 12:16:39 +08:00
James Lamb
eb562d3829
[CI] address cmakelint warnings about whitespace ( #9674 )
2023-10-14 12:46:07 +08:00
Jiaming Yuan
53049b16b8
[coll] Broadcast. ( #9659 )
2023-10-14 09:34:37 +08:00
Your Name
ea19555474
temp merge, disable 1 line, SetValid
2023-10-12 16:16:44 -07:00
Rong Ou
e164d51c43
Improve allgather functions ( #9649 )
2023-10-12 23:31:43 +08:00