Rong Ou
|
e164d51c43
|
Improve allgather functions (#9649)
|
2023-10-12 23:31:43 +08:00 |
|
Jiaming Yuan
|
b438d684d2
|
Utilities and cleanups for socket. (#9576)
- Use C++17 nodiscard and nested namespaces.
- Add bind method to socket.
- Remove rabit parameters.
|
2023-09-14 01:41:42 +08:00 |
|
Rong Ou
|
cbf98cb9c6
|
Add Allgather to collective communicator (#8765)
|
2023-02-09 11:31:22 +08:00 |
|
Rong Ou
|
a8255ea678
|
Add an in-memory collective communicator (#8494)
|
2022-12-01 00:24:12 +08:00 |
|
Rong Ou
|
39afdac3be
|
Better error message when world size and rank are set as strings (#8316)
Co-authored-by: jiamingy <jm.yuan@outlook.com>
|
2022-10-12 15:53:25 +08:00 |
|
Rong Ou
|
8d4038da57
|
Don't split input data in federated mode (#8279)
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
|
2022-10-05 18:19:28 -08:00 |
|
Rong Ou
|
668b8a0ea4
|
[Breaking] Switch from rabit to the collective communicator (#8257)
* fix size_t specialization
* really fix size_t
* try again
* add include
* more include
* fix lint errors
* remove rabit includes
* fix pylint error
* return dict from communicator context
* fix communicator shutdown
* fix dask test
* reset communicator mocklist
* fix distributed tests
* do not save device communicator
* fix jvm gpu tests
* add python test for federated communicator
* Update gputreeshap submodule
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
|
2022-10-05 14:39:01 -08:00 |
|
Rong Ou
|
a2686543a9
|
Common interface for collective communication (#8057)
* implement broadcast for federated communicator
* implement allreduce
* add communicator factory
* add device adapter
* add device communicator to factory
* add rabit communicator
* add rabit communicator to the factory
* add nccl device communicator
* add synchronize to device communicator
* add back print and getprocessorname
* add python wrapper and c api
* clean up types
* fix non-gpu build
* try to fix ci
* fix std::size_t
* portable string compare ignore case
* c style size_t
* fix lint errors
* cross platform setenv
* fix memory leak
* fix lint errors
* address review feedback
* add python test for rabit communicator
* fix failing gtest
* use json to configure communicators
* fix lint error
* get rid of factories
* fix cpu build
* fix include
* fix python import
* don't export collective.py yet
* skip collective communicator pytest on windows
* add review feedback
* update documentation
* remove mpi communicator type
* fix tests
* shutdown the communicator separately
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
|
2022-09-12 15:21:12 -07:00 |
|