Jiaming Yuan
44099f585d
[coll] Add C API for the tracker. ( #9773 )
2023-11-08 18:17:14 +08:00
Jiaming Yuan
06bdc15e9b
[coll] Pass context to various functions. ( #9772 )
...
* [coll] Pass context to various functions.
In the future, the `Context` object will be required for collective operations. This PR
passes the context object to the functions that need it, in preparation for swapping out
the implementation.
2023-11-08 09:54:05 +08:00
Jiaming Yuan
6c0a190f6d
[coll] Add comm group. ( #9759 )
...
- Implement `CommGroup` for double dispatching.
- Small cleanup to tracker for handling abort.
2023-11-07 11:12:31 +08:00
Jiaming Yuan
4da4e092b5
[coll] Improvements and fixes for tracker and allreduce. ( #9745 )
...
- Allow the tracker to wait.
- Fix allreduce type cast
- Return args from the federated tracker.
2023-11-02 04:06:46 +08:00
Jiaming Yuan
bc995a4865
[coll] Add federated coll. ( #9738 )
...
- Define a new data type, the proto file is copied for now.
- Merge client and communicator into `FederatedColl`.
- Define CUDA variant.
- Migrate tests for CPU, add tests for CUDA.
2023-11-01 04:06:46 +08:00
Philip Hyunsu Cho
6b98305db4
[CI] Enable gmock in gtest ( #9737 )
2023-10-31 20:09:35 +08:00
Jiaming Yuan
80390e6cb6
[coll] Federated comm. ( #9732 )
2023-10-31 02:39:55 +08:00
Jiaming Yuan
6755179e77
[coll] Add nccl. ( #9726 )
2023-10-28 16:33:58 +08:00
Dmitry Razdoburdin
f41a08fda8
Add 'sycl' devices to the context ( #9691 )
...
Co-authored-by: Dmitry Razdoburdin <>
2023-10-26 22:17:56 +08:00
Jiaming Yuan
7a02facc9d
Serialize expand entry for allgather. ( #9702 )
2023-10-24 14:33:28 +08:00
Philip Hyunsu Cho
5e6cb63a56
[CI] Set up CI for Mac M1 ( #9699 )
2023-10-22 23:33:19 -07:00
Jiaming Yuan
b771f58453
[coll] Define interface for bridging. ( #9695 )
...
* Define the basic interface that will be shared by nccl, federated and native.
2023-10-20 16:20:48 +08:00
Philip Hyunsu Cho
3b86260b50
Fix build for AppleClang 11 ( #9684 ) ( #9693 )
2023-10-18 12:27:21 -07:00
Jiaming Yuan
5d1bcde719
[coll] allgatherv. ( #9688 )
2023-10-19 03:13:50 +08:00
Jiaming Yuan
4c0e4422d0
[coll] allgather. ( #9681 )
2023-10-18 10:22:18 +08:00
Jiaming Yuan
48ac9b6cbe
[coll] Allreduce. ( #9679 )
2023-10-17 13:57:14 +08:00
Rong Ou
da6803b75b
Support column-wise data split with in-memory inputs ( #9628 )
...
---------
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-10-17 12:16:39 +08:00
James Lamb
eb562d3829
[CI] address cmakelint warnings about whitespace ( #9674 )
2023-10-14 12:46:07 +08:00
Jiaming Yuan
53049b16b8
[coll] Broadcast. ( #9659 )
2023-10-14 09:34:37 +08:00
Rong Ou
e164d51c43
Improve allgather functions ( #9649 )
2023-10-12 23:31:43 +08:00
Jiaming Yuan
946ae1c440
[coll] Implement a new tracker and a communicator. ( #9650 )
...
* [coll] Implement a new tracker and a communicator.
The new tracker and communicators communicate through JSON documents. In addition,
communicators are aware of each other.
2023-10-12 12:49:16 +08:00
James Lamb
2e42f33fc1
[CI] standardize else() and endfunction() calls in CMake scripts ( #9653 )
2023-10-12 11:14:19 +08:00
Rong Ou
0ecb4de963
[breaking] Change DMatrix construction to be distributed ( #9623 )
...
* Change column-split DMatrix construction to be distributed
* remove splitting code for row split
2023-10-10 23:35:57 +08:00
Jiaming Yuan
b14e535e78
[Coll] Implement get host address in libxgboost. ( #9644 )
...
- Port `xgboost.tracker.get_host_ip` in C++.
2023-10-10 10:01:14 +08:00
Jiaming Yuan
680d53db43
Extract JSON utils. ( #9645 )
2023-10-10 07:15:14 +08:00
James Lamb
db8d117f7e
[CI] standardize endif() calls in CMake scripts ( #9637 )
2023-10-08 11:45:20 +08:00
Rong Ou
3f2093fb81
Test monotone constraints with column split ( #9613 )
2023-09-28 04:54:53 +08:00
Rong Ou
d6d14d0fb9
Integration tests for interaction constraints with column-wise data split ( #9611 )
2023-09-27 08:27:43 +08:00
Rong Ou
290b17ffda
Test column sampler with column-wise data split ( #9609 )
2023-09-26 13:31:23 +08:00
Rong Ou
def77870f3
Test categorical features with column-split gpu quantile ( #9595 )
2023-09-23 09:55:09 +08:00
Jiaming Yuan
8c676c889d
Remove internal use of gpu_id. ( #9568 )
2023-09-20 23:29:51 +08:00
Jiaming Yuan
38ac52dd87
Build a simple event loop for collective. ( #9593 )
2023-09-20 02:09:07 +08:00
Rong Ou
d8c3cc92ae
More support for column split in gpu predictor ( #9562 )
2023-09-14 08:13:13 +08:00
Jiaming Yuan
300f9ace06
Fix default metric configuration. ( #9575 )
2023-09-13 13:05:47 -07:00
Jiaming Yuan
b438d684d2
Utilities and cleanups for socket. ( #9576 )
...
- Use C++17 `[[nodiscard]]` and nested namespaces.
- Add bind method to socket.
- Remove rabit parameters.
2023-09-14 01:41:42 +08:00
Rong Ou
66a0832778
Add tests for gpu_approx ( #9553 )
2023-09-07 17:21:58 +08:00
Jiaming Yuan
adea842c83
Fix inplace predict with fallback when base margin is used. ( #9536 )
...
- Copy meta info from proxy DMatrix.
- Use `std::call_once` to emit fewer warnings.
2023-09-05 01:04:24 +08:00
Rong Ou
c928dd4ff5
Support vertical federated learning with gpu_hist ( #9539 )
2023-09-03 11:37:11 +08:00
Rong Ou
9bab06cbca
Support column split in gpu hist updater ( #9384 )
2023-08-31 18:09:35 +08:00
Jiaming Yuan
ccfc90e4c6
[rabit] Improved connection handling. ( #9531 )
...
- Enable timeout.
- Report connection error from the system.
- Handle retry for both tracker connection and peer connection.
2023-08-30 13:00:04 +08:00
Jiaming Yuan
ddf2e68821
Use the new DeviceOrd in the linalg module. ( #9527 )
2023-08-29 13:37:29 +08:00
Jiaming Yuan
972730cde0
Use matrix for gradient. ( #9508 )
...
- Use the `linalg::Matrix` for storing gradients.
- New API for the custom objective.
- Custom objective for multi-class/multi-target is now required to return the correct shape.
- Custom objective for Python can accept arrays with any strides (row-major or column-major).
2023-08-24 05:29:52 +08:00
Rong Ou
6103dca0bb
Support column split in GPU evaluate splits ( #9511 )
2023-08-23 16:33:43 +08:00
Jiaming Yuan
3c09399f29
Fix device dispatch for linear updater. ( #9507 )
2023-08-23 00:17:35 +08:00
Jiaming Yuan
044fea1281
Drop support for loading remote files. ( #9504 )
2023-08-21 23:34:05 +08:00
Jiaming Yuan
1caa93221a
Use realloc for histogram cache and expose the cache limit. ( #9455 )
2023-08-10 14:05:27 +08:00
Jiaming Yuan
f05a23b41c
Use weakref instead of id for DataIter cache. ( #9445 )
...
- Fix case where Python reuses id from freed objects.
- Small optimization to column matrix with QDM by using `realloc` instead of copying data.
2023-08-10 00:40:06 +08:00
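The `id` reuse problem mentioned in the commit above can be illustrated with a minimal sketch. It is not the code from the PR: `DataIter` here is an empty stand-in class, and `WeakCache` and `build` are hypothetical names introduced only for this example. The idea is that CPython may hand the `id` of a freed object to a new object, so a cache keyed on `id` can return stale data, while a weak reference stops matching once its target is gone.

```python
import weakref


class DataIter:
    """Hypothetical stand-in for an iterator object (not the xgboost class)."""


class WeakCache:
    """Single-entry cache keyed on object identity through a weak reference."""

    def __init__(self):
        self._ref = None    # weak reference to the object the entry belongs to
        self._value = None  # cached value for that object

    def get(self, key_obj, build):
        # Hit only when the weak reference still resolves to this exact object.
        # A dead reference resolves to None, so a new object can never match,
        # even if it happens to reuse the id() of the freed one.
        if self._ref is not None and self._ref() is key_obj:
            return self._value
        self._ref = weakref.ref(key_obj)
        self._value = build(key_obj)
        return self._value


build_count = 0


def build(_obj):
    """Pretend to build an expensive value; counts how often it runs."""
    global build_count
    build_count += 1
    return build_count


cache = WeakCache()

first = DataIter()
hit_a = cache.get(first, build)  # miss: value is built
hit_b = cache.get(first, build)  # hit: same live object
del first                        # the cache holds no strong reference to it

second = DataIter()              # may or may not reuse the old id()
miss = cache.get(second, build)  # miss either way: value is rebuilt
```

Keying on `id(key_obj)` instead would make the last call a false hit whenever the interpreter reuses the address of the freed object.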
Philip Hyunsu Cho
7ce090e775
Handle UTF-8 paths correctly on Windows platform ( #9443 )
...
* Fix round-trip serialization with UTF-8 paths
* Add compiler version check
* Add comment to C API functions
* Add Python tests
* [CI] Update MacOS deployment target
* Use std::filesystem instead of dmlc::TemporaryDirectory
2023-08-07 23:27:25 -07:00
Jiaming Yuan
54029a59af
Bound the size of the histogram cache. ( #9440 )
...
- A new histogram collection with a limit in size.
- Unify histogram building logic between hist, multi-hist, and approx.
2023-08-08 03:21:26 +08:00
Rong Ou
bde1ebc209
Switch back to the GPUIDX macro ( #9438 )
2023-08-04 15:14:31 +08:00