1455 Commits

Author SHA1 Message Date
Philip Hyunsu Cho
ef8bdaa047
[CI] Update machine images (#9932) 2023-12-29 11:15:38 -08:00
Jiaming Yuan
a7226c0222
Fix feature names with special characters. (#9923) 2023-12-28 22:45:13 +08:00
Jiaming Yuan
6a5f6ba694
[CI] Add timeout for distributed GPU tests. (#9917) 2023-12-24 00:09:05 +08:00
Jiaming Yuan
1aa8c8d9be
Support more scipy types. (#9881) 2023-12-14 18:28:37 +08:00
Philip Hyunsu Cho
936b22fdf3
[CI] Upload libxgboost4j.dylib (M1) to S3 bucket (#9886)
* [CI] Upload libxgboost4j.dylib (M1) to S3 bucket

* Fix typo
2023-12-13 15:25:51 -08:00
github-actions[bot]
d530d37707
[CI] Update RAPIDS to latest stable (#9857) 2023-12-12 22:58:03 +09:00
Dmitry Razdoburdin
43897b8296
Sycl implementation for objective functions (#9846)
---------

Co-authored-by: Dmitry Razdoburdin <>
2023-12-12 14:41:50 +08:00
Jiaming Yuan
faf0f2df10
Support dataframe data format in native XGBoost. (#9828)
- Implement a columnar adapter.
- Refactor Python pandas handling code to avoid converting into a single numpy array.
- Add support in R for transforming columns.
- Support R data.frame and factor type.
2023-12-12 09:56:31 +08:00
Jiaming Yuan
42de9206fc
Support multi-target, fit intercept for hinge. (#9850) 2023-12-08 05:50:41 +08:00
Jiaming Yuan
4bc1f3a388
[R] Bump requirement to 4.3.0. (#9847) 2023-12-07 00:12:45 +08:00
Dmitry Razdoburdin
381f1d3dc9
Add support inference on SYCL devices (#9800)
---------

Co-authored-by: Dmitry Razdoburdin <>
Co-authored-by: Nikolay Petrov <nikolay.a.petrov@intel.com>
Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>
2023-12-04 16:15:57 +08:00
Jiaming Yuan
e78b46046e
[CI] Update R version on Linux. (#9835) 2023-12-02 11:03:17 +08:00
Jiaming Yuan
e9f149481e
[sklearn] Fix loading model attributes. (#9808) 2023-11-27 17:19:01 +08:00
Jiaming Yuan
3f4e22015a
Mark NCCL python test optional. (#9804)
Skip the tests if XGBoost is not compiled with dlopen.
2023-11-25 11:25:47 +08:00
Jiaming Yuan
8fe1a2213c
Cleanup code for distributed training. (#9805)
* Cleanup code for distributed training.

- Merge `GetNcclResult` into nccl stub.
- Split up utilities from the main dask module.
- Let Channel return `Result` to accommodate nccl channel.
- Remove old `use_label_encoder` parameter.
2023-11-25 09:10:56 +08:00
Jiaming Yuan
e9260de3f3
[breaking] Remove dense libsvm parser plugin. (#9799) 2023-11-23 00:12:39 +08:00
Jiaming Yuan
0715ab3c10
Use dlopen to load NCCL. (#9796)
This PR adds optional support for loading nccl with `dlopen` as an alternative of compile time linking. This is to address the size bloat issue with the PyPI binary release.
- Add CMake option to load `nccl` at runtime.
- Add an NCCL stub.

After this, `nccl` will be fetched from PyPI when using pip to install XGBoost, either by a user or by `pyproject.toml`. Others who want to link the nccl at compile time can continue to do so without any change.

At the moment, this is Linux only since we only support MNMG on Linux.
2023-11-22 19:27:31 +08:00
Jiaming Yuan
fedd9674c8
Implement column sampler in CUDA. (#9785)
- CUDA implementation.
- Extract the broadcasting logic, we will need the context parameter after revamping the collective implementation.
- Some changes to the event loop for fixing a deadlock in CI.
- Move argsort into algorithms.cuh, add support for cuda stream.
2023-11-17 04:29:08 +08:00
Bobby Wang
178cfe70a8
[pyspark][doc] Test and doc for stage-level scheduling. (#9786) 2023-11-16 18:15:59 +08:00
Jiaming Yuan
ada377c57e
[coll] Reduce the scope of lock in the event loop. (#9784) 2023-11-15 14:16:19 +08:00
Jiaming Yuan
6fd4a30667
[coll] Increase timeout for allgather test. (#9777) 2023-11-09 05:26:40 +08:00
Jiaming Yuan
44099f585d
[coll] Add C API for the tracker. (#9773) 2023-11-08 18:17:14 +08:00
Jiaming Yuan
06bdc15e9b
[coll] Pass context to various functions. (#9772)
* [coll] Pass context to various functions.

In the future, the `Context` object would be required for collective operations, this PR
passes the context object to some required functions to prepare for swapping out the
implementation.
2023-11-08 09:54:05 +08:00
Jiaming Yuan
6c0a190f6d
[coll] Add comm group. (#9759)
- Implement `CommGroup` for double dispatching.
- Small cleanup to tracker for handling abort.
2023-11-07 11:12:31 +08:00
Jiaming Yuan
c3a0622b49
Fix using categorical data with the score function of ranker. (#9753) 2023-11-07 07:29:11 +08:00
Jiaming Yuan
82828621d0
[doc] Add doc for linters and simplify c++ lint script. (#9750) 2023-11-07 05:03:30 +08:00
david-cortes
be20df8c23
[Python] Accept numpy generators as random_state (#9743)
* accept numpy generators for random_state

* make linter happy

* fix tests
2023-11-01 16:20:44 -07:00
Jiaming Yuan
4da4e092b5
[coll] Improvements and fixes for tracker and allreduce. (#9745)
- Allow the tracker to wait.
- Fix allreduce type cast
- Return args from the federated tracker.
2023-11-02 04:06:46 +08:00
Philip Hyunsu Cho
0ff8572737
[CI] Build libxgboost4j.dylib with CMAKE_OSX_DEPLOYMENT_TARGET (#9749) 2023-11-01 11:20:28 -07:00
Philip Hyunsu Cho
1b9ed4a4a1
[CI] Improve CI for Mac M1 (#9748)
* [CI] Improve CI for Mac M1

* Add -v flag

* Disable OpenMP in libxgboost4j.dylib

* Target MacOS 10.15+ to use C++17
2023-11-01 10:03:56 -07:00
Jiaming Yuan
bc995a4865
[coll] Add federated coll. (#9738)
- Define a new data type, the proto file is copied for now.
- Merge client and communicator into `FederatedColl`.
- Define CUDA variant.
- Migrate tests for CPU, add tests for CUDA.
2023-11-01 04:06:46 +08:00
Philip Hyunsu Cho
6b98305db4
[CI] Enable gmock in gtest (#9737) 2023-10-31 20:09:35 +08:00
Jiaming Yuan
80390e6cb6
[coll] Federated comm. (#9732) 2023-10-31 02:39:55 +08:00
Jiaming Yuan
6755179e77
[coll] Add nccl. (#9726) 2023-10-28 16:33:58 +08:00
Bobby Wang
1323531323
[pyspark] unify the way for determining whether runs on the GPU. (#9724) 2023-10-27 11:21:30 +08:00
Dmitry Razdoburdin
f41a08fda8
Add 'sycl' devices to the context (#9691)
Co-authored-by: Dmitry Razdoburdin <>
2023-10-26 22:17:56 +08:00
Jiaming Yuan
7a02facc9d
Serialize expand entry for allgather. (#9702) 2023-10-24 14:33:28 +08:00
Jiaming Yuan
3ca06ac51e
[doc] Mention data consistency for categorical features. (#9678) 2023-10-24 10:11:33 +08:00
Philip Hyunsu Cho
5e6cb63a56
[CI] Set up CI for Mac M1 (#9699) 2023-10-22 23:33:19 -07:00
Jiaming Yuan
b771f58453
[coll] Define interface for bridging. (#9695)
* Define the basic interface that will shared by nccl, federated and native.
2023-10-20 16:20:48 +08:00
Rong Ou
6fbe6248f4
More in-memory input support for column split (#9685) 2023-10-20 16:02:36 +08:00
Chuck Atkins
83cdf14b2c
CMake LTO and CUDA arch (#9677) 2023-10-20 13:01:37 +08:00
Philip Hyunsu Cho
3b86260b50
Fix build for AppleClang 11 (#9684) (#9693) 2023-10-18 12:27:21 -07:00
Jiaming Yuan
5d1bcde719
[coll] allgatherv. (#9688) 2023-10-19 03:13:50 +08:00
Jiaming Yuan
4c0e4422d0
[coll] allgather. (#9681) 2023-10-18 10:22:18 +08:00
Jiaming Yuan
48ac9b6cbe
[coll] Allreduce. (#9679) 2023-10-17 13:57:14 +08:00
Rong Ou
da6803b75b
Support column-wise data split with in-memory inputs (#9628)
---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-10-17 12:16:39 +08:00
James Lamb
eb562d3829
[CI] address cmakelint warnings about whitespace (#9674) 2023-10-14 12:46:07 +08:00
Jiaming Yuan
53049b16b8
[coll] Broadcast. (#9659) 2023-10-14 09:34:37 +08:00
Philip Hyunsu Cho
a5e07a01f8
[CI] Pull CentOS 7 images from NGC (#9666) 2023-10-13 12:11:54 +08:00