52 Commits

Author SHA1 Message Date
Jiaming Yuan
73afef1a6e
Fixes for numpy 2.0. (#10252) 2024-05-07 03:54:32 +08:00
Philip Hyunsu Cho
edb945d59b
[CI] Use native arm64 worker in GHAction to build M1 wheel (#10225)
* Set up Conda

* Use mamba

* debug

* fix

* fix

* fix

* fix

* fix

* Temporarily disable other tests

* Fix prefix

* Use micromamba

* Use conda-incubator/setup-miniconda

* Use mambaforge

* Fix

* Fix prefix

* Don't use deprecated set-output

* Add verbose output from build

* verbose

* Specify arch

* Bump setup-miniconda to v3

* Use Python 3.9

* Restore deleted files

* WAR.

---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2024-04-26 10:16:55 -07:00
Bobby Wang
8fb05c8c95
[pyspark] support stage-level scheduling for yarn/k8s (#10209) 2024-04-20 00:24:40 +08:00
Jiaming Yuan
ca4801f81d
Work with IPv6 in the new tracker. (#10125) 2024-03-20 05:19:23 +08:00
Jiaming Yuan
8ea705e4d5
Support sample weight in sklearn custom objective. (#10050) 2024-02-21 00:43:14 +08:00
Jiaming Yuan
54b71c8fba
Fix with black 24.1.1. (#10014) 2024-01-30 17:24:11 +08:00
Jiaming Yuan
0798e36d73
[breaking] Remove deprecated parameters in the skl interface. (#9986) 2024-01-15 20:40:05 +08:00
Jiaming Yuan
b3eb5d0945
Use UBJ in Python checkpoint. (#9958) 2024-01-09 03:22:15 +08:00
Jiaming Yuan
6a5f6ba694
[CI] Add timeout for distributed GPU tests. (#9917) 2023-12-24 00:09:05 +08:00
Jiaming Yuan
e9f149481e
[sklearn] Fix loading model attributes. (#9808) 2023-11-27 17:19:01 +08:00
Jiaming Yuan
3f4e22015a
Mark NCCL python test optional. (#9804)
Skip the tests if XGBoost is not compiled with dlopen.
2023-11-25 11:25:47 +08:00
Jiaming Yuan
0715ab3c10
Use dlopen to load NCCL. (#9796)
This PR adds optional support for loading NCCL with `dlopen` as an alternative to compile-time linking. This addresses the size bloat of the PyPI binary release.
- Add CMake option to load `nccl` at runtime.
- Add an NCCL stub.

After this, `nccl` will be fetched from PyPI when XGBoost is installed with pip, either by a user or through `pyproject.toml`. Those who want to link NCCL at compile time can continue to do so without any change.

At the moment, this is Linux-only, since multi-node multi-GPU (MNMG) training is only supported on Linux.
2023-11-22 19:27:31 +08:00
Bobby Wang
178cfe70a8
[pyspark][doc] Test and doc for stage-level scheduling. (#9786) 2023-11-16 18:15:59 +08:00
Bobby Wang
1323531323
[pyspark] unify the way of determining whether it runs on the GPU. (#9724) 2023-10-27 11:21:30 +08:00
Jiaming Yuan
c75a3bc0a9
[breaking] [jvm-packages] Remove rabit check point. (#9599)
- Add `numBoostedRound` to jvm packages
- Remove rabit checkpoint version.
- Change the starting version of training continuation in JVM [breaking].
- Redefine the checkpoint version policy in jvm package. [breaking]
- Rename the Python checkpoint callback parameter. [breaking]
- Unify the checkpoint policy between Python and JVM.
2023-09-26 18:06:34 +08:00
Bobby Wang
6c791b5b47
[pyspark] support gpu transform (#9542)
---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-09-07 12:15:50 +08:00
Jiaming Yuan
302bbdc958
mitigate flaky test with distributed l1 error. (#9499) 2023-08-22 13:46:35 +08:00
Sean Yang
12fe2fc06c
Fix federated learning demos and tests (#9488) 2023-08-16 15:25:05 +08:00
Jiaming Yuan
19b59938b7
Convert input to str for hypothesis note. (#9480) 2023-08-15 02:27:58 +08:00
Jiaming Yuan
bdc1a3c178
Fix pyspark parameter. (#9460)
- Don't pass the `use_gpu` parameter to the learner.
- Fix GPU approx with PySpark.
2023-08-11 19:07:50 +08:00
Jiaming Yuan
54029a59af
Bound the size of the histogram cache. (#9440)
- A new histogram collection with a limit in size.
- Unify histogram building logic between hist, multi-hist, and approx.
2023-08-08 03:21:26 +08:00
Hendrik Makait
f958e32683
Raise if expected workers are not alive in xgboost.dask.train (#9421) 2023-08-03 20:14:07 +08:00
Jiaming Yuan
e93a274823
Small cleanup for histogram routines. (#9427)
- Extract hist train param from GPU hist.
- Make histogram const after construction.
- Unify parameter names.
2023-08-02 18:28:26 +08:00
Jiaming Yuan
912e341d57
Initial GPU support for the approx tree method. (#9414) 2023-07-31 15:50:28 +08:00
Jiaming Yuan
6e18d3a290
[pyspark] Handle the device parameter in pyspark. (#9390)
- Handle the new `device` parameter in PySpark.
- Deprecate the old `use_gpu` parameter.
2023-07-18 08:47:03 +08:00
Jiaming Yuan
16eb41936d
Handle the new device parameter in dask and demos. (#9386)
- Check no ordinal is specified in the dask interface.
- Update demos.
- Update dask doc.
- Update the condition for QDM.
2023-07-15 19:11:20 +08:00
Jiaming Yuan
04aff3af8e
Define the new device parameter. (#9362) 2023-07-13 19:30:25 +08:00
Jiaming Yuan
e964654b8f
[skl] Enable cat feature without specifying tree method. (#9353) 2023-07-03 22:06:17 +08:00
Jiaming Yuan
ea0deeca68
Disable dense optimization in hist for distributed training. (#9272) 2023-06-10 02:31:34 +08:00
Jiaming Yuan
3913ff470f
Import data lazily during tests. (#9176) 2023-05-23 03:58:31 +08:00
Jiaming Yuan
e206b899ef
Rework MAP and Pairwise for LTR. (#9075) 2023-04-28 02:39:12 +08:00
Scott Gustafson
353ed5339d
Convert `DaskXGBClassifier.classes_` to an array (#8452)
---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-04-27 02:23:35 +08:00
WeichenXu
191d0aa5cf
[spark] Make spark model have the same UID as its estimator (#9022)
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2023-04-14 02:53:30 +08:00
Jiaming Yuan
bac22734fb
Remove ntree limit in python package. (#8345)
- Remove `ntree_limit`. The parameter has been deprecated since 1.4.0.
- The SHAP package compatibility is broken.
2023-03-31 19:01:55 +08:00
Jiaming Yuan
151882dd26
Initial support for multi-target tree. (#8616)
* Implement multi-target for hist.

- Add new hist tree builder.
- Move data fetchers for tests.
- Dispatch function calls in gbm based on the tree type.
2023-03-22 23:49:56 +08:00
Jiaming Yuan
228a46e8ad
Support learning rate for zero-hessian objectives. (#8866) 2023-03-06 20:33:28 +08:00
Jiaming Yuan
6a892ce281
Specify src path for isort. (#8867) 2023-03-06 17:30:27 +08:00
mzzhang95
6cef9a08e9
[pyspark] Update eval_metric validation to support list of strings (#8826) 2023-03-02 08:24:12 +08:00
Jiaming Yuan
a2e433a089
Fix empty DMatrix with categorical features. (#8739) 2023-02-07 00:40:11 +08:00
Jiaming Yuan
0e61ba57d6
Fix GPU L1 error. (#8749) 2023-02-04 03:02:00 +08:00
Rong Ou
8af98e30fc
Use in-memory communicator to test quantile (#8710) 2023-01-27 23:28:28 +08:00
Jiaming Yuan
d6018eb4b9
Remove all use of DeviceQuantileDMatrix. (#8665) 2023-01-17 00:04:10 +08:00
Jiaming Yuan
e27cda7626
[CI] Skip pyspark sparse tests. (#8675) 2023-01-14 05:37:00 +08:00
Bobby Wang
72ec0c5484
[pyspark] support pred_contribs (#8633) 2023-01-11 16:51:12 +08:00
Jiaming Yuan
badeff1d74
Init estimation for regression. (#8272) 2023-01-11 02:04:56 +08:00
Jiaming Yuan
d308124910
Refactor PySpark tests. (#8605)
- Convert classifier tests to pytest tests.
- Replace hardcoded tests.
2023-01-04 17:05:16 +08:00
Jiaming Yuan
40343c8ee1
Test dask demos. (#8557)
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2022-12-13 18:37:31 +08:00
Jiaming Yuan
e143a4dd7e
[pyspark] Refactor local tests. (#8525)
- Use pytest fixture for spark session.
- Replace hardcoded results.
2022-12-05 23:49:54 +08:00
Bobby Wang
8e41ad24f5
[pyspark] sort qid for SparkRanker (#8497)
* [pyspark] sort qid for SparkRanker

* resolve comments
2022-12-01 16:40:35 -08:00
Bobby Wang
2dde65f807
[ci] reduce pyspark test time (#8324) 2022-11-21 16:58:00 +08:00