Jiaming Yuan
e9f149481e
[sklearn] Fix loading model attributes. ( #9808 )
2023-11-27 17:19:01 +08:00
Jiaming Yuan
3f4e22015a
Mark NCCL python test optional. ( #9804 )
...
Skip the tests if XGBoost is not compiled with dlopen.
2023-11-25 11:25:47 +08:00
Jiaming Yuan
0715ab3c10
Use dlopen to load NCCL. ( #9796 )
...
This PR adds optional support for loading NCCL with `dlopen` as an alternative to compile-time linking. This addresses the size bloat of the PyPI binary release.
- Add CMake option to load `nccl` at runtime.
- Add an NCCL stub.
After this, `nccl` will be fetched from PyPI when XGBoost is installed with pip, either by a user or through `pyproject.toml`. Those who want to link NCCL at compile time can continue to do so without any change.
At the moment, this is Linux only, since multi-node multi-GPU (MNMG) training is only supported on Linux.
2023-11-22 19:27:31 +08:00
Bobby Wang
178cfe70a8
[pyspark][doc] Test and doc for stage-level scheduling. ( #9786 )
2023-11-16 18:15:59 +08:00
Bobby Wang
1323531323
[pyspark] Unify the way of determining whether training runs on the GPU. ( #9724 )
2023-10-27 11:21:30 +08:00
Jiaming Yuan
c75a3bc0a9
[breaking] [jvm-packages] Remove rabit checkpoint. ( #9599 )
...
- Add `numBoostedRound` to the JVM packages.
- Remove rabit checkpoint version.
- Change the starting version of training continuation in JVM [breaking].
- Redefine the checkpoint version policy in jvm package. [breaking]
- Rename the Python checkpoint callback parameter. [breaking]
- Unify the checkpoint policy between Python and JVM.
2023-09-26 18:06:34 +08:00
Bobby Wang
6c791b5b47
[pyspark] support gpu transform ( #9542 )
...
---------
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-09-07 12:15:50 +08:00
Jiaming Yuan
302bbdc958
Mitigate flaky test with distributed L1 error. ( #9499 )
2023-08-22 13:46:35 +08:00
Sean Yang
12fe2fc06c
Fix federated learning demos and tests ( #9488 )
2023-08-16 15:25:05 +08:00
Jiaming Yuan
19b59938b7
Convert input to str for hypothesis note. ( #9480 )
2023-08-15 02:27:58 +08:00
Jiaming Yuan
bdc1a3c178
Fix pyspark parameter. ( #9460 )
...
- Don't pass the `use_gpu` parameter to the learner.
- Fix GPU approx with PySpark.
2023-08-11 19:07:50 +08:00
Jiaming Yuan
54029a59af
Bound the size of the histogram cache. ( #9440 )
...
- A new histogram collection with a limit in size.
- Unify histogram building logic between hist, multi-hist, and approx.
2023-08-08 03:21:26 +08:00
Hendrik Makait
f958e32683
Raise if expected workers are not alive in xgboost.dask.train ( #9421 )
2023-08-03 20:14:07 +08:00
Jiaming Yuan
e93a274823
Small cleanup for histogram routines. ( #9427 )
...
* Small cleanup for histogram routines.
- Extract hist train param from GPU hist.
- Make histogram const after construction.
- Unify parameter names.
2023-08-02 18:28:26 +08:00
Jiaming Yuan
912e341d57
Initial GPU support for the approx tree method. ( #9414 )
2023-07-31 15:50:28 +08:00
Jiaming Yuan
6e18d3a290
[pyspark] Handle the device parameter in pyspark. ( #9390 )
...
- Handle the new `device` parameter in PySpark.
- Deprecate the old `use_gpu` parameter.
2023-07-18 08:47:03 +08:00
Jiaming Yuan
16eb41936d
Handle the new device parameter in dask and demos. ( #9386 )
...
* Handle the new `device` parameter in dask and demos.
- Check no ordinal is specified in the dask interface.
- Update demos.
- Update dask doc.
- Update the condition for QDM.
2023-07-15 19:11:20 +08:00
Jiaming Yuan
04aff3af8e
Define the new device parameter. ( #9362 )
2023-07-13 19:30:25 +08:00
Jiaming Yuan
e964654b8f
[skl] Enable cat feature without specifying tree method. ( #9353 )
2023-07-03 22:06:17 +08:00
Jiaming Yuan
ea0deeca68
Disable dense optimization in hist for distributed training. ( #9272 )
2023-06-10 02:31:34 +08:00
Jiaming Yuan
3913ff470f
Import data lazily during tests. ( #9176 )
2023-05-23 03:58:31 +08:00
Jiaming Yuan
e206b899ef
Rework MAP and Pairwise for LTR. ( #9075 )
2023-04-28 02:39:12 +08:00
Scott Gustafson
353ed5339d
Convert `DaskXGBClassifier.classes_` to an array ( #8452 )
...
---------
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-04-27 02:23:35 +08:00
WeichenXu
191d0aa5cf
[spark] Make spark model have the same UID as its estimator ( #9022 )
...
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2023-04-14 02:53:30 +08:00
Jiaming Yuan
bac22734fb
Remove ntree limit in python package. ( #8345 )
...
- Remove `ntree_limit`. The parameter has been deprecated since 1.4.0.
- Compatibility with the SHAP package is broken.
2023-03-31 19:01:55 +08:00
Jiaming Yuan
151882dd26
Initial support for multi-target tree. ( #8616 )
...
* Implement multi-target for hist.
- Add new hist tree builder.
- Move data fetchers for tests.
- Dispatch function calls in gbm based on the tree type.
2023-03-22 23:49:56 +08:00
Jiaming Yuan
228a46e8ad
Support learning rate for zero-hessian objectives. ( #8866 )
2023-03-06 20:33:28 +08:00
Jiaming Yuan
6a892ce281
Specify src path for isort. ( #8867 )
2023-03-06 17:30:27 +08:00
mzzhang95
6cef9a08e9
[pyspark] Update eval_metric validation to support list of strings ( #8826 )
2023-03-02 08:24:12 +08:00
Jiaming Yuan
a2e433a089
Fix empty DMatrix with categorical features. ( #8739 )
2023-02-07 00:40:11 +08:00
Jiaming Yuan
0e61ba57d6
Fix GPU L1 error. ( #8749 )
2023-02-04 03:02:00 +08:00
Rong Ou
8af98e30fc
Use in-memory communicator to test quantile ( #8710 )
2023-01-27 23:28:28 +08:00
Jiaming Yuan
d6018eb4b9
Remove all use of DeviceQuantileDMatrix. ( #8665 )
2023-01-17 00:04:10 +08:00
Jiaming Yuan
e27cda7626
[CI] Skip pyspark sparse tests. ( #8675 )
2023-01-14 05:37:00 +08:00
Bobby Wang
72ec0c5484
[pyspark] support pred_contribs ( #8633 )
2023-01-11 16:51:12 +08:00
Jiaming Yuan
badeff1d74
Init estimation for regression. ( #8272 )
2023-01-11 02:04:56 +08:00
Jiaming Yuan
d308124910
Refactor PySpark tests. ( #8605 )
...
- Convert classifier tests to pytest tests.
- Replace hardcoded tests.
2023-01-04 17:05:16 +08:00
Jiaming Yuan
40343c8ee1
Test dask demos. ( #8557 )
...
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2022-12-13 18:37:31 +08:00
Jiaming Yuan
e143a4dd7e
[pyspark] Refactor local tests. ( #8525 )
...
- Use pytest fixture for spark session.
- Replace hardcoded results.
2022-12-05 23:49:54 +08:00
Bobby Wang
8e41ad24f5
[pyspark] sort qid for SparkRanker ( #8497 )
...
* [pyspark] sort qid for SparkRanker
* resolve comments
2022-12-01 16:40:35 -08:00
Bobby Wang
2dde65f807
[ci] reduce pyspark test time ( #8324 )
2022-11-21 16:58:00 +08:00
Jiaming Yuan
0d3da9869c
Require isort on all Python files. ( #8420 )
2022-11-08 12:59:06 +08:00
Jiaming Yuan
cfd2a9f872
Extract dask and spark tests into distributed tests. ( #8395 )
...
- Move test files.
- Run spark and dask separately to prevent conflicts.
- Gather common code into the testing module.
2022-10-28 16:24:32 +08:00