1410 Commits

Author SHA1 Message Date
Jiaming Yuan
e206b899ef
Rework MAP and Pairwise for LTR. (#9075) 2023-04-28 02:39:12 +08:00
Rong Ou
511d4996b5
Rely on gRPC to generate random port (#9102) 2023-04-27 09:48:26 +08:00
Scott Gustafson
353ed5339d
Convert `DaskXGBClassifier.classes_` to an array (#8452)
---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-04-27 02:23:35 +08:00
Rong Ou
a320b402a5
More refactoring to take advantage of collective aggregators (#9081) 2023-04-26 03:36:09 +08:00
Philip Hyunsu Cho
a5cd2412de
Replace setup.py with pyproject.toml (#9021)
* Create pyproject.toml
* Implement a custom build backend (see below) in packager directory. Build logic from setup.py has been refactored and migrated into the new backend.
* Tested: pip wheel . (build wheel), python -m build --sdist . (source distribution)
2023-04-20 13:51:39 -07:00
Rong Ou
42d100de18
Make sure metrics work with federated learning (#9037) 2023-04-19 15:39:11 +08:00
Jiaming Yuan
ef13dd31b1
Rework the NDCG objective. (#9015) 2023-04-18 21:16:06 +08:00
Rong Ou
ba9d24ff7b
Make sure metrics work with column-wise distributed training (#9020) 2023-04-18 03:48:23 +08:00
amdsc21
f645cf51c1 Merge branch 'master' into sync-condition-2023Apr11 2023-04-17 18:33:00 +02:00
WeichenXu
191d0aa5cf
[spark] Make spark model have the same UID with its estimator (#9022)
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2023-04-14 02:53:30 +08:00
Philip Hyunsu Cho
8e0f320db3
[CI] Don't run CI automatically for dependabot (#9034) 2023-04-13 08:19:56 -07:00
amdsc21
843fdde61b sync Apr 11 2023 2023-04-11 20:03:25 +02:00
amdsc21
08bc4b0c0f Merge branch 'master' into sync-condition-2023Apr11 2023-04-11 19:38:38 +02:00
amdsc21
6825d986fd move Dockerfile to ci 2023-04-11 19:34:23 +02:00
Jiaming Yuan
fe9dff339c
Convert federated learner test into test suite. (#9018)
* Convert federated learner test into test suite.

- Add specialization to learning to rank.
2023-04-11 09:52:55 +08:00
Jiaming Yuan
2c8d735cb3
Fix tests with pandas 2.0. (#9014)
* Fix tests with pandas 2.0.

- `is_categorical` is replaced by `is_categorical_dtype`.
- one hot encoding returns boolean type instead of integer type.
2023-04-11 00:17:34 +08:00
Jiaming Yuan
1cf4d93246
Convert federated tests into test suite. (#9006)
- Add specialization for learning to rank.
2023-04-04 01:29:47 +08:00
Rong Ou
15e073ca9d
Make objectives work with vertical distributed and federated learning (#9002) 2023-04-03 17:07:42 +08:00
Jiaming Yuan
bac22734fb
Remove ntree limit in python package. (#8345)
- Remove `ntree_limit`. The parameter has been deprecated since 1.4.0.
- The SHAP package compatibility is broken.
2023-03-31 19:01:55 +08:00
Jiaming Yuan
d062a9e009
Define pair generation strategies for LTR. (#8984) 2023-03-30 12:00:35 +08:00
amdsc21
acad01afc9 sync Mar 29 2023-03-30 00:46:50 +02:00
Philip Hyunsu Cho
6676c28cbc
[CI] Fix Windows wheel to be compatible with Poetry (#8991)
* [CI] Fix Windows wheel to be compatible with Poetry

* Typo

* Eagerly scan globs to avoid patching same file twice
2023-03-28 21:32:54 -07:00
Rong Ou
ff26cd3212
More tests for column split and vertical federated learning (#8985)
Added some more tests for the learner and fit_stump, for both column-wise distributed learning and vertical federated learning.

Also moved the `IsRowSplit` and `IsColumnSplit` methods from the `DMatrix` to the `MetaInfo` since in some places we only have access to the `MetaInfo`. Added a new convenience method `IsVerticalFederatedLearning`.

Some refactoring of the testing fixtures.
2023-03-28 16:40:26 +08:00
amdsc21
06d9b998ce fix CAPI BuildInfo 2023-03-28 00:14:18 +02:00
amdsc21
c50cc424bc sync Mar 27 2023 2023-03-27 18:54:41 +02:00
Jiaming Yuan
401ce5cf5e
Run linters with the multi output demo. (#8966) 2023-03-28 00:47:28 +08:00
Jiaming Yuan
acc110c251
[MT-TREE] Support prediction cache and model slicing. (#8968)
- Fix prediction range.
- Support prediction cache in mt-hist.
- Support model slicing.
- Make the booster a Python iterable by defining `__iter__`.
- Cleanup removed/deprecated parameters.
- A new field in the output model `iteration_indptr` for pointing to the ranges of trees for each iteration.
2023-03-27 23:10:54 +08:00
Jiaming Yuan
c2b3a13e70
[breaking][skl] Remove parameter serialization. (#8963)
- Remove parameter serialization in the scikit-learn interface.

The scikit-lear interface `save_model` will save only the model and discard all
hyper-parameters. This is to align with the native XGBoost interface, which distinguishes
the hyper-parameter and model parameters.

With the scikit-learn interface, model parameters are attributes of the estimator. For
instance, `n_features_in_`, `n_classes_` are always accessible with
`estimator.n_features_in_` and `estimator.n_classes_`, but not with the
`estimator.get_params`.

- Define a `load_model` method for classifier to load its own attributes.

- Set n_estimators to None by default.
2023-03-27 21:34:10 +08:00
amdsc21
7ee4734d3a rm device_helpers.hip.h from cu 2023-03-26 00:24:11 +01:00
amdsc21
1474789787 add new file 2023-03-25 04:54:02 +01:00
amdsc21
7fbc561e17 initial merge 2023-03-25 04:31:55 +01:00
amdsc21
d97be6f396 enable last 3 tests 2023-03-25 04:05:05 +01:00
Jiaming Yuan
151882dd26
Initial support for multi-target tree. (#8616)
* Implement multi-target for hist.

- Add new hist tree builder.
- Move data fetchers for tests.
- Dispatch function calls in gbm base on the tree type.
2023-03-22 23:49:56 +08:00
Jiaming Yuan
ea04d4c46c
[doc] [dask] Troubleshooting NCCL errors. (#8943) 2023-03-22 22:17:26 +08:00
Jiaming Yuan
5891f752c8
Rework the MAP metric. (#8931)
- The new implementation is more strict as only binary labels are accepted. The previous implementation converts values greater than 1 to 1.
- Deterministic GPU. (no atomic add).
- Fix top-k handling.
- Precise definition of MAP. (There are other variants on how to handle top-k).
- Refactor GPU ranking tests.
2023-03-22 17:45:20 +08:00
Rong Ou
b240f055d3
Support vertical federated learning (#8932) 2023-03-22 14:25:26 +08:00
Jiaming Yuan
a093770f36
Partitioner for multi-target tree. (#8922) 2023-03-16 18:49:34 +08:00
Jiaming Yuan
26209a42a5
Define git attributes for renormalization. (#8921) 2023-03-16 02:43:11 +08:00
Jiaming Yuan
f186c87cf9
Check inf in data for all types of DMatrix. (#8911) 2023-03-15 11:24:35 +08:00
amdsc21
8207015e48 fix ../tests/cpp/common/test_span.h 2023-03-14 22:19:06 +01:00
Jiaming Yuan
72e8331eab
Reimplement the NDCG metric. (#8906)
- Add support for non-exp gain.
- Cache the DMatrix object to avoid re-calculating the IDCG.
- Make GPU implementation deterministic. (no atomic add)
2023-03-15 03:26:17 +08:00
Jiaming Yuan
8685556af2
Implement hist evaluator for multi-target tree. (#8908) 2023-03-15 01:42:51 +08:00
Jiaming Yuan
910ce580c8
Clear all cache after model load. (#8904) 2023-03-14 22:09:36 +08:00
Jiaming Yuan
c400fa1e8d
Predictor for vector leaf. (#8898) 2023-03-14 19:07:10 +08:00
amdsc21
a2bab03205 fix aft_obj.hip 2023-03-13 23:19:59 +01:00
Jiaming Yuan
8be6095ece
Implement NDCG cache. (#8893) 2023-03-13 22:16:31 +08:00
Jiaming Yuan
9bade7203a
Remove public access to tree model param. (#8902)
* Make tree model param a private member.
* Number of features and targets are immutable after construction.

This is to reduce the number of places where we can run configuration.
2023-03-13 20:55:10 +08:00
Jiaming Yuan
5ba3509dd3
Define multi expand entry. (#8895) 2023-03-13 19:31:05 +08:00
amdsc21
fa2336fcfd sort bug fix 2023-03-12 07:09:10 +01:00
Jiaming Yuan
3689695d16
[CI] Run RMM gtests. (#8900)
* [CI] Run RMM gtests.

* Update test-cpp-gpu.sh

---------

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2023-03-12 03:14:31 +08:00