Jiaming Yuan
68a8865bc5
[CI] Fix PyLint errors. ( #10837 )
2024-09-24 14:09:32 +08:00
Jiaming Yuan
d5e1c41b69
[coll] Use loky for rabit op tests. ( #10828 )
2024-09-20 16:46:05 +08:00
Jiaming Yuan
34937fea41
[EM] Python wrapper for the ExtMemQuantileDMatrix. ( #10762 )
Not exposed in the documentation yet.
- Add C API.
- Add Python API.
- Basic CPU tests.
2024-08-29 04:08:25 +08:00
Jiaming Yuan
4fe67f10b4
[EM] Have one partitioner for each batch. ( #10760 )
- Initialize one partitioner for each batch.
- Collect partition size during initialization.
- Support base ridx in the finalization.
2024-08-29 01:35:17 +08:00
Jiaming Yuan
d6ebcfb032
[EM] Support CPU quantile objective for external memory. ( #10751 )
2024-08-27 04:16:57 +08:00
Jiaming Yuan
9b88495840
[multi] Implement weight feature importance. ( #10700 )
2024-08-22 02:06:47 +08:00
Jiaming Yuan
2258bc870d
Add more tests and doc for QDM. ( #10692 )
2024-08-16 23:30:04 +08:00
Jiaming Yuan
3d8107adb8
Support doc link for the sklearn module. ( #10287 )
2024-08-06 02:35:32 +08:00
Jiaming Yuan
a269055b2b
[coll] Use loky for tests. ( #10676 )
This makes the tests easier to run and debug. In addition, they now work on Windows as well.
2024-08-03 07:33:42 +08:00
Jiaming Yuan
827d0e8edb
[breaking] Bump Python requirement to 3.10. ( #10434 )
- Bump the Python requirement.
- Fix type hints.
- Use loky to avoid deadlock.
- Work around a cupy-numpy compatibility issue on Windows caused by the `safe` casting rule.
- Simplify the repartitioning logic to avoid dask errors.
2024-07-30 17:31:06 +08:00
Jiaming Yuan
e8a962575a
[EM] Allow staging ellpack on host for GPU external memory. ( #10488 )
- New parameter `on_host`.
- Abstract format creation and stream creation into policy classes.
2024-06-28 04:42:18 +08:00
Jiaming Yuan
824fba783e
Remove support for deprecated format in Python. ( #10490 )
2024-06-27 11:31:53 +08:00
Jiaming Yuan
b4cc350ec5
Fix categorical data with external memory. ( #10433 )
2024-06-18 04:34:54 +08:00
Philip Hyunsu Cho
bc3747bdce
[CI] Migrate to rockylinux8 / manylinux_2_28_x86_64 ( #10399 )
* [CI] Migrate to rockylinux8 / manylinux_2_28_x86_64
* Scrub all references to CentOS 7
* Fix
* Remove use of yum
* Use gcc-10 in cpu
* Temporarily disable -Werror
* Use GCC 9 for now
* Roll back gRPC
* Scrub all references to manylinux2014_x86_64
* Revise rename_whl.py to handle no-op rename
* Change JDK_VERSION back to 8
* Reviewer's comment
* Use GCC 10
* Use Spark 3.5.1, same as in pom.xml
* Fix JAR install
2024-06-17 12:07:49 -07:00
Jiaming Yuan
6c83c8c2ef
Allow blocking launch of federated tracker. ( #10414 )
---------
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2024-06-16 01:43:53 +08:00
Jiaming Yuan
d2d01d977a
Remove unnecessary fetch operations in external memory. ( #10342 )
2024-05-31 13:16:40 +08:00
Jiaming Yuan
a5a58102e5
Revamp the rabit implementation. ( #10112 )
This PR replaces the original RABIT implementation with a new one, which has already been partially merged into XGBoost. The new one features:
- Federated learning for both CPU and GPU.
- NCCL.
- More data types.
- A unified interface for all the underlying implementations.
- Improved timeout handling for both tracker and workers.
- Exhaustive tests with metrics (fixed a couple of bugs along the way).
- A reusable tracker for Python and JVM packages.
2024-05-20 11:56:23 +08:00
Jiaming Yuan
ca1d04bcb7
Release data in cache. ( #10286 )
2024-05-14 14:20:19 +08:00
Jiaming Yuan
f1f69ff10e
[CI] Fixes for using the latest modin. ( #10285 )
2024-05-14 12:13:35 +08:00
Jiaming Yuan
d81e319e78
Fixes for the latest pandas. ( #10266 )
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2024-05-12 11:15:46 +08:00
Jiaming Yuan
73afef1a6e
Fixes for numpy 2.0. ( #10252 )
2024-05-07 03:54:32 +08:00
Jiaming Yuan
837d44a345
Support more sklearn tags for testing. ( #10230 )
2024-04-29 06:33:23 +08:00
Jiaming Yuan
1450aebb74
Fix pairwise objective with NDCG metric along with custom gain. ( #10100 )
* Fix pairwise objective with NDCG metric.
- Allow setting `ndcg_exp_gain` for `rank:pairwise`.
This is useful when using pairwise as the objective but NDCG as the metric.
2024-03-11 14:54:10 +08:00
Jiaming Yuan
e14c3b9325
Optional normalization for learning to rank. ( #10094 )
2024-03-08 12:41:21 +08:00
Jiaming Yuan
3941b31ade
Disable column sample by node for the exact tree method. ( #10083 )
2024-03-01 14:16:10 +08:00
Jiaming Yuan
8ea705e4d5
Support sample weight in sklearn custom objective. ( #10050 )
2024-02-21 00:43:14 +08:00
Jiaming Yuan
69a17d5114
Fix with None input. ( #10052 )
2024-02-20 22:34:22 +08:00
Louis Desreumaux
edf501d227
Implement contribution prediction with QuantileDMatrix ( #10043 )
---------
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2024-02-19 21:03:29 +08:00
Jiaming Yuan
54b71c8fba
Fix with black 24.1.1. ( #10014 )
2024-01-30 17:24:11 +08:00
Jiaming Yuan
65d7bf2dfe
Handle np integer in model slice and prediction. ( #10007 )
2024-01-26 04:58:48 +08:00
Jiaming Yuan
d12cc1090a
Refactor tests for training continuation. ( #9997 )
2024-01-24 16:07:19 +08:00
Jiaming Yuan
0798e36d73
[breaking] Remove deprecated parameters in the skl interface. ( #9986 )
2024-01-15 20:40:05 +08:00
Jiaming Yuan
2f57bbde3c
Additional tests for attributes and model boosted rounds. ( #9962 )
2024-01-09 09:54:39 +08:00
Jiaming Yuan
b3eb5d0945
Use UBJ in Python checkpoint. ( #9958 )
2024-01-09 03:22:15 +08:00
Jiaming Yuan
9a30bdd313
Test loading models with invalid file extensions. ( #9955 )
2024-01-08 19:26:24 +08:00
Jiaming Yuan
38dd91f491
Save model in ubj as the default. ( #9947 )
2024-01-05 17:53:36 +08:00
Jiaming Yuan
c03a4d5088
Check support status for categorical features. ( #9946 )
2024-01-04 16:51:33 +08:00
Jiaming Yuan
621348abb3
Fix multi-output with alternating strategies. ( #9933 )
---------
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2024-01-04 16:41:13 +08:00
Jiaming Yuan
5f7b5a6921
Add tests for pickling with custom obj and metric. ( #9943 )
2024-01-04 14:52:48 +08:00
Jiaming Yuan
9f73127a23
Cleanup Python GPU tests. ( #9934 )
* Cleanup Python GPU tests.
- Remove the use of `gpu_hist` and `gpu_id` in cudf/cupy tests.
- Move base margin test into the testing directory.
2024-01-04 13:15:18 +08:00
Jiaming Yuan
a7226c0222
Fix feature names with special characters. ( #9923 )
2023-12-28 22:45:13 +08:00
Jiaming Yuan
1aa8c8d9be
Support more scipy types. ( #9881 )
2023-12-14 18:28:37 +08:00
Jiaming Yuan
faf0f2df10
Support dataframe data format in native XGBoost. ( #9828 )
- Implement a columnar adapter.
- Refactor Python pandas handling code to avoid converting into a single numpy array.
- Add support in R for transforming columns.
- Support R data.frame and factor type.
2023-12-12 09:56:31 +08:00
Jiaming Yuan
e9f149481e
[sklearn] Fix loading model attributes. ( #9808 )
2023-11-27 17:19:01 +08:00
Jiaming Yuan
c3a0622b49
Fix using categorical data with the score function of ranker. ( #9753 )
2023-11-07 07:29:11 +08:00
david-cortes
be20df8c23
[Python] Accept numpy generators as random_state ( #9743 )
* accept numpy generators for random_state
* make linter happy
* fix tests
2023-11-01 16:20:44 -07:00
Jiaming Yuan
3ca06ac51e
[doc] Mention data consistency for categorical features. ( #9678 )
2023-10-24 10:11:33 +08:00
Rong Ou
6fbe6248f4
More in-memory input support for column split ( #9685 )
2023-10-20 16:02:36 +08:00
Rong Ou
da6803b75b
Support column-wise data split with in-memory inputs ( #9628 )
---------
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-10-17 12:16:39 +08:00
Jiaming Yuan
60526100e3
Support arrow through pandas ext types. ( #9612 )
- Use pandas extension type for pyarrow support.
- Additional support for QDM.
- Additional support for inplace_predict.
2023-09-28 17:00:16 +08:00