943 Commits

Author SHA1 Message Date
Jiaming Yuan
fcae6301ec
[dask] Disable broadcast in the scatter call. (#10632) 2024-07-25 04:16:34 +08:00
Jiaming Yuan
0846ad860c
Optionally skip cupy on windows. (#10611) 2024-07-20 22:12:12 +08:00
Philip Hyunsu Cho
326921dbe4
[CI] Build a CPU-only wheel under name xgboost-cpu (#10603) 2024-07-19 10:51:08 -07:00
david-cortes
8d0f2bfbaa
[doc] Add more detailed explanations for advanced objectives (#10283)
---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2024-07-08 19:17:31 +08:00
Jiaming Yuan
00264eb72b
[EM] Basic distributed test for external memory. (#10492) 2024-07-06 01:15:20 +08:00
Jiaming Yuan
e537b0969f
Fix boolean array for arrow-backed DF. (#10527) 2024-07-02 17:02:54 +08:00
Jiaming Yuan
a39fef2c67
[fed] Fixes for the encrypted GRPC backend. (#10503) 2024-07-02 15:15:12 +08:00
Jiaming Yuan
e8a962575a
[EM] Allow staging ellpack on host for GPU external memory. (#10488)
- New parameter `on_host`.
- Abstract format creation and stream creation into policy classes.
2024-06-28 04:42:18 +08:00
Jiaming Yuan
824fba783e
Remove support for deprecated format in Python. (#10490) 2024-06-27 11:31:53 +08:00
Jiaming Yuan
2d88d17008
Remove deprecated DeviceQuantileDMatrix. (#10491) 2024-06-27 11:30:51 +08:00
Philip Hyunsu Cho
9a8bb7d186
Require Pandas 1.2+ (#10476) 2024-06-22 14:15:22 -07:00
Philip Hyunsu Cho
bc3747bdce
[CI] Migrate to rockylinux8 / manylinux_2_28_x86_64 (#10399)
* [CI] Migrate to rockylinux8 / manylinux_2_28_x86_64
* Scrub all references to CentOS 7
* Fix
* Remove use of yum
* Use gcc-10 in cpu
* Temporarily disable -Werror
* Use GCC 9 for now
* Roll back gRPC
* Scrub all references to manylinux2014_x86_64
* Revise rename_whl.py to handle no-op rename
* Change JDK_VERSION back to 8
* Reviewer's comment
* Use GCC 10
* Use Spark 3.5.1, same as in pom.xml
* Fix JAR install
2024-06-17 12:07:49 -07:00
Jiaming Yuan
6c83c8c2ef
Allow blocking launch of federated tracker. (#10414)
---------

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2024-06-16 01:43:53 +08:00
Jiaming Yuan
bbff74d2ff
[dask] Workaround the tokenizer by changing the scatter function. (#10419)
---------

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2024-06-15 19:10:00 +08:00
Richard (Rick) Zamora
dc14f98f40
Avoid default tokenization in Dask (#10398)
---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2024-06-14 19:44:54 +08:00
Bobby Wang
cf0c1d0888
[pyspark] Avoid repartition. (#10408) 2024-06-12 02:26:10 +08:00
Christopher Tee
e0ebbc0746
[doc] Fix small typos (#10405) 2024-06-11 16:13:02 +08:00
Jiaming Yuan
9f6608d6aa
Add python 3.12 classifier. (#10381) 2024-06-04 18:02:59 +08:00
Jiaming Yuan
43a57c4a85
Bump development version to 2.2. (#10376) 2024-06-04 12:59:16 +08:00
Jiaming Yuan
979e392deb
Fix warnings in GPU dask tests. (#10358) 2024-06-04 12:58:58 +08:00
Jiaming Yuan
e6eefea5e2
[coll] Move the rabit poll helper. (#10349) 2024-05-31 08:02:21 +08:00
Philip Hyunsu Cho
324f2d4e4a
Handle float128 generically (#10322) 2024-05-30 20:14:39 +08:00
Jiaming Yuan
a5a58102e5
Revamp the rabit implementation. (#10112)
This PR replaces the original RABIT implementation with a new one, which has already been partially merged into XGBoost. The new one features:
- Federated learning for both CPU and GPU.
- NCCL.
- More data types.
- A unified interface for all the underlying implementations.
- Improved timeout handling for both tracker and workers.
- Exhaustive tests with metrics (fixed a couple of bugs along the way).
- A reusable tracker for Python and JVM packages.
2024-05-20 11:56:23 +08:00
Jiaming Yuan
ba9b4cb1ee
Fix pylint. (#10296) 2024-05-17 13:28:39 +08:00
Jiaming Yuan
ca1d04bcb7
Release data in cache. (#10286) 2024-05-14 14:20:19 +08:00
Jiaming Yuan
d81e319e78
Fixes for the latest pandas. (#10266)
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2024-05-12 11:15:46 +08:00
Jiaming Yuan
73afef1a6e
Fixes for numpy 2.0. (#10252) 2024-05-07 03:54:32 +08:00
Jiaming Yuan
837d44a345
Support more sklearn tags for testing. (#10230) 2024-04-29 06:33:23 +08:00
Jiaming Yuan
54754f29dd
[pyspark] Sort workers by task ID. (#10220) 2024-04-28 18:05:15 +08:00
Philip Hyunsu Cho
edb945d59b
[CI] Use native arm64 worker in GHAction to build M1 wheel (#10225)
* [CI] Use native arm64 worker in GHAction to build M1 wheel
* Set up Conda
* Use mamba
* debug
* fix
* fix
* fix
* fix
* fix
* Temporarily disable other tests
* Fix prefix
* Use micromamba
* Use conda-incubator/setup-miniconda
* Use mambaforge
* Fix
* Fix prefix
* Don't use deprecated set-output
* Add verbose output from build
* verbose
* Specify arch
* Bump setup-miniconda to v3
* Use Python 3.9
* Restore deleted files
* WAR.

---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2024-04-26 10:16:55 -07:00
Bobby Wang
8fb05c8c95
[pyspark] support stage-level for yarn/k8s (#10209) 2024-04-20 00:24:40 +08:00
Jiaming Yuan
303c603c7d
[pyspark] Reuse the collective communicator. (#10198) 2024-04-18 19:09:30 +08:00
github-actions[bot]
2925cebdca
[CI] Use latest RAPIDS; Pandas 2.0 compatibility fix (#10175)
* [CI] Update RAPIDS to latest stable
* [CI] Use rapidsai stable channel; fix syntax errors in Dockerfile.gpu
* Don't combine astype() with loc()
* Work around https://github.com/dmlc/xgboost/issues/10181
* Fix formatting
* Fix test

---------

Co-authored-by: hcho3 <hcho3@users.noreply.github.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2024-04-15 13:38:53 -07:00
Jiaming Yuan
f0a138f33a
Fix pyspark with verbosity=3. (#10172) 2024-04-09 23:18:56 +08:00
Jiaming Yuan
ca4801f81d
Work with IPv6 in the new tracker. (#10125) 2024-03-20 05:19:23 +08:00
Jiaming Yuan
e14c3b9325
Optional normalization for learning to rank. (#10094) 2024-03-08 12:41:21 +08:00
Bobby Wang
d24df52bb9
[pyspark] rework the log (#10077) 2024-02-29 16:47:31 +08:00
Jiaming Yuan
eb281ff9b4
[CI] Fix JVM tests on GH Action (#10064)
---------

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2024-02-22 14:21:32 -08:00
Jiaming Yuan
8ea705e4d5
Support sample weight in sklearn custom objective. (#10050) 2024-02-21 00:43:14 +08:00
Jiaming Yuan
69a17d5114
Fix with None input. (#10052) 2024-02-20 22:34:22 +08:00
david-cortes
3abbbe41ac
[R] Add data iterator, quantile dmatrix, external memory, and missing feature_types (#9913) 2024-01-30 19:26:44 +08:00
Jiaming Yuan
54b71c8fba
Fix with black 24.1.1. (#10014) 2024-01-30 17:24:11 +08:00
Jiaming Yuan
65d7bf2dfe
Handle np integer in model slice and prediction. (#10007) 2024-01-26 04:58:48 +08:00
Jiaming Yuan
d12cc1090a
Refactor tests for training continuation. (#9997) 2024-01-24 16:07:19 +08:00
Jiaming Yuan
0798e36d73
[breaking] Remove deprecated parameters in the skl interface. (#9986) 2024-01-15 20:40:05 +08:00
Jiaming Yuan
01c4711556
Check __cuda_array_interface__ instead of cupy class. (#9971)
* Now XGBoost can directly consume CUDA data from torch.
2024-01-09 19:59:01 +08:00
Jiaming Yuan
b3eb5d0945
Use UBJ in Python checkpoint. (#9958) 2024-01-09 03:22:15 +08:00
Jiaming Yuan
fa5e2f6c45
Synthesize the AMES housing dataset for tests. (#9963) 2024-01-09 00:54:23 +08:00
Jiaming Yuan
38dd91f491
Save model in ubj as the default. (#9947) 2024-01-05 17:53:36 +08:00
Jiaming Yuan
621348abb3
Fix multi-output with alternating strategies. (#9933)
---------

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2024-01-04 16:41:13 +08:00