7091 Commits

Author SHA1 Message Date
Dmitry Razdoburdin
d7599e095b
[SYCL] Add dask support for distributed (#10812) 2024-09-22 02:01:57 +08:00
Jiaming Yuan
2a37a8880c
Check correct dump format for gblinear. (#10831) 2024-09-21 00:32:52 +08:00
Jiaming Yuan
24241ed6e3
[EM] Compress dense ellpack. (#10821)
This helps reduce the memory copying needed for dense data. In addition, it helps reduce memory usage even if external memory is not used.

- Decouple the number of symbols needed in the compressor from the number of features when the data is dense.
- Remove the fetch call in the `at_end_` iteration.
- Reduce synchronization and kernel launches by using the `uvector` and ctx.
2024-09-20 18:20:56 +08:00
Jiaming Yuan
d5e1c41b69
[coll] Use loky for rabit op tests. (#10828) 2024-09-20 16:46:05 +08:00
Valentin Waeselynck
15c6172e09
[doc] Improve the model introduction. (#10822) 2024-09-19 02:33:49 +08:00
Jiaming Yuan
96bbf80457
[EM] Support quantile objectives for GPU-based external memory. (#10820)
- Improved error message for memory usage.
- Support quantile-based objectives for GPU external memory.
2024-09-17 13:27:02 +08:00
shlomota
de00e07087
Fix misleading error when feature names are missing during inference (#10814) 2024-09-13 23:30:50 +08:00
Bobby Wang
67c8c96784
[jvm-packages] [breaking] rework xgboost4j-spark and xgboost4j-spark-gpu (#10639)
- Introduce an abstract XGBoost Estimator
- Update to the latest XGBoost parameters
  - Add all XGBoost parameters supported in XGBoost4j-spark.
  - Add setter and getter for these parameters.
  - Remove the deprecated parameters
- Address the missing value handling
- Remove any ETL operations in XGBoost
- Rework the GPU plugin
- Expand sanity tests for CPU and GPU consistency
2024-09-11 15:54:19 +08:00
Jiaming Yuan
d94f6679fc
[EM] Avoid synchronous calls and unnecessary ATS access. (#10811)
- Pass context into various functions.
- Factor out some CUDA algorithms.
- Use ATS only for update position.
2024-09-10 14:33:14 +08:00
Jiaming Yuan
ed5f33df16
[EM] Multi-level quantile sketching for GPU. (#10813) 2024-09-10 13:08:34 +08:00
Jiaming Yuan
3ef8383d93
[doc] Fix custom_metric_obj.rst [skip ci] (#10796) (#10815)
Added the square to the derivative in the hessian

Co-authored-by: Corentin Santos <corentin.santos@iphc.cnrs.fr>
2024-09-10 05:11:43 +08:00
Dmitry Razdoburdin
bba6aa74fb
[SYCL] Fix for sycl support with sklearn estimators (#10806)
---------

Co-authored-by: Dmitry Razdoburdin <>
2024-09-09 14:14:07 +08:00
Jiaming Yuan
5f7f31d464
[EM] Refactor ellpack construction. (#10810)
- Remove the calculation of n_symbols in the accessor.
- Pack initialization steps into the parameter list.
- Pass the context into various ctors.
- Specialization for dense data to prepare for further compression.
2024-09-09 14:10:10 +08:00
dependabot[bot]
c69c4adb58
Bump actions/setup-python from 5.1.1 to 5.2.0 (#10768)
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5.1.1 to 5.2.0.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](39cd14951b...f677139bbe)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-08 00:09:22 +08:00
david-cortes
f52f11e1d7
[R] Allow passing data.frame to SHAP (#10744) 2024-09-02 19:44:12 +08:00
dependabot[bot]
ec8cfb3267
Bump actions/upload-artifact from 4.3.4 to 4.4.0 (#10770)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.3.4 to 4.4.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](0b2256b8c0...50769540e7)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-02 17:52:32 +08:00
david-cortes
15b72571f3
[R] update serialization advice for new xgboost class (#10794) 2024-09-02 02:46:11 +08:00
dependabot[bot]
4f88ada219
Bump actions/setup-java from 4.2.1 to 4.2.2 (#10769)
Bumps [actions/setup-java](https://github.com/actions/setup-java) from 4.2.1 to 4.2.2.
- [Release notes](https://github.com/actions/setup-java/releases)
- [Commits](99b8673ff6...6a0805fcef)

---
updated-dependencies:
- dependency-name: actions/setup-java
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-02 02:35:48 +08:00
Samuel Marks
4503555274
POSIX compliant poll.h and mmap over sys/poll.h and mmap64 (#10767) 2024-09-01 15:47:30 +08:00
Jiaming Yuan
e1a2c1bbb3
[EM] Merge GPU partitioning with histogram building. (#10766)
- Stop concatenating pages if there's no subsampling.
- Use a single iteration for histogram build and partitioning.
2024-08-31 03:25:37 +08:00
Jiaming Yuan
98ac153265
Avoid warning from NVCC. (#10757) 2024-08-30 16:11:31 +08:00
Jiaming Yuan
5cc7c735e5
Don't link gputreeshap. (#10758)
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2024-08-30 14:40:58 +08:00
Jiaming Yuan
34d4ab455e
[EM] Avoid stream sync in quantile sketching. (#10765)
.
2024-08-30 12:33:24 +08:00
Jiaming Yuan
61dd854a52
[EM] Refactor GPU histogram builder. (#10764)
- Expose the maximum number of cached nodes to be consistent with the CPU implementation. Also easier for testing.
- Extract the subtraction trick for easier testing.
- Split up the `GradientQuantiser` to avoid circular dependency.
2024-08-30 02:39:14 +08:00
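The "subtraction trick" named in #10764 is commonly understood as follows: every row of a parent node lands in exactly one child, so the parent's gradient histogram is the bin-wise sum of its children's histograms. Only one child needs to be built from its rows; the sibling is then obtained by subtraction. The sketch below illustrates that idea in plain Python under this assumption. The names `build_histogram` and `subtract_histogram` are hypothetical and do not reflect XGBoost's actual GPU implementation.

```python
from typing import List, Sequence, Tuple

# (sum of gradients, sum of hessians) accumulated in one histogram bin
GradPair = Tuple[float, float]


def build_histogram(
    bins: Sequence[int],
    grads: Sequence[GradPair],
    rows: Sequence[int],
    n_bins: int,
) -> List[GradPair]:
    """Accumulate the gradient pairs of the given rows into per-bin sums."""
    hist: List[GradPair] = [(0.0, 0.0)] * n_bins
    for r in rows:
        g, h = hist[bins[r]]
        hist[bins[r]] = (g + grads[r][0], h + grads[r][1])
    return hist


def subtract_histogram(
    parent: Sequence[GradPair], child: Sequence[GradPair]
) -> List[GradPair]:
    """Derive the sibling histogram as parent minus the already-built child."""
    return [(pg - cg, ph - ch) for (pg, ph), (cg, ch) in zip(parent, child)]


# Illustrative data: one feature, three bins, six rows.
bins = [0, 1, 1, 2, 0, 2]
grads = [(1.0, 1.0), (2.0, 1.0), (-1.0, 1.0), (0.5, 1.0), (3.0, 1.0), (-2.0, 1.0)]
left_rows, right_rows = [0, 1, 4], [2, 3, 5]

parent_hist = build_histogram(bins, grads, range(len(bins)), 3)
left_hist = build_histogram(bins, grads, left_rows, 3)
# The right child is never scanned; it comes from the subtraction.
right_hist = subtract_histogram(parent_hist, left_hist)
```

Building the smaller child and subtracting saves one pass over the larger child's rows, and isolating the subtraction in its own function lets it be tested without running a full tree build.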
Jiaming Yuan
34937fea41
[EM] Python wrapper for the ExtMemQuantileDMatrix. (#10762)
Not yet exposed in the documentation.

- Add C API.
- Add Python API.
- Basic CPU tests.
2024-08-29 04:08:25 +08:00
Jiaming Yuan
7510a87466
[EM] Reuse the quantile container. (#10761)
Use the push method to merge the quantiles instead of creating multiple containers. This
reduces the memory usage by consistent pruning.
2024-08-29 01:39:55 +08:00
Jiaming Yuan
4fe67f10b4
[EM] Have one partitioner for each batch. (#10760)
- Initialize one partitioner for each batch.
- Collect partition size during initialization.
- Support base ridx in the finalization.
2024-08-29 01:35:17 +08:00
david-cortes
3043827efc
[R] Update vignette "XGBoost presentation" (#10749)
---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2024-08-28 16:22:54 +08:00
Philip Hyunsu Cho
7794d3da8a
Ensure that pip check does not fail due to bad platform tag (#10755)
* Remove custom tag generation

* Revert "Remove custom tag generation"

This reverts commit fe3cf0e8786c7dc05e1deced3a1c92cd79094735.

* Fetch an accurate platform tag from Pip 22+

* Fix formatting

* TOML allows trailing commas

* Update patch

* Add trailing comma

* Fix up patch

* Use `packaging`

Co-authored-by: jakirkham <jakirkham@gmail.com>

---------

Co-authored-by: jakirkham <jakirkham@gmail.com>
2024-08-27 18:11:08 -07:00
Jiaming Yuan
64afe9873b
Increase timeout in C++ tests from 1 to 5 seconds. (#10756)
To avoid CI failures on FreeBSD.
2024-08-28 02:27:14 +08:00
Jiaming Yuan
bde1265caf
[EM] Return a full DMatrix instead of an Ellpack from the GPU sampler. (#10753) 2024-08-28 01:05:11 +08:00
Jiaming Yuan
d6ebcfb032
[EM] Support CPU quantile objective for external memory. (#10751) 2024-08-27 04:16:57 +08:00
david-cortes
12c6b7ceea
[R] Remove demos (#10750) 2024-08-27 04:16:36 +08:00
Jiaming Yuan
06c4246ff1
[CI] Workaround mypy errors. (#10754) 2024-08-27 02:54:11 +08:00
Jiaming Yuan
25966e4ba8
[EM] Pass batch parameter into extmem format. (#10736)
- Allow customization for format reading.
- Customize the number of pre-fetch batches.
2024-08-27 02:37:50 +08:00
Michael Mayer
074cad2343
[R] Finalizes switch to markdown doc (#10733)
---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2024-08-27 01:25:06 +08:00
david-cortes
479ae8081b
[R] Add class names to coefficients (#10745) 2024-08-25 04:41:58 +08:00
Jiaming Yuan
fd0138c91c
[coll] Improve column split tests with named threads. (#10735) 2024-08-24 12:43:47 +08:00
Jiaming Yuan
55aef8f546
[EM] Avoid resizing host cache. (#10734)
- Add SAM allocator and resource.
- Use page-based cache instead of stream-based cache.
2024-08-23 06:34:01 +08:00
James Lamb
dbfafd8557
[doc] Install the conda GPU variant in environments without CUDA (#10731) 2024-08-22 19:48:15 +08:00
Philip Hyunsu Cho
cd83fe6033
[breaking][CI] Use CTK 12.4 (#10697) 2024-08-21 19:59:34 -07:00
Jiaming Yuan
142bdc73ec
[EM] Support SHAP contribution with QDM. (#10724)
- Add GPU support.
- Add external memory support.
- Update the GPU tree shap.
2024-08-22 05:25:10 +08:00
Jiaming Yuan
cb54374550
Update clang-tidy. (#10730)
- Install cmake using pip.
- Fix compile command generation.
- Clean up the tidy script and remove the need to load the yaml file.
- Fix modernized type traits.
- Fix span class. Polymorphism support is dropped.
2024-08-22 04:12:18 +08:00
James Lamb
03bd1183bc
[doc] prefer 'cmake -B' and 'cmake --build' everywhere (#10717) 2024-08-22 02:16:55 +08:00
Dmitry Razdoburdin
24d225c1ab
[SYCL] Implement UpdatePredictionCache and connect updater with learner. (#10701)
---------

Co-authored-by: Dmitry Razdoburdin <>
2024-08-22 02:07:44 +08:00
Jiaming Yuan
9b88495840
[multi] Implement weight feature importance. (#10700) 2024-08-22 02:06:47 +08:00
Jiaming Yuan
402e7837fb
Fix potential race in feature constraint. (#10719) 2024-08-21 16:50:31 +08:00
david-cortes
e9f1abc1f0
[R] keep row names in predictions (#10727) 2024-08-21 05:49:02 +08:00
david-cortes
adf87b27c5
[doc] Fix tutorial for advanced objectives (#10725) 2024-08-21 02:52:50 +08:00
Jiaming Yuan
508ac13243
Check cub errors. (#10721)
- Make sure the CUDA error returned by cub scan is caught.
- Avoid temporary buffer allocation in thrust device vector.
2024-08-21 02:50:26 +08:00