251 Commits

Author SHA1 Message Date
Jiaming Yuan
0715ab3c10
Use dlopen to load NCCL. (#9796)
This PR adds optional support for loading nccl with `dlopen` as an alternative of compile time linking. This is to address the size bloat issue with the PyPI binary release.
- Add CMake option to load `nccl` at runtime.
- Add an NCCL stub.

After this, `nccl` will be fetched from PyPI when using pip to install XGBoost, either by a user or by `pyproject.toml`. Others who want to link the nccl at compile time can continue to do so without any change.

At the moment, this is Linux only since we only support MNMG on Linux.
2023-11-22 19:27:31 +08:00
Jiaming Yuan
82828621d0
[doc] Add doc for linters and simplify c++ lint script. (#9750) 2023-11-07 05:03:30 +08:00
Jiaming Yuan
3ca06ac51e
[doc] Mention data consistency for categorical features. (#9678) 2023-10-24 10:11:33 +08:00
Philip Hyunsu Cho
5e6cb63a56
[CI] Set up CI for Mac M1 (#9699) 2023-10-22 23:33:19 -07:00
Rong Ou
6fbe6248f4
More in-memory input support for column split (#9685) 2023-10-20 16:02:36 +08:00
Chuck Atkins
83cdf14b2c
CMake LTO and CUDA arch (#9677) 2023-10-20 13:01:37 +08:00
Philip Hyunsu Cho
a5e07a01f8
[CI] Pull CentOS 7 images from NGC (#9666) 2023-10-13 12:11:54 +08:00
github-actions[bot]
d1dee4ad99
[CI] Update RAPIDS to latest stable (#9654)
* [CI] Update RAPIDS to latest stable

* Remove slashes from Docker tag

---------

Co-authored-by: hcho3 <hcho3@users.noreply.github.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2023-10-11 23:26:09 -07:00
James Lamb
51e32e4905
[CI] add cmakelint to C++ linting task (#9641)
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-10-11 16:04:10 +08:00
Jiaming Yuan
4e5a7729c3
Fix lint errors. (#9634) 2023-10-09 19:04:31 +08:00
James Lamb
799f8485e2
[R] [CI] enforce lintr::function_left_parentheses_linter check (#9631) 2023-10-08 09:42:09 +08:00
Jiaming Yuan
cac2cd2e94
[R] Set number of threads in demos and tests. (#9591)
- Restrict the number of threads in IO.
- Specify the number of threads in demos and tests.
- Add helper scripts for checks.
2023-09-23 21:44:03 +08:00
Jiaming Yuan
e6cf7a1278
Deprecate the command line interface. (#9485)
---------

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2023-08-21 06:47:48 +08:00
Jiaming Yuan
db87d481bc
[R] Differentiate dev version with release version. (#9503)
Use 2.1.0.0 as development version, we will change it to 2.1.0.1 during release.
2023-08-20 02:58:58 +08:00
github-actions[bot]
f03463c45b
[CI] Update RAPIDS to latest stable (#9464)
* [CI] Update RAPIDS to latest stable

* [CI] Use CMake 3.26.4

---------

Co-authored-by: hcho3 <hcho3@users.noreply.github.com>
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2023-08-13 18:54:37 -07:00
James Lamb
4359356d46
[R] [CI] use lintr 3.1.0 (#9456) 2023-08-10 17:49:16 +08:00
Jiaming Yuan
f05a23b41c
Use weakref instead of id for DataIter cache. (#9445)
- Fix case where Python reuses id from freed objects.
- Small optimization to column matrix with QDM by using `realloc` instead of copying data.
2023-08-10 00:40:06 +08:00
Jiaming Yuan
c1b2cff874
[CI] Check compiler warnings. (#9444) 2023-08-08 12:02:45 -07:00
Philip Hyunsu Cho
7ce090e775
Handle UTF-8 paths correctly on Windows platform (#9443)
* Fix round-trip serialization with UTF-8 paths

* Add compiler version check

* Add comment to C API functions

* Add Python tests

* [CI] Updatre MacOS deployment target

* Use std::filesystem instead of dmlc::TemporaryDirectory
2023-08-07 23:27:25 -07:00
Jiaming Yuan
851cba931e
Define best_iteration only if early stopping is used. (#9403)
* Define `best_iteration` only if early stopping is used.

This is the behavior specified by the document but not honored in the actual code.

- Don't set the attributes if there's no early stopping.
- Clean up the code for callbacks, and replace assertions with proper exceptions.
- Assign the attributes when early stopping `save_best` is used.
- Turn the attributes into Python properties.

---------

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2023-07-24 12:43:35 +08:00
Jiaming Yuan
275da176ba
Document for device ordinal. (#9398)
- Rewrite GPU demos. notebook is converted to script to avoid committing additional png plots.
- Add GPU demos into the sphinx gallery.
- Add RMM demos into the sphinx gallery.
- Test for firing threads with different device ordinals.
2023-07-22 15:26:29 +08:00
Philip Hyunsu Cho
e082718c66
[CI] Build pip wheel with RMM support (#9383) 2023-07-18 01:52:26 -07:00
Jiaming Yuan
16eb41936d
Handle the new device parameter in dask and demos. (#9386)
* Handle the new `device` parameter in dask and demos.

- Check no ordinal is specified in the dask interface.
- Update demos.
- Update dask doc.
- Update the condition for QDM.
2023-07-15 19:11:20 +08:00
Jiaming Yuan
04aff3af8e
Define the new device parameter. (#9362) 2023-07-13 19:30:25 +08:00
Jiaming Yuan
39390cc2ee
[breaking] Remove the predictor param, allow fallback to prediction using DMatrix. (#9129)
- A `DeviceOrd` struct is implemented to indicate the device. It will eventually replace the `gpu_id` parameter.
- The `predictor` parameter is removed.
- Fallback to `DMatrix` when `inplace_predict` is not available.
- The heuristic for choosing a predictor is only used during training.
2023-07-03 19:23:54 +08:00
Jiaming Yuan
f4798718c7
Use hist as the default tree method. (#9320) 2023-06-27 23:04:24 +08:00
Jiaming Yuan
cfa9c42eb4
Fix callback in AFT viz demo. (#9333)
* Fix callback in AFT viz demo.

- Update the callback function.
- Add lint check.
2023-06-26 22:35:02 +08:00
Jiaming Yuan
1fcc26a6f8
Set ndcg to default for LTR. (#8822)
- Add document.
- Add tests.
- Use `ndcg` with `topk` as default.
2023-06-09 23:31:33 +08:00
Boris
7f9cb921f4
Rearranged maven profiles so that scala-2.13 artifacts are published without gpu-related libraries (#9253) 2023-06-05 13:52:10 -07:00
Jiaming Yuan
9fbde21e9d
Rework the precision metric. (#9222)
- Rework the precision metric for both CPU and GPU.
- Mention it in the document.
- Cleanup old support code for GPU ranking metric.
- Deterministic GPU implementation.

* Drop support for classification.

* type.

* use batch shape.

* lint.

* cpu build.

* cpu build.

* lint.

* Tests.

* Fix.

* Cleanup error message.
2023-06-02 20:49:43 +08:00
Philip Hyunsu Cho
db8288121d
Revert "Publishing scala-2.13 artifacts to the maven S3 repo. (#9224)" (#9233)
This reverts commit bb2a17b90c4f212191932dce3453591bd7f47cfc.
2023-06-01 14:39:39 -07:00
Boris
bb2a17b90c
Publishing scala-2.13 artifacts to the maven S3 repo. (#9224) 2023-06-01 10:45:18 -07:00
Jiaming Yuan
62e9387cd5
[ci] Update PySpark version. (#9214) 2023-05-31 03:00:44 +08:00
Boris
a01df102c9
Scala 2.13 support. (#9099)
1. Updated the test logic
2. Added smoke tests for Spark examples.
3. Added integration tests for Spark with Scala 2.13
2023-05-27 19:34:02 +08:00
Jiaming Yuan
8c174ef2d3
[CI] Update images that are not related to binary release. (#9205)
* [CI] Update images that are not related to the binary release.

- Update clang-tidy, prefer tools from the Ubuntu repository.
- Update GPU image to 22.04.
- Small cleanup to the tidy script.
- Remove gpu_jvm, which seems to be unused.
2023-05-27 17:40:46 +08:00
Uriya Harpeness
a075aa24ba
Move python tool configurations to pyproject.toml, and add the python 3.11 classifier. (#9112) 2023-05-06 02:59:06 +08:00
Philip Hyunsu Cho
a5cd2412de
Replace setup.py with pyproject.toml (#9021)
* Create pyproject.toml
* Implement a custom build backend (see below) in packager directory. Build logic from setup.py has been refactored and migrated into the new backend.
* Tested: pip wheel . (build wheel), python -m build --sdist . (source distribution)
2023-04-20 13:51:39 -07:00
Jiaming Yuan
bac22734fb
Remove ntree limit in python package. (#8345)
- Remove `ntree_limit`. The parameter has been deprecated since 1.4.0.
- The SHAP package compatibility is broken.
2023-03-31 19:01:55 +08:00
Philip Hyunsu Cho
6676c28cbc
[CI] Fix Windows wheel to be compatible with Poetry (#8991)
* [CI] Fix Windows wheel to be compatible with Poetry

* Typo

* Eagerly scan globs to avoid patching same file twice
2023-03-28 21:32:54 -07:00
Jiaming Yuan
401ce5cf5e
Run linters with the multi output demo. (#8966) 2023-03-28 00:47:28 +08:00
Jiaming Yuan
151882dd26
Initial support for multi-target tree. (#8616)
* Implement multi-target for hist.

- Add new hist tree builder.
- Move data fetchers for tests.
- Dispatch function calls in gbm base on the tree type.
2023-03-22 23:49:56 +08:00
Jiaming Yuan
f7ce0ec0df
Upgrade gcc toolchain to 9.x. (#8878)
* Use new tool chain.

* Use gcc-9.

* Use cmake from system.

* DOn't link leak.
2023-03-07 08:25:23 -08:00
Jiaming Yuan
6a892ce281
Specify src path for isort. (#8867) 2023-03-06 17:30:27 +08:00
Jiaming Yuan
4d665b3fb0
Restore clang tidy test. (#8861) 2023-03-03 13:47:04 -08:00
Philip Hyunsu Cho
6d8afb2218
[CI] Require C++17 + CMake 3.18; Use CUDA 11.8 in CI (#8853)
* Update to C++17

* Turn off unity build

* Update CMake to 3.18

* Use MSVC 2022 + CUDA 11.8

* Re-create stack for worker images

* Allocate more disk space for Windows

* Tempiorarily disable clang-tidy

* RAPIDS now requires Python 3.10+

* Unpin cuda-python

* Use latest NCCL

* Use Ubuntu 20.04 in RMM image

* Mark failing mgpu test as xfail
2023-03-01 09:22:24 -08:00
Jiaming Yuan
cce4af4acf
Initial support for quantile loss. (#8750)
- Add support for Python.
- Add objective.
2023-02-16 02:30:18 +08:00
Jiaming Yuan
7b3d473593
[doc] Add demo for inference using individual tree. (#8752) 2023-02-07 04:40:18 +08:00
Jiaming Yuan
0f37a01dd9
Require black formatter for the python package. (#8748) 2023-02-07 01:53:33 +08:00
Philip Hyunsu Cho
dd79ab846f
[CI] Fix failing arm build (#8751)
* Always install Conda env into /opt/python; use Mamba

* Change ownership of Conda env to buildkite-agent user

* Use unique name

* Fix
2023-02-03 22:32:48 -08:00
James Lamb
0d8248ddcd
[R] discourage use of regex for fixed string comparisons (#8736) 2023-01-30 18:47:21 +08:00