692 Commits

Author SHA1 Message Date
WeichenXu
176fec8789
PySpark XGBoost integration (#8020)
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2022-07-13 13:11:18 +08:00
Jiaming Yuan
a5bc8e2c6a
Fix mypy error with the latest dask. (#8052)
* Fix mypy error with latest dask.

Dask is adding type hints to its codebase and, as a result, checks in XGBoost can be
performed more rigorously.

- Remove compatibility with old dask version where multi lock was missing.
- Restrict input of `X` to be non-series.
- Adopt latest definition of `Delayed`.
- Avoid passing optional `host_ip`.
- Avoid deprecated `worker.nthreads`.
2022-07-09 08:02:42 +08:00
Jiaming Yuan
701f32b227
[py-sckl] Raise import error if skl is not installed. (#8049) 2022-07-09 05:56:46 +08:00
Jiaming Yuan
ff1c559084
Remove unused variable. (#8046) 2022-07-05 01:59:22 +08:00
Jiaming Yuan
dcaf580476
Fix Python package source install. (#8036)
* Copy gputreeshap.
2022-06-29 21:45:09 +08:00
Joris LIMONIER
f470ad3af9
Fix multiple typos (#8028)
Fix 4 "graphiz" instead of "graphviz".
2022-06-27 19:21:58 +08:00
Gavin Zhang
6426449c8b
Support IBM i OS (#7920) 2022-06-02 23:38:35 +08:00
Jiaming Yuan
6b55150e80
Fix pylint errors. (#7967) 2022-06-02 18:04:46 +08:00
Gyeongjae Choi
cc6d57aa0d
Add minimal emscripten build support (#7954) 2022-05-30 14:11:40 +08:00
Tim Sabsch
7a039e03fe
Fix incomplete type hints for verbose (#7945) 2022-05-30 12:08:24 +08:00
Jiaming Yuan
d314680a15
Verify shared object version at load. (#7928) 2022-05-23 20:53:30 +08:00
Jiaming Yuan
f93a727869
Address remaining mypy errors in python package. (#7914) 2022-05-18 22:46:15 +08:00
Chengyang
806c92c80b
Add Type Hints for Python Package (#7742)
Co-authored-by: Chengyang Gu <bridgream@gmail.com>
Co-authored-by: Jiamingy <jm.yuan@outlook.com>
2022-05-17 22:14:09 +08:00
Rong Ou
77d4a53c32
use RabitContext instead of init/finalize (#7911) 2022-05-17 12:15:41 +08:00
Rong Ou
af907e2d0d
Demo of federated learning using NVFlare (#7879)
Co-authored-by: jiamingy <jm.yuan@outlook.com>
2022-05-14 22:45:41 +08:00
Jiaming Yuan
c8f9d4b6e6
Show libxgboost.so path in build info. (#7893) 2022-05-13 18:08:56 +08:00
Jiaming Yuan
db80671d6b
Fix monotone constraint with tuple input. (#7891) 2022-05-13 04:00:03 +08:00
Jiaming Yuan
8ba4722d04
Remove pyarrow workaround. (#7884) 2022-05-11 20:54:48 +08:00
Rong Ou
14ef38b834
Initial support for federated learning (#7831)
Federated learning plugin for xgboost:
* A gRPC server to aggregate MPI-style requests (allgather, allreduce, broadcast) from federated workers.
* A Rabit engine for the federated environment.
* Integration test to simulate federated learning.

Additional follow-ups are needed to address GPU support, security, privacy, and related concerns.
2022-05-05 21:49:22 +08:00
Jiaming Yuan
ad06172c6b
Refactor pandas dataframe handling. (#7843) 2022-04-26 18:53:43 +08:00
Jiaming Yuan
f0f76259c9
Remove STRING_TYPES. (#7827) 2022-04-22 19:07:51 +08:00
Jiaming Yuan
c70fa502a5
Expose feature_types to sklearn interface. (#7821) 2022-04-21 20:23:35 +08:00
Jiaming Yuan
52d4eda786
Deprecate use_label_encoder in XGBClassifier. (#7822)
* Deprecate `use_label_encoder` in XGBClassifier.

* We have removed the encoder; now prepare to remove the indicator.
2022-04-21 13:14:02 +08:00
Jiaming Yuan
bcce17e688
Remove text loading in basic walk through demo. (#7753) 2022-04-01 00:59:42 +08:00
Jiaming Yuan
02dd7b6913
Remove use of distutils. (#7770)
distutils is deprecated and replaced by other stdlib constructs.
2022-03-31 19:03:10 +08:00
Jiaming Yuan
522636cb52
Bump version. (#7769) 2022-03-31 06:33:22 +08:00
Jiaming Yuan
9150fdbd4d
Support pandas nullable types. (#7760) 2022-03-30 08:51:52 +08:00
Jiaming Yuan
a50b84244e
Cleanup configuration for constraints. (#7758) 2022-03-29 04:22:46 +08:00
Jiaming Yuan
3c9b04460a
Move num_parallel_tree to model parameter. (#7751)
The size of the forest should be a property of the model itself instead of a training
hyper-parameter.
2022-03-29 02:32:42 +08:00
Jiaming Yuan
b3ba0e8708
Check cupy lazily. (#7752) 2022-03-26 06:09:58 +08:00
Chengyang
c92ab2ce49
Add type hints to core.py (#7707)
Co-authored-by: Chengyang Gu <bridgream@gmail.com>
Co-authored-by: jiamingy <jm.yuan@outlook.com>
2022-03-23 21:12:14 +08:00
Xiaochang Wu
613ec36c5a
Support building SimpleDMatrix from Arrow data format (#7512)
* Integrate with Arrow C data API.
* Support Arrow dataset.
* Support Arrow table.

Co-authored-by: Xiaochang Wu <xiaochang.wu@intel.com>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
Co-authored-by: Zhang Zhang <zhang.zhang@intel.com>
2022-03-15 13:25:19 +08:00
Jiaming Yuan
a62a3d991d
[dask] prediction with categorical data. (#7708) 2022-03-10 00:21:48 +08:00
Pradipta Ghosh
68b6d6bbe2
Fix for Feature shape mismatch error (#7715) 2022-03-03 21:36:29 +08:00
Cheng Li
a92e0f6240
multi groups in the constraints (#7711) 2022-03-01 18:10:15 +08:00
Jiaming Yuan
83a66b4994
Support categorical data for hist. (#7695)
* Extract partitioner from hist.
* Implement categorical data support by passing the gradient index directly into the partitioner.
* Organize/update document.
* Remove code for negative hessian.
2022-02-25 03:47:14 +08:00
Jiaming Yuan
c859764d29
[doc] Clarify that states in callbacks are mutated. (#7685)
* Fix copy for cv.  This prevents inserting default callbacks into the input list.
* Clarify the behavior of callbacks in training/cv.
* Fix typos in doc.
2022-02-22 11:45:00 +08:00
Jiaming Yuan
e56d1779e1
Require Python 3.7. (#7682)
* Update setup.py.
2022-02-21 05:46:48 +08:00
Jiaming Yuan
f08c5dcb06
Cleanup some pylint errors. (#7667)
* Cleanup some pylint errors.

* Cleanup pylint errors in rabit modules.
* Make data iter an abstract class and cleanup private access.
* Cleanup no-self-use for booster.
2022-02-19 18:53:12 +08:00
Jiaming Yuan
b76c5d54bf
Define export symbols in callback module. (#7665) 2022-02-19 18:52:41 +08:00
Jiaming Yuan
0d0abe1845
Support optimal partitioning for GPU hist. (#7652)
* Implement `MaxCategory` in quantile.
* Implement partition-based split for GPU evaluation.  Currently, it's based on the existing evaluation function.
* Extract an evaluator from GPU Hist to store the needed states.
* Added some CUDA stream/event utilities.
* Update document with references.
* Fixed a bug in approx evaluator where the number of data points is less than the number of categories.
2022-02-15 03:03:12 +08:00
Jiaming Yuan
5cd1f71b51
[dask] Improve configuration for port. (#7645)
- Try port 0 to let the OS return the available port.
- Add port configuration.
2022-02-14 21:34:34 +08:00
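The "try port 0" idea from the [dask] port configuration commit (#7645) can be illustrated with the standard library alone. This is a minimal sketch of the general technique, not the code from that commit: the helper name `get_free_port` and the default host are assumptions made for illustration.

```python
import socket


def get_free_port(host: str = "127.0.0.1") -> int:
    """Return a TCP port that the OS reports as currently unused.

    Hypothetical helper for illustration; not part of the XGBoost API.
    Binding to port 0 asks the OS to pick any available port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # For AF_INET sockets, getsockname() returns a (host, port) pair.
        _, port = sock.getsockname()
        return port


if __name__ == "__main__":
    print(get_free_port())
```

The port is only guaranteed to be free while the socket is open. Another process may claim it after the helper returns, so code that needs the port reliably would typically keep the bound socket instead of re-binding later.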
Jiaming Yuan
b52c4e13b0
[dask] Fix empty partition with pandas input. (#7644)
An empty partition is different from an empty dataset.  In the former case, each worker has
non-empty dask collections, but each collection might contain empty partitions.
2022-02-14 19:35:51 +08:00
Jiaming Yuan
fe4ce920b2
[dask] Cleanup dask module. (#7634)
* Add a new utility for mapping function onto workers.
* Unify the type for feature names.
* Clean up the iterator.
* Fix prediction with DaskDMatrix worker specification.
* Fix base margin with DeviceQuantileDMatrix.
* Support Visual Studio 2022 in setup.py.
2022-02-08 20:41:46 +08:00
Jiaming Yuan
926af9951e
Add missing train parameter for sklearn interface. (#7629)
Some other parameters are still missing and rely on **kwargs, for instance parameters from
dart.
2022-02-08 13:20:19 +08:00
Jiaming Yuan
3e693e4f97
[dask] Fix nthread config with dask sklearn wrapper. (#7633) 2022-02-08 06:38:32 +08:00
Philip Hyunsu Cho
f6e6d0b2c0
[CI] Build Python wheels for MacOS (x86_64 and arm64) (#7621)
* Build Python wheels for OSX (x86_64 and arm64)

* Use Conda's libomp when running Python tests

* fix

* Add comment to explain CIBW_TARGET_OSX_ARM64

* Update release script

* Add comments in build_python_wheels.sh

* Document wheel pipeline
2022-02-02 17:35:48 -08:00
Philip Hyunsu Cho
b4340abf56
Add special handling for multi:softmax in sklearn predict (#7607)
* Add special handling for multi:softmax in sklearn predict

* Add test coverage
2022-01-29 15:54:49 -08:00
Jiaming Yuan
24789429fd
Support latest pandas Index type. (#7595) 2022-01-26 18:20:10 +08:00
Jiaming Yuan
f84291c1e1
Fix max_cat_to_onehot doc annotation [skip ci] (#7592) 2022-01-23 16:33:23 +08:00