290 Commits

Author SHA1 Message Date
Philip Hyunsu Cho
ad1a527709
Enable loading model from <1.0.0 trained with objective='binary:logitraw' (#6517)
* Enable loading model from <1.0.0 trained with objective='binary:logitraw'

* Add binary:logitraw in model compatibility testing suite

* Feedback from @trivialfis: Override ProbToMargin() for LogisticRaw

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2020-12-16 16:53:46 -08:00
Jiaming Yuan
c5876277a8
Drop saving binary format for memory snapshot. (#6513) 2020-12-17 00:14:57 +08:00
Jiaming Yuan
347f593169
Accept numpy array for DMatrix slice index. (#6368) 2020-12-16 14:42:52 +08:00
Jiaming Yuan
ef4a0e0aac
Fix DMatrix feature names/types IO. (#6507)
* Fix feature names/types IO

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-12-16 14:24:27 +08:00
Jiaming Yuan
3c3f026ec1
Move metric configuration into booster. (#6504) 2020-12-16 05:35:04 +08:00
ShvetsKS
8139849ab6
Fix handling of print period in EvaluationMonitor (#6499)
Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>
2020-12-15 19:20:19 +08:00
Jiaming Yuan
a30461cf87
[dask] Support all parameters in regressor and classifier. (#6471)
* Add eval_metric.
* Add callback.
* Add feature weights.
* Add custom objective.
2020-12-14 07:35:56 +08:00
Jiaming Yuan
47b86180f6
Don't validate feature when number of rows is 0. (#6472) 2020-12-07 18:08:51 +08:00
Jiaming Yuan
d6386e45e8
Fix filtering callable objects in skl xgb param. (#6466)
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-12-05 17:20:36 +08:00
Philip Hyunsu Cho
fb56da5e8b
Add global configuration (#6414)
* Add management functions for global configuration: XGBSetGlobalConfig(), XGBGetGlobalConfig().
* Add Python interface: set_config(), get_config(), and config_context().
* Add unit tests for Python
* Add R interface: xgb.set.config(), xgb.get.config()
* Add unit tests for R

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2020-12-03 00:05:18 -08:00
Jiaming Yuan
927c316aeb
Fix period in evaluation monitor. (#6441) 2020-11-29 03:18:33 +08:00
Jiaming Yuan
f4ff1c53fd
Fix CLI ranking demo. (#6439)
Save model at final round.
2020-11-29 03:12:06 +08:00
Jiaming Yuan
2ce2a1a4d8
[SKL] Propagate parameters to booster during set_param. (#6416) 2020-11-20 20:37:35 +08:00
Philip Hyunsu Cho
9c9070aea2
Use pytest conventions consistently (#6337)
* Do not derive from unittest.TestCase (not needed for pytest)

* assertRaises -> pytest.raises

* Simplify test_empty_dmatrix with test parametrization

* setUpClass -> setup_class, tearDownClass -> teardown_class

* Don't import unittest; import pytest

* Use plain assert

* Use parametrized tests in more places

* Fix test_gpu_with_sklearn.py

* Put back run_empty_dmatrix_reg / run_empty_dmatrix_cls

* Fix test_eta_decay_gpu_hist

* Add parametrized tests for monotone constraints

* Fix test names

* Remove test parametrization

* Revise test_slice to be not flaky
2020-11-19 17:00:15 -08:00
Jiaming Yuan
4ccf92ea34
[dask] Fix union of workers. (#6375) 2020-11-13 16:55:05 +08:00
Jiaming Yuan
fcfeb4959c
Deprecate positional arguments. (#6365)
Deprecate positional arguments in following functions:

- `__init__` for all classes in sklearn module.
- `fit` method for all classes in sklearn module.
- dask interface.
- `set_info` for `DMatrix` class.

Refactor the evaluation matrices handling.
2020-11-13 11:10:30 +08:00
Philip Hyunsu Cho
e5193c21a1
[dask] Allow empty data matrix in AFT survival (#6379)
* [dask] Allow empty data matrix in AFT survival

* Add unit test
2020-11-12 17:49:58 -08:00
Jiaming Yuan
c90f968d92
Update Python documents. (#6376) 2020-11-12 17:51:32 +08:00
Jiaming Yuan
6e12c2a6f8
[dask] Supoort running on GKE. (#6343)
* Avoid accessing `scheduler_info()['workers']`.
* Avoid calling `client.gather` inside task.
* Avoid using `client.scheduler_address`.
2020-11-11 18:04:34 +08:00
Jiaming Yuan
8a17610666
Implement GPU predict leaf. (#6187) 2020-11-11 17:33:47 +08:00
Jiaming Yuan
184e2eac7d
Add period to evaluation monitor. (#6348) 2020-11-10 07:47:48 +08:00
Jiaming Yuan
2cc9662005
Support slicing tree model (#6302)
This PR is meant the end the confusion around best_ntree_limit and unify model slicing. We have multi-class and random forests, asking users to understand how to set ntree_limit is difficult and error prone.

* Implement the save_best option in early stopping.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-11-02 23:27:39 -08:00
Jiaming Yuan
7756192906
[dask] Fix prediction on DaskDMatrix with multiple meta data. (#6333)
* Unify the meta handling methods.
2020-11-02 19:18:44 -05:00
Jiaming Yuan
6ff331b705
Fix Python callback. (#6320) 2020-10-30 05:03:44 +08:00
Jiaming Yuan
c80657b542
Fix flaky data initialization test. (#6318) 2020-10-30 03:11:22 +08:00
Jiaming Yuan
dfac5f89e9
Group CLI demo into subdirectory. (#6258)
CLI is not most developed interface. Putting them into correct directory can help new users to avoid it as most of the use cases are from a language binding.
2020-10-28 14:40:44 -07:00
Philip Hyunsu Cho
143b278267
Mark flaky tests as XFAIL (#6299)
* Temporarily skip TestGPUUpdaters::test_categorical

* Temporarily skip test_boost_from_prediction[approx]
2020-10-28 11:50:57 -07:00
Jiaming Yuan
3310e208fd
Fix inplace prediction interval. (#6259)
* Add back the interval in call.
* Make the interval non-optional.
2020-10-28 13:13:59 +08:00
Philip Hyunsu Cho
c8ec62103a
Deprecate LabelEncoder in XGBClassifier; Enable cuDF/cuPy inputs in XGBClassifier (#6269)
* Deprecate LabelEncoder in XGBClassifier; skip LabelEncoder for cuDF/cuPy inputs

* Add unit tests for cuDF and cuPy inputs with XGBClassifier

* Fix lint

* Clarify warning

* Move use_label_encoder option to XGBClassifier constructor

* Add a test for cudf.Series

* Add use_label_encoder to XGBRFClassifier doc

* Address reviewer feedback
2020-10-26 13:20:51 -07:00
Jiaming Yuan
bcfab4d726
Revert "Disable JSON full serialization for now. (#6248)" (#6266)
This reverts commit 6d293020fbfa2c67b532d550fe5d55689662caac.
2020-10-27 03:30:47 +08:00
Jiaming Yuan
2686d32a36
Skip dask tests on ARM. (#6267)
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-10-26 15:09:05 +08:00
Philip Hyunsu Cho
677f676172
Use UserWarning for old callback, as DeprecationWarning is not visible (#6270) 2020-10-22 01:10:52 -07:00
Philip Hyunsu Cho
1300467d36
Fix a typo in is_arm() in testing.py [skip ci] (#6271) 2020-10-22 13:07:14 +08:00
Jiaming Yuan
81c37c28d5
Time the CPU tests on Jenkins. (#6257)
* Time the CPU tests on Jenkins.
* Reduce thread contention.
* Add doc.
* Skip heavy tests on ARM.
2020-10-20 17:19:07 -07:00
Jiaming Yuan
6d293020fb
Disable JSON full serialization for now. (#6248)
* Disable JSON serialization for now.

* Multi-class classification is checkpointing for each iteration.
This brings significant overhead.

Revert: 90355b4f007ae

* Set R tests to use binary.
2020-10-16 17:59:54 +08:00
Jiaming Yuan
3da5a69dc9
Fix typo in dask interface. (#6240) 2020-10-15 15:26:29 +08:00
Jiaming Yuan
b05073bda5
[dask] Test for data initializaton. (#6226) 2020-10-13 11:08:35 +08:00
Jiaming Yuan
2443275891
Cleanup Python code. (#6223)
* Remove pathlike as XGBoost 1.2 requires Python 3.6.
* Move conditional import of dask/distributed into dask module.
2020-10-12 15:44:41 +08:00
Jiaming Yuan
ab5b35134f
Rework Python callback functions. (#6199)
* Define a new callback interface for Python.
* Deprecate the old callbacks.
* Enable early stopping on dask.
2020-10-10 17:52:36 +08:00
Christian Lorentzen
cf4f019ed6
[Breaking] Change default evaluation metric for classification to logloss / mlogloss (#6183)
* Change DefaultEvalMetric of classification from error to logloss

* Change default binary metric in plugin/example/custom_obj.cc

* Set old error metric in python tests

* Set old error metric in R tests

* Fix missed eval metrics and typos in R tests

* Fix setting eval_metric twice in R tests

* Add warning for empty eval_metric for classification

* Fix Dask tests

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-10-02 12:06:47 -07:00
Jiaming Yuan
7622b8cdb8
Enable categorical data support on Python DMatrix. (#6166)
* Only pandas is recognized.
2020-09-29 11:22:56 +08:00
Rory Mitchell
dda9e1e487
Update GPUTreeshap (#6163)
* Reduce shap test duration

* Test interoperability with shap package

* Add feature interactions

* Update GPUTreeShap
2020-09-28 09:43:47 +13:00
Kyle Nicholson
e6a238c020
Update base margin dask (#6155)
* Add `base-margin`
* Add `output_margin` to regressor.

Co-authored-by: fis <jm.yuan@outlook.com>
2020-09-26 21:30:52 +08:00
Philip Hyunsu Cho
bd2b1eabd0
Add back support for scipy.sparse.coo_matrix (#6162) 2020-09-25 00:49:49 -07:00
Jiaming Yuan
33d80ffad0
[dask] Support more meta data on functional interface. (#6132)
* Add base_margin, label_(lower|upper)_bound.
* Test survival training with dask.
2020-09-21 16:56:37 +08:00
Rory Mitchell
47350f6acb
Allow kwargs in dask predict (#6117) 2020-09-15 13:04:03 +12:00
Jiaming Yuan
b5f52f0b1b
Validate weights are positive values. (#6115) 2020-09-15 09:03:55 +08:00
ShvetsKS
c1ca872d1e
Modin DF support (#6055)
* Modin DF support

* mode change

* tests were added, ci env was extended

* mode change

* Remove redundant installation of modin

* Add a pytest skip marker for modin

* Install Modin[ray] from PyPI

* fix interfering

* avoid extra conversion

* delete cv test for modin

* revert cv function

Co-authored-by: ShvetsKS <kirill.shvets@intel.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-29 22:33:30 +03:00
Jiaming Yuan
2fcc4f2886
Unify evaluation functions. (#6037) 2020-08-26 14:23:27 +08:00
Rory Mitchell
9a4e8b1d81
GPUTreeShap (#6038) 2020-08-25 12:47:41 +12:00