745 Commits

Author SHA1 Message Date
Jiaming Yuan
ee8d1f5ed8
Fix histogram truncation. (#7181)
* Fix truncation.

* Lint.

* lint.
2021-08-24 18:34:32 -07:00
Jiaming Yuan
3290a4f3ed
Re-enable feature validation in predict proba. (#7177) 2021-08-22 15:28:08 +08:00
Jiaming Yuan
3f38d983a6
Fix prediction configuration. (#7159)
After the predictor parameter was added to the constructor, this configuration was broken.
2021-08-11 16:34:36 +08:00
Jiaming Yuan
8a84be37b8
Pass scikit learn estimator checks for regressor. (#7130)
* Check data shape.
* Check labels.
2021-08-03 18:58:20 +08:00
Jiaming Yuan
e2c406f5c8
Support min_delta in early stopping. (#7137)
* Support `min_delta` in early stopping.

* Remove abs_tol.
2021-08-03 14:29:17 +08:00
Jiaming Yuan
d080b5a953
Fix model slicing. (#7149)
* Use correct pointer.
* Remove best_iteration/best_score.
2021-08-03 11:51:56 +08:00
Jiaming Yuan
1369133916
[dask] Remove the workaround for segfault. (#7146) 2021-07-30 03:57:53 +08:00
graue70
dfdf0b08fc
Fix typo and grammatical mistake in error message (#7134) 2021-07-28 17:17:05 +08:00
Gil Forsyth
92ae3abc97
[dask] Disallow importing non-dask estimators from xgboost.dask (#7133)
* Disallow importing non-dask estimators from xgboost.dask

This is mostly a style change, but also avoids a user error (that I have
committed on a few occasions).  Since `XGBRegressor` and `XGBClassifier`
are imported as parent classes for the `dask` estimators, without
defining an `__all__`, autocomplete (or muscle) memory will produce the
following with little prompting:

```
from xgboost.dask import XGBClassifier
```

There's nothing inherently wrong with that, but given that
`XGBClassifier` is not `dask` enabled, it can lead to confusing behavior
until you figure out you should've typed

```
from xgboost.dask import DaskXGBClassifier
```

Another option is to alias import the existing non-dask estimators.

* Remove base/iter class, add train predict funcs
2021-07-28 02:07:23 +08:00
Jiaming Yuan
7017dd5a26
[JVM-Packages] Use Python tracker in XGBoost for JVM package. (#7132) 2021-07-27 16:20:42 +08:00
Jiaming Yuan
778135f657
Fix parameter loading with training continuation. (#7121)
* Add a demo for training continuation.
2021-07-23 10:51:47 +08:00
Jiaming Yuan
e6088366df
Export Python Interface for external memory. (#7070)
* Add Python iterator interface.
* Add tests.
* Add demo.
* Add documents.
* Handle empty dataset.
2021-07-22 15:15:53 +08:00
Jiaming Yuan
2f524e9f41
[dask] Work around segfault in prediction. (#7112) 2021-07-16 04:27:05 +08:00
Jiaming Yuan
5d7cdf2e36
[Breaking] Rename Quantile DMatrix C API. (#7082)
The role of ProxyDMatrix is going beyond what it was designed.  Now it's used by both
QuantileDeviceDMatrix and inplace prediction.  After the refactoring of sparse DMatrix it
will also be used for external memory.  Renaming the C API to extract it from
QuantileDeviceDMatrix.
2021-07-08 11:34:14 +08:00
Jiaming Yuan
f937f514aa
Remove lz4 compression with external memory. (#7076) 2021-07-06 14:46:43 +08:00
Jiaming Yuan
b56d3d5d5c
Fix with latest panda range index. (#7074) 2021-07-03 16:43:52 +08:00
Jiaming Yuan
93f3acdef9
Fix with latest pylint. (#7071) 2021-07-02 21:26:00 +08:00
Jiaming Yuan
a5d222fcdb
Handle categorical split in model histogram and dataframe. (#7065)
* Error on get_split_value_histogram when feature is categorical
* Add a category column to output dataframe
2021-07-02 13:10:36 +08:00
Philip Hyunsu Cho
dd4db347f3
Fix early stopping behavior with MAPE metric (#7061) 2021-06-26 03:02:33 +08:00
Jiaming Yuan
663136aa08
Implement feature score for linear model. (#7048)
* Add feature score support for linear model.
* Port R interface to the new implementation.
* Add linear model support in Python.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2021-06-25 14:34:02 +08:00
Jiaming Yuan
1d4d345634
Tests for dask skl categorical data support. (#7054) 2021-06-24 16:33:57 +08:00
Jiaming Yuan
da1ad798ca
Convert numpy float to Python float in feat score. (#7047) 2021-06-21 20:58:43 +08:00
Jiaming Yuan
29f8fd6fee
Support categorical split in tree model dump. (#7036) 2021-06-18 16:46:20 +08:00
Jiaming Yuan
86715e4cd4
Support categorical data for dask functional interface and DQM. (#7043)
* Support categorical data for dask functional interface and DQM.

* Implement categorical data support for GPU GK-merge.
* Add support for dask functional interface.
* Add support for DQM.

* Get newer cupy.
2021-06-18 13:06:52 +08:00
Jiaming Yuan
7dd29ffd47
Implement feature score in GBTree. (#7041)
* Categorical data support.
* Eliminate text parsing during feature score computation.
2021-06-18 11:53:16 +08:00
Jiaming Yuan
d9799b09d0
Categorical data support for cuDF. (#7042)
* Add support in DMatrix.
* Add support in DQM, except for iterator.
2021-06-17 13:54:33 +08:00
Jiaming Yuan
b56614e9b8
[R] Use new predict function. (#6819)
* Call new C prediction API.
* Add `strict_shape`.
* Add `iterationrange`.
* Update document.
2021-06-11 13:03:29 +08:00
Jiaming Yuan
c4b9f4f622
Add enable_categorical to sklearn. (#7011) 2021-06-04 02:29:14 +08:00
Jiaming Yuan
ee4f51a631
Support for all primitive types from array. (#7003)
* Change C API name.
* Test for all primitive types from array.
* Add native support for CPU 128 float.
* Convert boolean and float16 in Python.

* Fix dask version for now.
2021-06-01 08:34:48 +08:00
Jiaming Yuan
816b789bf0
Add predictor to skl constructor. (#7000) 2021-05-29 04:52:56 +08:00
Jiaming Yuan
89a49cf30e
Fix dask predict on DaskDMatrix with iteration_range. (#7005) 2021-05-29 04:43:12 +08:00
Jiaming Yuan
4cf95a6041
Support numpy array interface (#6998) 2021-05-27 16:08:22 +08:00
Jiaming Yuan
ab6fd304c4
[Python] Change development release postfix to dev (#6988) 2021-05-27 16:06:51 +08:00
Jiaming Yuan
86e60e3ba8
Guard against index error in prediction. (#6982)
* Remove `best_ntree_limit` from documents.
2021-05-25 23:24:59 +08:00
Mads R. B. Kristensen
81bdfb835d
lazy_isinstance(): use .__class__ for type check (#6974) 2021-05-21 11:33:08 +08:00
Jiaming Yuan
7e846bb965
Fix prediction on df with latest dask. (#6969) 2021-05-19 12:23:03 +08:00
Jiaming Yuan
d245bc891e
Add tolerance to early stopping. (#6942) 2021-05-14 00:19:51 +08:00
Jiaming Yuan
05ac415780
[dask] Set dataframe index in predict. (#6944) 2021-05-12 13:24:21 +08:00
vslaykovsky
2a9979e256
Fixed incorrect feature mismatch error message (#6949)
data.shape[0] denotes the number of samples, data.shape[1] is the number of features
2021-05-11 13:52:11 +08:00
Daniel Saxton
e41619b1fc
Link to valid tree_method values in docs (#6935) 2021-05-06 17:33:18 +08:00
Jose Manuel Llorens
4ddbaeea32
Improve warning when using np.ndarray subsets (#6934) 2021-05-04 13:24:41 +08:00
Jiaming Yuan
37ad60fe25
Enforce input data is not object. (#6927)
* Check for object data type.
* Allow strided arrays with greater underlying buffer size.
2021-05-02 00:09:01 +08:00
Jiaming Yuan
ef473b1f09
Disable pylint error. (#6911) 2021-04-29 01:01:37 +08:00
Jiaming Yuan
a2ecbdaa31
Add an API guard to prevent global variables being changed. (#6891) 2021-04-23 10:27:57 +08:00
Jiaming Yuan
233bdf105f
Remove setDaemon in tracker. (#6872) 2021-04-22 01:57:13 +08:00
Jiaming Yuan
146549260a
Bump version to 1.5.0 snapshot in master. (#6875) 2021-04-22 01:53:44 +08:00
Jiaming Yuan
a5d7094a45
Update documents. (#6856)
* Add early stopping section to prediction doc.
* Remove best_ntree_limit.
* Better doxygen output.
2021-04-16 12:41:03 +08:00
Jiaming Yuan
dee5ef2dfd
Typehint for Sklearn. (#6799) 2021-04-14 06:55:21 +08:00
giladmaya
aa0d8f20c1
Support configuring constraints by feature names (#6783)
Co-authored-by: fis <jm.yuan@outlook.com>
2021-04-04 06:53:33 +08:00
Jiaming Yuan
7e06c81894
Fix approximated predict contribution. (#6811) 2021-04-03 02:15:03 +08:00