866 Commits

Author SHA1 Message Date
Jiaming Yuan
6b074add66
Update setup.py. (#7360)
* Add new classifiers.
* Typehint.
2021-10-28 14:58:31 +08:00
Jiaming Yuan
3c4aa9b2ea
[breaking] Remove label encoder deprecated in 1.3. (#7357) 2021-10-28 13:24:29 +08:00
Jiaming Yuan
ac9bfaa4f2
Handle missing values in dataframe with category dtype. (#7331)
* Replace -1 in pandas initializer.
* Unify `IsValid` functor.
* Mimic pandas data handling in cuDF glue code.
* Check invalid categories.
* Fix DDM sketching.
2021-10-28 03:33:54 +08:00
Jiaming Yuan
f999897615
[dask] Use nthread in DMatrix construction. (#7337)
This is consistent with the thread overriding behavior.
2021-10-20 15:16:40 +08:00
Jiaming Yuan
376b448015
[doc] Fix broken links. (#7341)
* Fix most of the link checks from sphinx.
* Remove duplicate explicit target name.
2021-10-20 14:45:30 +08:00
Jiaming Yuan
f53da412aa
Add typehint to tracker. (#7338) 2021-10-20 12:49:36 +08:00
Jiaming Yuan
c42e3fbcf3
[doc] Fix early stopping document. (#7334) 2021-10-18 11:21:16 -07:00
Jiaming Yuan
f56e2e9a66
Support categorical data with pandas Dataframe in inplace prediction (#7322) 2021-10-17 14:32:06 +08:00
Jiaming Yuan
5b17bb0031
Fix prediction with cat data in sklearn interface. (#7306)
* Specify DMatrix parameter for pre-processing dataframe.
* Add document about the behaviour of prediction.
2021-10-12 14:31:12 +08:00
Jiaming Yuan
69d3b1b8b4
Remove old callback deprecated in 1.3. (#7280) 2021-10-08 17:24:59 +08:00
Jiaming Yuan
578de9f762
Fix cv verbose_eval (#7291) 2021-10-08 12:28:38 +08:00
Jiaming Yuan
f7caac2563
Bump version to 1.6.0 in master. (#7259) 2021-10-07 16:09:26 +08:00
Jiaming Yuan
ca17f8a5fc
Dispatch thrust versions and upgrade rmm. (#7254)
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2021-09-25 03:43:23 +08:00
Jiaming Yuan
9472be7d77
Fix initialization from pandas series. (#7243) 2021-09-23 04:43:25 +08:00
Jiaming Yuan
e48e05e6e2
Add typehint to rabit module. (#7240) 2021-09-17 18:31:02 +08:00
Jiaming Yuan
c735c17f33
Disable callback and ES on random forest. (#7236) 2021-09-17 18:21:17 +08:00
Jiaming Yuan
b18f5f61b0
Fix pylint (#7241) 2021-09-17 11:50:36 +08:00
Jiaming Yuan
22d56cebf1
Encode pandas categorical data automatically. (#7231) 2021-09-17 11:09:55 +08:00
Jiaming Yuan
0ed979b096
Support more input types for categorical data. (#7220)
* Shorten the type name from "categorical" to "c".
* Tests for np/cp array and scipy csr/csc/coo.
* Specify the type for feature info.
2021-09-16 20:39:30 +08:00
Jiaming Yuan
037dd0820d
Implement __sklearn_is_fitted__. (#7230) 2021-09-15 19:09:04 +08:00
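For context on the commit above: scikit-learn's `check_is_fitted` consults a `__sklearn_is_fitted__` method when an estimator defines one. The following is a minimal sketch of that hook, assuming a hypothetical `TinyEstimator` class and a made-up `_model` attribute. It is not the XGBoost implementation.

```python
class TinyEstimator:
    """Hypothetical estimator used only to illustrate the hook."""

    def __init__(self):
        # Assumption: fitted state lives in a single private attribute.
        self._model = None

    def fit(self, X, y):
        # Stand-in for training: remember the mean of the labels.
        self._model = sum(y) / len(y)
        return self

    def __sklearn_is_fitted__(self) -> bool:
        # Report fitted state without relying on trailing-underscore attributes.
        return self._model is not None


est = TinyEstimator()
before = est.__sklearn_is_fitted__()
est.fit([[1.0], [2.0]], [1.0, 3.0])
after = est.__sklearn_is_fitted__()
```

Here `before` is `False` and `after` is `True`. A hook like this lets an estimator whose state is held in an internal object answer the fitted check directly.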
Jiaming Yuan
804b2ac60f
Expose DMatrix API for CUDA columnar and array. (#7217)
* Use JSON encoded configurations.
* Expose them into header file.
2021-09-09 17:55:25 +08:00
Jiaming Yuan
ee8d1f5ed8
Fix histogram truncation. (#7181)
* Fix truncation.

* Lint.

* lint.
2021-08-24 18:34:32 -07:00
Jiaming Yuan
3290a4f3ed
Re-enable feature validation in predict proba. (#7177) 2021-08-22 15:28:08 +08:00
Jiaming Yuan
3f38d983a6
Fix prediction configuration. (#7159)
After the predictor parameter was added to the constructor, this configuration was broken.
2021-08-11 16:34:36 +08:00
Jiaming Yuan
8a84be37b8
Pass scikit learn estimator checks for regressor. (#7130)
* Check data shape.
* Check labels.
2021-08-03 18:58:20 +08:00
Jiaming Yuan
e2c406f5c8
Support min_delta in early stopping. (#7137)
* Support `min_delta` in early stopping.

* Remove abs_tol.
2021-08-03 14:29:17 +08:00
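A `min_delta` threshold in early stopping usually means that a new score counts as an improvement only if it beats the best score so far by more than that margin. The sketch below illustrates the general idea under that assumption. The function name `stopping_round` and its parameters are hypothetical, and this is not the code from the commit above.

```python
from typing import List, Optional


def stopping_round(
    scores: List[float],
    rounds: int,
    min_delta: float = 0.0,
    maximize: bool = False,
) -> Optional[int]:
    """Return the index at which training would stop, or None if it never stops."""
    if not scores:
        return None
    best = scores[0]
    stale = 0  # consecutive rounds without a sufficient improvement
    for i, score in enumerate(scores[1:], start=1):
        gain = (score - best) if maximize else (best - score)
        if gain > min_delta:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= rounds:
                return i
    return None


losses = [1.0, 0.9, 0.895, 0.893, 0.892]
no_threshold = stopping_round(losses, rounds=2)
with_threshold = stopping_round(losses, rounds=2, min_delta=0.01)
```

With no threshold every round improves slightly, so `no_threshold` is `None`. With `min_delta=0.01` the last small gains no longer count, so `with_threshold` is `3`.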
Jiaming Yuan
d080b5a953
Fix model slicing. (#7149)
* Use correct pointer.
* Remove best_iteration/best_score.
2021-08-03 11:51:56 +08:00
Jiaming Yuan
1369133916
[dask] Remove the workaround for segfault. (#7146) 2021-07-30 03:57:53 +08:00
graue70
dfdf0b08fc
Fix typo and grammatical mistake in error message (#7134) 2021-07-28 17:17:05 +08:00
Gil Forsyth
92ae3abc97
[dask] Disallow importing non-dask estimators from xgboost.dask (#7133)
* Disallow importing non-dask estimators from xgboost.dask

This is mostly a style change, but it also avoids a user error (one that I have
committed on a few occasions). Since `XGBRegressor` and `XGBClassifier`
are imported as parent classes for the `dask` estimators without
defining an `__all__`, autocomplete (or muscle memory) will produce the
following with little prompting:

```
from xgboost.dask import XGBClassifier
```

There's nothing inherently wrong with that, but given that
`XGBClassifier` is not `dask` enabled, it can lead to confusing behavior
until you figure out you should've typed

```
from xgboost.dask import DaskXGBClassifier
```

Another option is to alias import the existing non-dask estimators.

* Remove base/iter class, add train predict funcs
2021-07-28 02:07:23 +08:00
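The commit message above mentions two ways to keep parent classes out of a module's public surface: defining `__all__` and alias-importing the non-dask estimators. The sketch below combines both using in-memory stand-in modules. The module names `base_estimators` and `dask_like` are hypothetical, and this is not the actual `xgboost.dask` source.

```python
import sys
import types

# Hypothetical stand-in for the module that defines the non-dask estimator.
BASE_SRC = """
class XGBClassifier:
    pass
"""

# Hypothetical stand-in for a dask module: the parent class is imported under
# a private alias, and the public names are listed explicitly in __all__.
DASK_SRC = """
from base_estimators import XGBClassifier as _XGBClassifier

__all__ = ["DaskXGBClassifier"]


class DaskXGBClassifier(_XGBClassifier):
    pass
"""


def make_module(name: str, source: str) -> types.ModuleType:
    """Create a module from source text and register it for import."""
    module = types.ModuleType(name)
    sys.modules[name] = module
    exec(source, module.__dict__)
    return module


base = make_module("base_estimators", BASE_SRC)
dask_like = make_module("dask_like", DASK_SRC)

# Names without a leading underscore are what autocomplete would typically offer.
public_names = [n for n in dir(dask_like) if not n.startswith("_")]
leaks_parent = hasattr(dask_like, "XGBClassifier")
```

In this sketch `public_names` contains only `DaskXGBClassifier` and `leaks_parent` is `False`, so `from dask_like import XGBClassifier` raises `ImportError`. Note that `__all__` on its own only governs `from module import *`. It is the private alias that removes the name from the module.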
Jiaming Yuan
7017dd5a26
[JVM-Packages] Use Python tracker in XGBoost for JVM package. (#7132) 2021-07-27 16:20:42 +08:00
Jiaming Yuan
778135f657
Fix parameter loading with training continuation. (#7121)
* Add a demo for training continuation.
2021-07-23 10:51:47 +08:00
Jiaming Yuan
e6088366df
Export Python Interface for external memory. (#7070)
* Add Python iterator interface.
* Add tests.
* Add demo.
* Add documents.
* Handle empty dataset.
2021-07-22 15:15:53 +08:00
Jiaming Yuan
2f524e9f41
[dask] Work around segfault in prediction. (#7112) 2021-07-16 04:27:05 +08:00
Jiaming Yuan
5d7cdf2e36
[Breaking] Rename Quantile DMatrix C API. (#7082)
The role of ProxyDMatrix has grown beyond what it was designed for. It is now used by both
QuantileDeviceDMatrix and inplace prediction, and after the refactoring of sparse DMatrix it
will also be used for external memory. This renames the C API to decouple it from
QuantileDeviceDMatrix.
2021-07-08 11:34:14 +08:00
Jiaming Yuan
f937f514aa
Remove lz4 compression with external memory. (#7076) 2021-07-06 14:46:43 +08:00
Jiaming Yuan
b56d3d5d5c
Fix with latest pandas range index. (#7074) 2021-07-03 16:43:52 +08:00
Jiaming Yuan
93f3acdef9
Fix with latest pylint. (#7071) 2021-07-02 21:26:00 +08:00
Jiaming Yuan
a5d222fcdb
Handle categorical split in model histogram and dataframe. (#7065)
* Error on get_split_value_histogram when feature is categorical
* Add a category column to output dataframe
2021-07-02 13:10:36 +08:00
Philip Hyunsu Cho
dd4db347f3
Fix early stopping behavior with MAPE metric (#7061) 2021-06-26 03:02:33 +08:00
Jiaming Yuan
663136aa08
Implement feature score for linear model. (#7048)
* Add feature score support for linear model.
* Port R interface to the new implementation.
* Add linear model support in Python.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2021-06-25 14:34:02 +08:00
Jiaming Yuan
1d4d345634
Tests for dask skl categorical data support. (#7054) 2021-06-24 16:33:57 +08:00
Jiaming Yuan
da1ad798ca
Convert numpy float to Python float in feat score. (#7047) 2021-06-21 20:58:43 +08:00
Jiaming Yuan
29f8fd6fee
Support categorical split in tree model dump. (#7036) 2021-06-18 16:46:20 +08:00
Jiaming Yuan
86715e4cd4
Support categorical data for dask functional interface and DQM. (#7043)
* Implement categorical data support for GPU GK-merge.
* Add support for dask functional interface.
* Add support for DQM.

* Get newer cupy.
2021-06-18 13:06:52 +08:00
Jiaming Yuan
7dd29ffd47
Implement feature score in GBTree. (#7041)
* Categorical data support.
* Eliminate text parsing during feature score computation.
2021-06-18 11:53:16 +08:00
Jiaming Yuan
d9799b09d0
Categorical data support for cuDF. (#7042)
* Add support in DMatrix.
* Add support in DQM, except for iterator.
2021-06-17 13:54:33 +08:00
Jiaming Yuan
b56614e9b8
[R] Use new predict function. (#6819)
* Call new C prediction API.
* Add `strict_shape`.
* Add `iterationrange`.
* Update document.
2021-06-11 13:03:29 +08:00
Jiaming Yuan
c4b9f4f622
Add enable_categorical to sklearn. (#7011) 2021-06-04 02:29:14 +08:00
Jiaming Yuan
ee4f51a631
Support for all primitive types from array. (#7003)
* Change C API name.
* Test for all primitive types from array.
* Add native support for CPU 128 float.
* Convert boolean and float16 in Python.

* Fix dask version for now.
2021-06-01 08:34:48 +08:00