Jiaming Yuan
154b15060e
Move callbacks from fit to __init__. ( #7375 )
2021-11-02 17:51:42 +08:00
Jiaming Yuan
a13321148a
Support multi-class with base margin. ( #7381 )
...
This was already partially supported but never properly tested, so the only way to use it was to call `numpy.ndarray.flatten` on `base_margin` before passing it into XGBoost. This PR adds proper support for most of the data types, along with tests.
2021-11-02 13:38:00 +08:00
Jiaming Yuan
0f7a9b42f1
Use double precision in metric calculation. ( #7364 )
2021-11-02 12:00:32 +08:00
Jiaming Yuan
c6769488b3
Typehint for subset of core API. ( #7348 )
2021-10-28 20:47:04 +08:00
Jiaming Yuan
45aef75cca
Move skl eval_metric and early_stopping rounds to model params. ( #6751 )
...
A new parameter `custom_metric` is added to `train` and `cv` to distinguish its behaviour from the old `feval`, which is now deprecated. The new `custom_metric` receives transformed predictions when a built-in objective is used. This enables XGBoost to use cost functions from other libraries like scikit-learn directly, without going through the definition of the link function.
`eval_metric` and `early_stopping_rounds` in the sklearn interface are moved from `fit` to `__init__` and are now saved as part of the scikit-learn model. The old ones in the `fit` function are now deprecated. The new `eval_metric` in `__init__` has the same new behaviour as `custom_metric`.
Added more detailed documentation for the behaviour of custom objectives and metrics.
2021-10-28 17:20:20 +08:00
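The distinction drawn in #6751 between a metric that sees transformed predictions and one that has to apply the link function itself can be sketched in plain Python. This is only an illustration, assuming a logistic link and assuming the old-style metric sees untransformed margins. The helper names `metric_on_margin` and `metric_on_probability` are hypothetical, and their signatures are simplified; they are not XGBoost's callback signatures.

```python
import math


def sigmoid(margin: float) -> float:
    """Logistic link: map a raw margin to a probability."""
    return 1.0 / (1.0 + math.exp(-margin))


def log_loss(labels, probs):
    """Mean binary log loss computed on probabilities."""
    eps = 1e-15
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def metric_on_margin(margins, labels):
    """Hypothetical old-style metric: it must apply the link itself."""
    return log_loss(labels, [sigmoid(m) for m in margins])


def metric_on_probability(probs, labels):
    """Hypothetical new-style metric: predictions arrive already transformed."""
    return log_loss(labels, probs)


labels = [1, 0, 1]
raw_margins = [2.0, -1.0, 0.5]
transformed = [sigmoid(m) for m in raw_margins]

old_style = metric_on_margin(raw_margins, labels)
new_style = metric_on_probability(transformed, labels)
```

Both functions return the same value here. The second one never mentions the link function, which is why a probability-based cost function from another library could be passed in unchanged.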
Jiaming Yuan
3c4aa9b2ea
[breaking] Remove label encoder deprecated in 1.3. ( #7357 )
2021-10-28 13:24:29 +08:00
Jiaming Yuan
ac9bfaa4f2
Handle missing values in dataframe with category dtype. ( #7331 )
...
* Replace -1 in pandas initializer.
* Unify `IsValid` functor.
* Mimic pandas data handling in cuDF glue code.
* Check invalid categories.
* Fix DDM sketching.
2021-10-28 03:33:54 +08:00
Jiaming Yuan
2eee87423c
Remove old custom objective demo. ( #7369 )
...
We have two new custom objective demos covering both regression and classification, with accompanying tutorials in the documentation.
2021-10-27 16:31:48 +08:00
Jiaming Yuan
d4349426d8
Re-implement PR-AUC. ( #7297 )
...
* Support binary/multi-class classification, ranking.
* Add documents.
* Handle missing data.
2021-10-26 13:07:50 +08:00
Jiaming Yuan
f999897615
[dask] Use nthread in DMatrix construction. ( #7337 )
...
This is consistent with the thread overriding behavior.
2021-10-20 15:16:40 +08:00
Jiaming Yuan
f53da412aa
Add typehint to tracker. ( #7338 )
2021-10-20 12:49:36 +08:00
Jiaming Yuan
298af6f409
Fix weighted samples in multi-class AUC. ( #7300 )
2021-10-11 15:12:29 +08:00
Jiaming Yuan
69d3b1b8b4
Remove old callback deprecated in 1.3. ( #7280 )
2021-10-08 17:24:59 +08:00
Jiaming Yuan
578de9f762
Fix cv verbose_eval ( #7291 )
2021-10-08 12:28:38 +08:00
Jiaming Yuan
d8cb395380
Fix gamma neg log likelihood. ( #7275 )
2021-10-05 16:57:08 +08:00
Jiaming Yuan
c735c17f33
Disable callback and ES on random forest. ( #7236 )
2021-09-17 18:21:17 +08:00
Jiaming Yuan
22d56cebf1
Encode pandas categorical data automatically. ( #7231 )
2021-09-17 11:09:55 +08:00
Jiaming Yuan
0ed979b096
Support more input types for categorical data. ( #7220 )
...
* Support more input types for categorical data.
* Shorten the type name from "categorical" to "c".
* Tests for np/cp array and scipy csr/csc/coo.
* Specify the type for feature info.
2021-09-16 20:39:30 +08:00
Jiaming Yuan
3f38d983a6
Fix prediction configuration. ( #7159 )
...
After the predictor parameter was added to the constructor, this configuration was broken.
2021-08-11 16:34:36 +08:00
Jiaming Yuan
8a84be37b8
Pass scikit learn estimator checks for regressor. ( #7130 )
...
* Check data shape.
* Check labels.
2021-08-03 18:58:20 +08:00
Jiaming Yuan
e2c406f5c8
Support min_delta in early stopping. ( #7137 )
...
* Support `min_delta` in early stopping.
* Remove abs_tol.
2021-08-03 14:29:17 +08:00
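The `min_delta` idea from #7137 can be illustrated with a small, library-free sketch: an evaluation score counts as an improvement only if it beats the best score so far by more than `min_delta`. The function `stopping_round` and its parameters are hypothetical names chosen for this example, and the strict comparison is an assumption; this is not XGBoost's implementation.

```python
from typing import Optional, Sequence


def stopping_round(
    scores: Sequence[float],
    rounds: int,
    min_delta: float = 0.0,
    maximize: bool = False,
) -> Optional[int]:
    """Return the index at which training would stop, or None if it never does.

    Training stops once `rounds` consecutive evaluations fail to improve
    on the best score by more than `min_delta`.
    """
    best = scores[0]
    stale = 0  # consecutive evaluations without a sufficient improvement
    for i, score in enumerate(scores[1:], start=1):
        gain = (score - best) if maximize else (best - score)
        if gain > min_delta:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= rounds:
                return i
    return None


# A loss curve that keeps improving, but only marginally after round 1.
losses = [0.50, 0.40, 0.399, 0.398, 0.397, 0.30]

without_threshold = stopping_round(losses, rounds=2)
with_threshold = stopping_round(losses, rounds=2, min_delta=0.01)
```

With `min_delta=0.0` every tiny decrease resets the counter and training runs to the end; with `min_delta=0.01` the marginal decreases are ignored and the sketch stops at index 3.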
Jiaming Yuan
d080b5a953
Fix model slicing. ( #7149 )
...
* Use correct pointer.
* Remove best_iteration/best_score.
2021-08-03 11:51:56 +08:00
Jiaming Yuan
7ee7a95b84
Use upstream URI in distributed quantile tests. ( #7129 )
...
* Use upstream URI in distributed quantile tests.
* Fix test cv `PytestAssertRewriteWarning`.
2021-07-27 14:09:49 +08:00
Jiaming Yuan
e88ac9cc54
[dask] Extend tree stats tests. ( #7128 )
...
* Add tests to GPU.
* Assert cover in children sums up to the parent.
2021-07-27 12:22:13 +08:00
Jiaming Yuan
778135f657
Fix parameter loading with training continuation. ( #7121 )
...
* Add a demo for training continuation.
2021-07-23 10:51:47 +08:00
ShvetsKS
caa9e527dd
Remove extra sync for dense data ( #7120 )
...
Co-authored-by: SHVETS, KIRILL <kirill.shvets@intel.com>
2021-07-22 19:02:31 +08:00
Jiaming Yuan
e6088366df
Export Python Interface for external memory. ( #7070 )
...
* Add Python iterator interface.
* Add tests.
* Add demo.
* Add documents.
* Handle empty dataset.
2021-07-22 15:15:53 +08:00
Jiaming Yuan
d7e1fa7664
Fix feature names and types in output model slice. ( #7078 )
2021-07-06 11:47:49 +08:00
Jiaming Yuan
ffa66aace0
Persist data in dask test. ( #7077 )
2021-07-06 11:47:17 +08:00
Jiaming Yuan
663136aa08
Implement feature score for linear model. ( #7048 )
...
* Add feature score support for linear model.
* Port R interface to the new implementation.
* Add linear model support in Python.
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2021-06-25 14:34:02 +08:00
Jiaming Yuan
da1ad798ca
Convert numpy float to Python float in feat score. ( #7047 )
2021-06-21 20:58:43 +08:00
Jiaming Yuan
7968c0d051
Test on s390x. ( #7038 )
...
* Fix && remove unused parameter.
2021-06-18 14:55:08 +08:00
Jiaming Yuan
86715e4cd4
Support categorical data for dask functional interface and DQM. ( #7043 )
...
* Support categorical data for dask functional interface and DQM.
* Implement categorical data support for GPU GK-merge.
* Add support for dask functional interface.
* Add support for DQM.
* Get newer cupy.
2021-06-18 13:06:52 +08:00
Jiaming Yuan
7dd29ffd47
Implement feature score in GBTree. ( #7041 )
...
* Categorical data support.
* Eliminate text parsing during feature score computation.
2021-06-18 11:53:16 +08:00
Jiaming Yuan
d9799b09d0
Categorical data support for cuDF. ( #7042 )
...
* Add support in DMatrix.
* Add support in DQM, except for iterator.
2021-06-17 13:54:33 +08:00
jmoralez
25514e104a
[dask] speed up tests ( #7020 )
2021-06-11 11:43:01 +08:00
Jiaming Yuan
72f9daf9b6
Fix gpu_id with custom objective. ( #7015 )
2021-06-09 14:51:17 +08:00
Jiaming Yuan
ee4f51a631
Support for all primitive types from array. ( #7003 )
...
* Change C API name.
* Test for all primitive types from array.
* Add native support for 128-bit float on CPU.
* Convert boolean and float16 in Python.
* Fix dask version for now.
2021-06-01 08:34:48 +08:00
Jiaming Yuan
89a49cf30e
Fix dask predict on DaskDMatrix with iteration_range. ( #7005 )
2021-05-29 04:43:12 +08:00
Jiaming Yuan
ab6fd304c4
[Python] Change development release postfix to dev ( #6988 )
2021-05-27 16:06:51 +08:00
Jiaming Yuan
86e60e3ba8
Guard against index error in prediction. ( #6982 )
...
* Remove `best_ntree_limit` from documents.
2021-05-25 23:24:59 +08:00
Jiaming Yuan
d245bc891e
Add tolerance to early stopping. ( #6942 )
2021-05-14 00:19:51 +08:00
Jiaming Yuan
44cc9c04ea
Fix multiclass auc with empty dataset. ( #6947 )
2021-05-12 15:01:14 +08:00
Jiaming Yuan
05ac415780
[dask] Set dataframe index in predict. ( #6944 )
2021-05-12 13:24:21 +08:00
Jiaming Yuan
37ad60fe25
Enforce input data is not object. ( #6927 )
...
* Check for object data type.
* Allow strided arrays with greater underlying buffer size.
2021-05-02 00:09:01 +08:00
Jiaming Yuan
a1d23f6613
Relax test for decision stump in distributed environment. ( #6919 )
2021-04-30 09:04:11 +08:00
Jiaming Yuan
45ddc39c1d
Relax shotgun test. ( #6918 )
2021-04-30 09:03:12 +08:00
Jiaming Yuan
b31d37eac5
[CI] Fix custom metric test with empty dataset. ( #6917 )
2021-04-30 09:00:05 +08:00
Jiaming Yuan
8760ec4827
Ensure predict leaf outputs a 1-dim vector when there's only 1 tree. ( #6889 )
2021-04-23 15:07:48 +08:00
Jiaming Yuan
54afa3ac7a
Relax shotgun test. ( #6900 )
...
It's a non-deterministic algorithm, so the test is flaky.
2021-04-23 13:01:44 +08:00