Jiaming Yuan
a013942649
Check number of trees in inplace predict. ( #7409 ) ( #7424 )
2021-11-12 19:31:31 +08:00
Jiaming Yuan
4d2ea0d4ef
[backport] [doc] Fix broken links. ( #7341 ) ( #7418 )
...
* Fix most of the link checks from sphinx.
* Remove duplicate explicit target name.
2021-11-11 19:33:02 +08:00
Jiaming Yuan
d1052b5cfe
[jvm-packages] Fix json4s binary compatibility issue ( #7376 ) ( #7414 )
...
Spark 3.2 depends on json4s 3.7.0-M11, which changed the signatures of some
implicit functions. As a result, xgboost4j built against Spark 3.0/3.1
fails when saving the model.
Co-authored-by: Bobby Wang <wbo4958@gmail.com>
2021-11-10 21:25:11 +08:00
Jiaming Yuan
14c56f05da
[backport] Handle missing values in dataframe with category dtype. ( #7331 ) ( #7413 )
...
* Handle missing values in dataframe with category dtype. (#7331 )
* Replace -1 in pandas initializer.
* Unify `IsValid` functor.
* Mimic pandas data handling in cuDF glue code.
* Check invalid categories.
* Fix DDM sketching.
* Fix pick error.
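
For context on the "Replace -1" item: pandas marks a missing entry in a categorical column with the code -1. The pure-Python sketch below only mimics that convention and shows why the sentinel has to be turned into a missing value rather than passed on as a category. The helper names `encode_categories` and `codes_to_features` are hypothetical and are not taken from the XGBoost source.

```python
import math

def encode_categories(values):
    """Map each distinct non-missing value to an integer code.

    Missing entries (None here) get the sentinel code -1,
    mirroring the convention pandas uses for categorical codes.
    """
    categories = sorted({v for v in values if v is not None})
    index = {cat: code for code, cat in enumerate(categories)}
    codes = [index[v] if v is not None else -1 for v in values]
    return codes, categories

def codes_to_features(codes):
    """Replace the -1 sentinel with NaN so it reads as missing, not as a category."""
    return [float("nan") if c == -1 else float(c) for c in codes]

codes, categories = encode_categories(["b", None, "a", "b"])
features = codes_to_features(codes)
```

Without the replacement step, -1 would look like one more (invalid) category index to any consumer of the encoded column.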
2021-11-10 21:24:46 +08:00
Jiaming Yuan
11f8b5cfcd
[backport] Support building with CTK11.5. ( #7379 ) ( #7411 )
...
* Support building with CTK11.5.
* Require system cub installation for CTK11.4+.
* Check thrust version for segmented sort.
2021-11-10 19:23:29 +08:00
Jiaming Yuan
e7ac2486eb
[backport] [R] Fix global feature importance and predict with 1 sample. ( #7394 ) ( #7397 )
...
* [R] Fix global feature importance.
* Add implementation for tree index. The parameter is not documented in the C API since we
should work on porting model slicing to R instead of supporting more uses of the tree
index.
* Fix the difference between "gain" and "total_gain".
* debug.
* Fix prediction.
2021-11-06 00:07:36 +08:00
Jiaming Yuan
a3d195e73e
Handle OMP_THREAD_LIMIT. ( #7390 ) ( #7391 )
2021-11-03 20:25:51 +08:00
Jiaming Yuan
fab3c05ced
Move macos test to github action. ( #7382 ) ( #7392 )
...
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2021-11-03 18:39:47 +08:00
Jiaming Yuan
584b45a9cc
Release 1.5.0. ( #7317 )
v1.5.0
2021-10-15 12:21:04 +08:00
Jiaming Yuan
30c1b5c54c
[backport] Fix prediction with cat data in sklearn interface. ( #7306 ) ( #7312 )
...
* Specify DMatrix parameter for pre-processing dataframe.
* Add document about the behaviour of prediction.
2021-10-12 18:49:57 +08:00
Jiaming Yuan
36e247aca4
Fix weighted samples in multi-class AUC. ( #7300 ) ( #7305 )
2021-10-11 18:00:36 +08:00
Jiaming Yuan
c4aff733bb
[backport] Fix cv verbose_eval ( #7291 ) ( #7296 )
2021-10-08 14:24:27 +08:00
Jiaming Yuan
cdbfd21d31
[backport] Fix gamma neg log likelihood. ( #7275 ) ( #7285 )
2021-10-05 23:01:11 +08:00
Jiaming Yuan
508a0b0dbd
[backport] [R] Fix document for nthread. ( #7263 ) ( #7269 )
2021-09-28 14:41:32 +08:00
Jiaming Yuan
e04e773f9f
Add RC1 tag for building packages. ( #7261 )
2021-09-28 11:50:18 +08:00
Jiaming Yuan
1debabb321
Change version to 1.5.0. ( #7258 )
v1.5.0rc1
2021-09-26 13:27:54 +08:00
Jiaming Yuan
d8a549e6ac
Avoid thread block with sparse data. ( #7255 )
2021-09-25 13:11:34 +08:00
Jiaming Yuan
ca17f8a5fc
Dispatch thrust versions and upgrade rmm. ( #7254 )
...
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2021-09-25 03:43:23 +08:00
Jiaming Yuan
fbd58bf190
[jvm-packages] Create demo and test for xgboost4j early stopping. ( #7252 )
2021-09-25 03:29:27 +08:00
Bobby Wang
0ee11dac77
[jvm-packages][xgboost4j-gpu] Support GPU dataframe and DeviceQuantileDMatrix ( #7195 )
...
The following classes are added to support dataframes in the Java binding:
- `Column` is an abstract type for a single column in tabular data.
- `ColumnBatch` is an abstract type for dataframe.
- `CuDFColumn` is an implementation of `Column` that consumes a cuDF column.
- `CudfColumnBatch` is an implementation of `ColumnBatch` that consumes cuDF dataframe.
- `DeviceQuantileDMatrix` is the interface for quantized data.
The Java implementation mimics the Python interface and uses the `__cuda_array_interface__` protocol for memory indexing. One difference is that in the JVM package the data batches are staged on the host, because Java iterators cannot be reset.
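
The staging idea can be illustrated with a minimal Python sketch, since Python generators are one-shot in the same way. This is not the xgboost4j code: `StagedBatches` and `one_shot_batches` are hypothetical names, and the batches are plain lists rather than cuDF data.

```python
from typing import Generic, Iterator, List, TypeVar

T = TypeVar("T")

class StagedBatches(Generic[T]):
    """Hypothetical helper: copy a one-shot iterator into a list so it can be replayed."""

    def __init__(self, source: Iterator[T]) -> None:
        # Consume the source exactly once and keep every batch in memory.
        self._batches: List[T] = list(source)

    def __iter__(self) -> Iterator[T]:
        # Each call starts a fresh pass over the staged batches.
        return iter(self._batches)

def one_shot_batches() -> Iterator[List[int]]:
    """A generator that can be consumed only once, like a non-resettable iterator."""
    yield from ([1, 2], [3, 4], [5])

staged = StagedBatches(one_shot_batches())
first_pass = [sum(batch) for batch in staged]   # first traversal
second_pass = [len(batch) for batch in staged]  # second traversal still sees all batches
```

The trade-off is memory: every batch stays resident until the consumer has finished all of its passes.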
Co-authored-by: jiamingy <jm.yuan@outlook.com>
2021-09-24 14:25:00 +08:00
Philip Hyunsu Cho
d27a427dc5
[CI] Rotate access keys for uploading MacOS artifacts from Travis CI ( #7253 )
2021-09-24 10:44:00 +08:00
ShvetsKS
475fd1abec
Reduced span overheads in objective function calculate ( #7206 )
...
Co-authored-by: fis <jm.yuan@outlook.com>
2021-09-23 04:43:59 +08:00
Jiaming Yuan
9472be7d77
Fix initialization from pandas series. ( #7243 )
2021-09-23 04:43:25 +08:00
david-cortes
4f93e5586a
Improve wording for warning ( #7248 )
...
This warning sounds a bit ungrammatical. Additionally, the second part of the warning is not clear. This PR changes the wording to make it clearer.
2021-09-21 10:48:11 +08:00
Jiaming Yuan
18bd16341a
Update Python intro. [skip ci] ( #7235 )
...
* Fix the link to demo.
* Stop recommending text file inputs.
* Briefly mention the scikit-learn interface.
* Fix indent warning in tree method doc.
2021-09-21 02:47:09 +00:00
david-cortes
61a619b5c3
[R] Avoid symbol naming conflicts with other packages ( #7245 )
...
* don't register all R symbols
* typo
2021-09-19 11:17:08 -07:00
Jiaming Yuan
e48e05e6e2
Add typehint to rabit module. ( #7240 )
2021-09-17 18:31:02 +08:00
Jiaming Yuan
c735c17f33
Disable callback and ES on random forest. ( #7236 )
2021-09-17 18:21:17 +08:00
Jiaming Yuan
c311a8c1d8
Enable compiling with system cub. ( #7232 )
...
- Tested with all CUDA 11.x.
- Workaround cub scan by using discard iterator in AUC.
- Limit the size of Argsort when compiled with CUDA cub.
2021-09-17 14:28:18 +08:00
Jiaming Yuan
b18f5f61b0
Fix pylint ( #7241 )
2021-09-17 11:50:36 +08:00
Jiaming Yuan
38a23f66a8
Fix typo in release script. [skip ci] ( #7238 )
2021-09-17 11:14:05 +08:00
Jiaming Yuan
8ad7e8eeb0
[doc] Fix typo. [skip ci] ( #7226 )
2021-09-17 11:13:49 +08:00
Jiaming Yuan
22d56cebf1
Encode pandas categorical data automatically. ( #7231 )
2021-09-17 11:09:55 +08:00
Jiaming Yuan
32e0858501
Fix travis. ( #7237 )
2021-09-17 10:06:23 +08:00
Jiaming Yuan
31c1e13f90
Categorical data support in CPU sketching. ( #7221 )
2021-09-17 04:37:09 +08:00
Jiaming Yuan
9f63d6fead
[jvm-packages] Deprecate constructors with implicit missing value. ( #7225 )
2021-09-17 04:35:04 +08:00
Jiaming Yuan
0ed979b096
Support more input types for categorical data. ( #7220 )
...
* Support more input types for categorical data.
* Shorten the type name from "categorical" to "c".
* Tests for np/cp array and scipy csr/csc/coo.
* Specify the type for feature info.
2021-09-16 20:39:30 +08:00
Jiaming Yuan
2942dc68e4
Fix mixed types in GPU sketching. ( #7228 )
2021-09-16 00:10:25 +08:00
Jiaming Yuan
037dd0820d
Implement __sklearn_is_fitted__. ( #7230 )
2021-09-15 19:09:04 +08:00
Jiaming Yuan
d997c967d5
Demo for experimental categorical data support. ( #7213 )
2021-09-15 08:20:12 +08:00
Jiaming Yuan
3515931305
Initial support for external memory in gradient index. ( #7183 )
...
* Add hessian to batch param in preparation of new approx impl.
* Extract a push method for gradient index matrix.
* Use span instead of vector ref for hessian in sketching.
* Create a binary format for gradient index.
2021-09-13 12:40:56 +08:00
Christian Lorentzen
a0dcf6f5c1
[DOC] Improve tutorial on feature interactions ( #7219 )
2021-09-12 21:40:02 +08:00
Jiaming Yuan
804b2ac60f
Expose DMatrix API for CUDA columnar and array. ( #7217 )
...
* Use JSON encoded configurations.
* Expose them into header file.
2021-09-09 17:55:25 +08:00
Jiaming Yuan
68a2c7b8d6
Fix memory leak in demo. ( #7216 )
2021-09-09 13:51:03 +08:00
Jiaming Yuan
b12e7f7edd
Add noexcept to JSON objects. ( #7205 )
2021-09-07 13:56:48 +08:00
Jiaming Yuan
3a4f51f39f
Avoid calling CUDA code on CPU for linear model. ( #7154 )
2021-09-01 10:45:31 +08:00
Jiaming Yuan
ba69244a94
Restore the custom double atomic add. ( #7198 )
2021-08-28 18:30:42 +08:00
Jiaming Yuan
7a1d67f9cb
[breaking] Use integer atomic for GPU histogram. ( #7180 )
...
On GPU, we use a rounding factor to truncate the gradient for deterministic results. This PR changes the gradient representation to a fixed-point number with its exponent aligned with the rounding factor.
[breaking] Drop non-deterministic histogram.
Use fixed point for shared memory.
This PR is to improve the performance of GPU Hist.
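
Why integer accumulation gives deterministic results can be shown with a small sketch. It is not the GPU Hist implementation: the power-of-two `SCALE` and the helper names are assumptions made for illustration, and Python integers stand in for the integer atomics.

```python
from itertools import permutations

SCALE = 2 ** 20  # hypothetical power-of-two scale, chosen only for this sketch

def to_fixed(value: float, scale: int = SCALE) -> int:
    """Quantize a float to a fixed-point integer (round to nearest)."""
    return round(value * scale)

def float_sum(values) -> float:
    """Left-to-right floating-point accumulation; the result depends on the order."""
    total = 0.0
    for v in values:
        total += v
    return total

def fixed_sum(values, scale: int = SCALE) -> float:
    """Accumulate in integers, then convert back; the result does not depend on the order."""
    total = 0
    for v in values:
        total += to_fixed(v, scale)
    return total / scale

gradients = [1e16, 1.0, -1e16]
# Collect the distinct results over every possible summation order.
float_results = {float_sum(p) for p in permutations(gradients)}
fixed_results = {fixed_sum(p) for p in permutations(gradients)}
```

Floating-point addition is not associative, so `float_results` holds more than one value, while integer addition is associative and `fixed_results` holds exactly one.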
Co-authored-by: Andy Adinets <aadinets@nvidia.com>
2021-08-28 05:17:05 +08:00
Jiaming Yuan
e7d7ab6bc3
Better error message for ncclUnhandledCudaError. ( #7190 )
2021-08-27 10:29:22 +08:00
Philip Hyunsu Cho
b70e07da1f
[CI] Clean up in beginning of each task in Win CI ( #7189 )
2021-08-25 04:15:22 -07:00