5481 Commits

Author SHA1 Message Date
Jiaming Yuan
fab3c05ced
Move macos test to github action. (#7382) (#7392)
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2021-11-03 18:39:47 +08:00
Jiaming Yuan
584b45a9cc
Release 1.5.0. (#7317) v1.5.0 2021-10-15 12:21:04 +08:00
Jiaming Yuan
30c1b5c54c
[backport] Fix prediction with cat data in sklearn interface. (#7306) (#7312)
* Specify DMatrix parameter for pre-processing dataframe.
* Add document about the behaviour of prediction.
2021-10-12 18:49:57 +08:00
Jiaming Yuan
36e247aca4
Fix weighted samples in multi-class AUC. (#7300) (#7305) 2021-10-11 18:00:36 +08:00
Jiaming Yuan
c4aff733bb
[backport] Fix cv verbose_eval (#7291) (#7296) 2021-10-08 14:24:27 +08:00
Jiaming Yuan
cdbfd21d31
[backport] Fix gamma neg log likelihood. (#7275) (#7285) 2021-10-05 23:01:11 +08:00
Jiaming Yuan
508a0b0dbd
[backport] [R] Fix document for nthread. (#7263) (#7269) 2021-09-28 14:41:32 +08:00
Jiaming Yuan
e04e773f9f
Add RC1 tag for building packages. (#7261) 2021-09-28 11:50:18 +08:00
Jiaming Yuan
1debabb321
Change version to 1.5.0. (#7258) v1.5.0rc1 2021-09-26 13:27:54 +08:00
Jiaming Yuan
d8a549e6ac
Avoid thread block with sparse data. (#7255) 2021-09-25 13:11:34 +08:00
Jiaming Yuan
ca17f8a5fc
Dispatch thrust versions and upgrade rmm. (#7254)
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2021-09-25 03:43:23 +08:00
Jiaming Yuan
fbd58bf190
[jvm-packages] Create demo and test for xgboost4j early stopping. (#7252) 2021-09-25 03:29:27 +08:00
Bobby Wang
0ee11dac77
[jvm-packages][xgboost4j-gpu] Support GPU dataframe and DeviceQuantileDMatrix (#7195)
The following classes are added to support dataframes in the Java binding:

- `Column` is an abstract type for a single column in tabular data.
- `ColumnBatch` is an abstract type for dataframe.

- `CuDFColumn` is an implementation of `Column` that consumes a cuDF column.
- `CudfColumnBatch` is an implementation of `ColumnBatch` that consumes a cuDF dataframe.

- `DeviceQuantileDMatrix` is the interface for quantized data.

The Java implementation mimics the Python interface and uses the `__cuda_array_interface__` protocol for memory indexing. One difference is that in the JVM package the data batch is staged on the host, because Java iterators cannot be reset.

Co-authored-by: jiamingy <jm.yuan@outlook.com>
2021-09-24 14:25:00 +08:00
Philip Hyunsu Cho
d27a427dc5
[CI] Rotate access keys for uploading MacOS artifacts from Travis CI (#7253) 2021-09-24 10:44:00 +08:00
ShvetsKS
475fd1abec
Reduced span overheads in objective function calculate (#7206)
Co-authored-by: fis <jm.yuan@outlook.com>
2021-09-23 04:43:59 +08:00
Jiaming Yuan
9472be7d77
Fix initialization from pandas series. (#7243) 2021-09-23 04:43:25 +08:00
david-cortes
4f93e5586a
Improve wording for warning (#7248)
This warning sounds a bit ungrammatical. Additionally, the second part of the warning is not clear. This PR changes the wording to make it clearer.
2021-09-21 10:48:11 +08:00
Jiaming Yuan
18bd16341a
Update Python intro. [skip ci] (#7235)
* Fix the link to demo.
* Stop recommending text file inputs.
* Briefly mention the scikit-learn interface.
* Fix indent warning in tree method doc.
2021-09-21 02:47:09 +00:00
david-cortes
61a619b5c3
[R] Avoid symbol naming conflicts with other packages (#7245)
* don't register all R symbols

* typo
2021-09-19 11:17:08 -07:00
Jiaming Yuan
e48e05e6e2
Add typehint to rabit module. (#7240) 2021-09-17 18:31:02 +08:00
Jiaming Yuan
c735c17f33
Disable callback and ES on random forest. (#7236) 2021-09-17 18:21:17 +08:00
Jiaming Yuan
c311a8c1d8
Enable compiling with system cub. (#7232)
- Tested with all CUDA 11.x.
- Workaround cub scan by using discard iterator in AUC.
- Limit the size of Argsort when compiled with CUDA cub.
2021-09-17 14:28:18 +08:00
Jiaming Yuan
b18f5f61b0
Fix pylint (#7241) 2021-09-17 11:50:36 +08:00
Jiaming Yuan
38a23f66a8
Fix typo in release script. [skip ci] (#7238) 2021-09-17 11:14:05 +08:00
Jiaming Yuan
8ad7e8eeb0
[doc] Fix typo. [skip ci] (#7226) 2021-09-17 11:13:49 +08:00
Jiaming Yuan
22d56cebf1
Encode pandas categorical data automatically. (#7231) 2021-09-17 11:09:55 +08:00
Jiaming Yuan
32e0858501
Fix travis. (#7237) 2021-09-17 10:06:23 +08:00
Jiaming Yuan
31c1e13f90
Categorical data support in CPU sketching. (#7221) 2021-09-17 04:37:09 +08:00
Jiaming Yuan
9f63d6fead
[jvm-packages] Deprecate constructors with implicit missing value. (#7225) 2021-09-17 04:35:04 +08:00
Jiaming Yuan
0ed979b096
Support more input types for categorical data. (#7220)
* Support more input types for categorical data.

* Shorten the type name from "categorical" to "c".
* Tests for np/cp array and scipy csr/csc/coo.
* Specify the type for feature info.
2021-09-16 20:39:30 +08:00
Jiaming Yuan
2942dc68e4
Fix mixed types in GPU sketching. (#7228) 2021-09-16 00:10:25 +08:00
Jiaming Yuan
037dd0820d
Implement __sklearn_is_fitted__. (#7230) 2021-09-15 19:09:04 +08:00
Jiaming Yuan
d997c967d5
Demo for experimental categorical data support. (#7213) 2021-09-15 08:20:12 +08:00
Jiaming Yuan
3515931305
Initial support for external memory in gradient index. (#7183)
* Add hessian to batch param in preparation of new approx impl.
* Extract a push method for gradient index matrix.
* Use span instead of vector ref for hessian in sketching.
* Create a binary format for gradient index.
2021-09-13 12:40:56 +08:00
Christian Lorentzen
a0dcf6f5c1
[DOC] Improve tutorial on feature interactions (#7219) 2021-09-12 21:40:02 +08:00
Jiaming Yuan
804b2ac60f
Expose DMatrix API for CUDA columnar and array. (#7217)
* Use JSON encoded configurations.
* Expose them into header file.
2021-09-09 17:55:25 +08:00
Jiaming Yuan
68a2c7b8d6
Fix memory leak in demo. (#7216) 2021-09-09 13:51:03 +08:00
Jiaming Yuan
b12e7f7edd
Add noexcept to JSON objects. (#7205) 2021-09-07 13:56:48 +08:00
Jiaming Yuan
3a4f51f39f
Avoid calling CUDA code on CPU for linear model. (#7154) 2021-09-01 10:45:31 +08:00
Jiaming Yuan
ba69244a94
Restore the custom double atomic add. (#7198) 2021-08-28 18:30:42 +08:00
Jiaming Yuan
7a1d67f9cb
[breaking] Use integer atomic for GPU histogram. (#7180)
On GPU we use a rounding factor to truncate the gradient for deterministic results. This PR changes the gradient representation to a fixed-point number whose exponent is aligned with the rounding factor.

* [breaking] Drop non-deterministic histogram.
* Use fixed point for shared memory.

This PR improves the performance of GPU Hist.
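
The idea behind a fixed-point gradient representation can be illustrated with a small host-side sketch. This is not the XGBoost kernel: the names `SCALE`, `to_fixed`, `float_sum`, and `fixed_sum` are hypothetical, and the scale of 20 fractional bits is an arbitrary assumption standing in for the rounding factor. It only shows why integer accumulation yields the same total regardless of the order in which values are added, whereas floating-point accumulation may not.

```python
from itertools import permutations

# Hypothetical scale: 20 fractional bits. A real implementation would derive
# this from the rounding factor; the value here is only for illustration.
SCALE = 1 << 20


def to_fixed(value: float, scale: int = SCALE) -> int:
    """Truncate a float to an integer count of 1/scale steps."""
    return int(value * scale)


def float_sum(values) -> float:
    """Accumulate left to right in floating point (order-sensitive)."""
    total = 0.0
    for v in values:
        total += v
    return total


def fixed_sum(values, scale: int = SCALE) -> float:
    """Accumulate in integers, then convert back to float once at the end."""
    total = 0
    for v in values:
        total += to_fixed(v, scale)
    return total / scale


gradients = [0.1, 0.2, 0.3]

# Collect the distinct totals produced by every summation order.
float_results = {float_sum(order) for order in permutations(gradients)}
fixed_results = {fixed_sum(order) for order in permutations(gradients)}

print("distinct float totals:", len(float_results))
print("distinct fixed-point totals:", len(fixed_results))
```

Integer addition is associative, so concurrent atomic adds can land in any order without changing the result. The cost is the precision lost when each value is truncated to the fixed-point grid.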

Co-authored-by: Andy Adinets <aadinets@nvidia.com>
2021-08-28 05:17:05 +08:00
Jiaming Yuan
e7d7ab6bc3
Better error message for ncclUnhandledCudaError. (#7190) 2021-08-27 10:29:22 +08:00
Philip Hyunsu Cho
b70e07da1f
[CI] Clean up in beginning of each task in Win CI (#7189) 2021-08-25 04:15:22 -07:00
Jiaming Yuan
cdfaa705f3
Fix building on CUDA 11.0. (#7187) 2021-08-25 02:57:53 -07:00
Philip Hyunsu Cho
3060f0b562
[CI] Automatically build GPU-enabled R package for Windows (#7185)
* [CI] Automatically build GPU-enabled R package for Windows

* Update Jenkinsfile-win64

* Build R package for the release branch only

* Update install doc
2021-08-25 02:11:01 -07:00
Jiaming Yuan
9c64618cb6
[breaking] Remove CUDA sm_35, add sm_86 (#7182) 2021-08-25 16:04:23 +08:00
Philip Hyunsu Cho
d04312b9c0
[CI] Fix hanging Python setup in Windows CI (#7186) 2021-08-24 22:03:51 -07:00
Jiaming Yuan
ee8d1f5ed8
Fix histogram truncation. (#7181)
* Fix truncation.

* Lint.

* lint.
2021-08-24 18:34:32 -07:00
Jiaming Yuan
3290a4f3ed
Re-enable feature validation in predict proba. (#7177) 2021-08-22 15:28:08 +08:00
Jiaming Yuan
bf562bd33c
Remove unused code. (#7175) 2021-08-18 14:02:19 +08:00