Jiaming Yuan
d4349426d8
Re-implement PR-AUC. ( #7297 )
...
* Support binary/multi-class classification, ranking.
* Add documents.
* Handle missing data.
2021-10-26 13:07:50 +08:00
Jiaming Yuan
d1f00fb0b7
Stricter validation for group. ( #7345 )
2021-10-21 12:13:33 +08:00
Jiaming Yuan
8d7c6366d7
Accept histogram cut instead of gradient index in evaluation. ( #7336 )
2021-10-20 18:04:46 +08:00
Jiaming Yuan
f999897615
[dask] Use nthread in DMatrix construction. ( #7337 )
...
This is consistent with the thread overriding behavior.
2021-10-20 15:16:40 +08:00
Jiaming Yuan
3b0b74fa94
[doc] Use RTD theme. ( #7346 )
2021-10-19 23:49:19 -07:00
Jiaming Yuan
f53da412aa
Add typehint to tracker. ( #7338 )
2021-10-20 12:49:36 +08:00
Jiaming Yuan
fb1a9e6bc5
Avoid omp reduction in coordinate descent and aft metrics. ( #7316 )
...
Aside from the omp issue, parameter configuration for aft metric is simplified.
2021-10-17 15:55:49 +08:00
Jiaming Yuan
f56e2e9a66
Support categorical data with pandas Dataframe in inplace prediction ( #7322 )
2021-10-17 14:32:06 +08:00
Jiaming Yuan
8e619010d0
Extract CPUExpandEntry and HistParam. ( #7321 )
...
* Remove kRootNid.
* Check for empty hessian.
2021-10-17 14:22:25 +08:00
Jiaming Yuan
4ddf8d001c
Deterministic result for element-wise/mclass metrics. ( #7303 )
...
Remove openmp reduction.
2021-10-13 14:22:40 +08:00
Jiaming Yuan
130df8cdda
Add tests for tree grow policy. ( #7302 )
2021-10-12 15:04:06 +08:00
Jiaming Yuan
5b17bb0031
Fix prediction with cat data in sklearn interface. ( #7306 )
...
* Specify DMatrix parameter for pre-processing dataframe.
* Add document about the behaviour of prediction.
2021-10-12 14:31:12 +08:00
Jiaming Yuan
298af6f409
Fix weighted samples in multi-class AUC. ( #7300 )
2021-10-11 15:12:29 +08:00
Jiaming Yuan
69d3b1b8b4
Remove old callback deprecated in 1.3. ( #7280 )
2021-10-08 17:24:59 +08:00
Jiaming Yuan
578de9f762
Fix cv verbose_eval ( #7291 )
2021-10-08 12:28:38 +08:00
Jiaming Yuan
d8cb395380
Fix gamma neg log likelihood. ( #7275 )
2021-10-05 16:57:08 +08:00
Jiaming Yuan
d8a549e6ac
Avoid thread block with sparse data. ( #7255 )
2021-09-25 13:11:34 +08:00
Jiaming Yuan
ca17f8a5fc
Dispatch thrust versions and upgrade rmm. ( #7254 )
...
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2021-09-25 03:43:23 +08:00
Bobby Wang
0ee11dac77
[jvm-packages][xgboost4j-gpu] Support GPU dataframe and DeviceQuantileDMatrix ( #7195 )
...
The following classes are added to support dataframes in the Java binding:
- `Column` is an abstract type for a single column in tabular data.
- `ColumnBatch` is an abstract type for dataframe.
- `CuDFColumn` is an implementation of `Column` that consumes a cuDF column.
- `CudfColumnBatch` is an implementation of `ColumnBatch` that consumes cuDF dataframe.
- `DeviceQuantileDMatrix` is the interface for quantized data.
The Java implementation mimics the Python interface and uses the `__cuda_array_interface__` protocol for memory indexing. One difference is that, in the JVM package, the data batch is staged on the host because Java iterators cannot be reset.
Co-authored-by: jiamingy <jm.yuan@outlook.com>
2021-09-24 14:25:00 +08:00
Jiaming Yuan
c735c17f33
Disable callback and ES on random forest. ( #7236 )
2021-09-17 18:21:17 +08:00
Jiaming Yuan
22d56cebf1
Encode pandas categorical data automatically. ( #7231 )
2021-09-17 11:09:55 +08:00
Jiaming Yuan
32e0858501
Fix travis. ( #7237 )
2021-09-17 10:06:23 +08:00
Jiaming Yuan
31c1e13f90
Categorical data support in CPU sketching. ( #7221 )
2021-09-17 04:37:09 +08:00
Jiaming Yuan
0ed979b096
Support more input types for categorical data. ( #7220 )
...
* Support more input types for categorical data.
* Shorten the type name from "categorical" to "c".
* Tests for np/cp array and scipy csr/csc/coo.
* Specify the type for feature info.
2021-09-16 20:39:30 +08:00
Jiaming Yuan
2942dc68e4
Fix mixed types in GPU sketching. ( #7228 )
2021-09-16 00:10:25 +08:00
Jiaming Yuan
037dd0820d
Implement __sklearn_is_fitted__. ( #7230 )
2021-09-15 19:09:04 +08:00
Jiaming Yuan
d997c967d5
Demo for experimental categorical data support. ( #7213 )
2021-09-15 08:20:12 +08:00
Jiaming Yuan
3515931305
Initial support for external memory in gradient index. ( #7183 )
...
* Add hessian to batch param in preparation of new approx impl.
* Extract a push method for gradient index matrix.
* Use span instead of vector ref for hessian in sketching.
* Create a binary format for gradient index.
2021-09-13 12:40:56 +08:00
Jiaming Yuan
b12e7f7edd
Add noexcept to JSON objects. ( #7205 )
2021-09-07 13:56:48 +08:00
Jiaming Yuan
3a4f51f39f
Avoid calling CUDA code on CPU for linear model. ( #7154 )
2021-09-01 10:45:31 +08:00
Jiaming Yuan
7a1d67f9cb
[breaking] Use integer atomic for GPU histogram. ( #7180 )
...
On GPU we use a rounding factor to truncate the gradient for deterministic results. This PR changes the gradient representation to a fixed-point number with its exponent aligned with the rounding factor.
[breaking] Drop non-deterministic histogram.
Use fixed point for shared memory.
This PR is to improve the performance of GPU Hist.
Co-authored-by: Andy Adinets <aadinets@nvidia.com>
2021-08-28 05:17:05 +08:00
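The motivation behind the fixed-point representation in #7180 can be illustrated with a small host-side sketch. This is not the CUDA code from the PR: the names `SCALE`, `to_fixed`, `float_sum` and `fixed_sum` are hypothetical, and a fixed power-of-two scale stands in for the exponent that the PR aligns with the rounding factor. The point it shows is only that integer accumulation is order-independent, whereas floating-point accumulation is not.

```python
from typing import Iterable

# Hypothetical scale; the PR derives its exponent from the rounding factor.
SCALE = 1 << 20


def to_fixed(value: float, scale: int = SCALE) -> int:
    """Quantize a float gradient to an integer with `scale` steps per unit."""
    return round(value * scale)


def float_sum(values: Iterable[float]) -> float:
    """Accumulate in floating point; the result can depend on the order."""
    total = 0.0
    for v in values:
        total += v
    return total


def fixed_sum(values: Iterable[float], scale: int = SCALE) -> float:
    """Accumulate quantized integers; integer addition is associative."""
    total = 0
    for v in values:
        total += to_fixed(v, scale)
    return total / scale


forward = [0.1, 0.2, 0.3]
backward = list(reversed(forward))

# Compare the two summation orders for each representation.
print("float sums equal:", float_sum(forward) == float_sum(backward))
print("fixed sums equal:", fixed_sum(forward) == fixed_sum(backward))
```

On a GPU the summation order depends on thread scheduling, so an order-independent integer sum gives reproducible histograms at the cost of the precision lost in quantization.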
Philip Hyunsu Cho
3060f0b562
[CI] Automatically build GPU-enabled R package for Windows ( #7185 )
...
* [CI] Automatically build GPU-enabled R package for Windows
* Update Jenkinsfile-win64
* Build R package for the release branch only
* Update install doc
2021-08-25 02:11:01 -07:00
Philip Hyunsu Cho
d04312b9c0
[CI] Fix hanging Python setup in Windows CI ( #7186 )
2021-08-24 22:03:51 -07:00
Jiaming Yuan
3f38d983a6
Fix prediction configuration. ( #7159 )
...
After the predictor parameter was added to the constructor, this configuration was broken.
2021-08-11 16:34:36 +08:00
Jiaming Yuan
149f209af6
Extract histogram builder from CPU Hist. ( #7152 )
...
* Extract the CPU histogram builder.
* Fix tests.
* Reduce number of histograms being built.
2021-08-09 21:15:21 +08:00
Jiaming Yuan
8a84be37b8
Pass scikit learn estimator checks for regressor. ( #7130 )
...
* Check data shape.
* Check labels.
2021-08-03 18:58:20 +08:00
Jiaming Yuan
e2c406f5c8
Support min_delta in early stopping. ( #7137 )
...
* Support `min_delta` in early stopping.
* Remove abs_tol.
2021-08-03 14:29:17 +08:00
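As a rough illustration of what a `min_delta` threshold means for early stopping in #7137, the sketch below treats a new score as an improvement only when it beats the best score by more than `min_delta`. It is a standalone approximation, not XGBoost's callback code; the function name `early_stop_round` and its parameters `scores`, `rounds` and `maximize` are assumptions made for this example.

```python
from typing import Optional, Sequence


def early_stop_round(
    scores: Sequence[float],
    rounds: int,
    min_delta: float = 0.0,
    maximize: bool = False,
) -> Optional[int]:
    """Return the iteration index at which training would stop, or None."""
    best: Optional[float] = None
    stale = 0  # consecutive iterations without a sufficient improvement
    for i, score in enumerate(scores):
        if best is None:
            improved = True
        elif maximize:
            improved = score > best + min_delta
        else:
            improved = score < best - min_delta
        if improved:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= rounds:
                return i
    return None


losses = [1.0, 0.9, 0.895, 0.893, 0.892]
print(early_stop_round(losses, rounds=2, min_delta=0.01))
print(early_stop_round(losses, rounds=2, min_delta=0.0))
```

With a nonzero `min_delta`, the small gains late in the sequence no longer reset the patience counter, so training stops earlier than it would with a threshold of zero.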
Jiaming Yuan
7bdedacb54
Document for process_type. ( #7135 )
...
* Update document for prune and refresh.
* Add demo.
2021-08-03 13:11:52 +08:00
Jiaming Yuan
d080b5a953
Fix model slicing. ( #7149 )
...
* Use correct pointer.
* Remove best_iteration/best_score.
2021-08-03 11:51:56 +08:00
Philip Hyunsu Cho
f1a4a1ac95
[CI] Upgrade build image to CentOS 7 + GCC 8; require CUDA 10.1 and later ( #7141 )
2021-07-29 10:54:33 -07:00
Jiaming Yuan
7ee7a95b84
Use upstream URI in distributed quantile tests. ( #7129 )
...
* Use upstream URI in distributed quantile tests.
* Fix test cv `PytestAssertRewriteWarning`.
2021-07-27 14:09:49 +08:00
Jiaming Yuan
e88ac9cc54
[dask] Extend tree stats tests. ( #7128 )
...
* Add tests to GPU.
* Assert cover in children sums up to the parent.
2021-07-27 12:22:13 +08:00
Jiaming Yuan
778135f657
Fix parameter loading with training continuation. ( #7121 )
...
* Add a demo for training continuation.
2021-07-23 10:51:47 +08:00
ShvetsKS
caa9e527dd
Remove extra sync for dense data ( #7120 )
...
Co-authored-by: SHVETS, KIRILL <kirill.shvets@intel.com>
2021-07-22 19:02:31 +08:00
Jiaming Yuan
e6088366df
Export Python Interface for external memory. ( #7070 )
...
* Add Python iterator interface.
* Add tests.
* Add demo.
* Add documents.
* Handle empty dataset.
2021-07-22 15:15:53 +08:00
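The general shape of a resettable batch iterator, as used for loading data batch by batch in #7070, can be sketched without XGBoost itself. The class `BatchIter`, its `next`/`reset` methods and the helper `run_pass` below are hypothetical stand-ins written for this example; they are not the interface exported by the PR, which should be looked up in the XGBoost documentation.

```python
from typing import Callable, List, Sequence


class BatchIter:
    """Hypothetical iterator that hands out in-memory batches one at a time."""

    def __init__(self, batches: Sequence[List[float]]) -> None:
        self._batches = list(batches)
        self._pos = 0

    def next(self, consume: Callable[[List[float]], None]) -> bool:
        """Pass the next batch to `consume`; return False when exhausted."""
        if self._pos == len(self._batches):
            return False
        consume(self._batches[self._pos])
        self._pos += 1
        return True

    def reset(self) -> None:
        """Rewind so that another full pass over the data is possible."""
        self._pos = 0


def run_pass(it: BatchIter) -> List[float]:
    """Make one full pass over the iterator and collect every value seen."""
    seen: List[float] = []
    it.reset()
    while it.next(seen.extend):
        pass
    return seen


it = BatchIter([[1.0, 2.0], [3.0]])
print(run_pass(it))
print(run_pass(it))  # a second pass works because of reset()
print(run_pass(BatchIter([])))  # an empty dataset yields no batches
```

A consumer that needs more than one pass over the data relies on `reset`, which is why an iterator that cannot be rewound has to be staged somewhere first.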
Jiaming Yuan
bd1f3a38f0
Rewrite sparse dmatrix using callbacks. ( #7092 )
...
- Reduce dependency on dmlc parsers and provide an interface for users to load data by themselves.
- Remove use of threaded iterator and IO queue.
- Remove `page_size`.
- Make sure the number of pages in memory is bounded.
- Make sure the cache cannot be violated.
- Provide an interface for internal algorithms to process data asynchronously.
2021-07-16 12:33:31 +08:00
Philip Hyunsu Cho
2801d69fb7
[CI] Pin libomp to 11.1.0 ( #7107 )
2021-07-15 11:16:51 +08:00
Jiaming Yuan
345796825f
Optional find dependency in installed cmake config. ( #7099 )
...
* Find dependency only when xgboost is built as static library.
* Resolve msvc warning.
* Add test for linking shared library.
2021-07-11 17:20:55 +08:00
Jiaming Yuan
77f6cf2d13
Support hessian in host sketch container. ( #7081 )
...
Prepare for migrating approx onto hist's codebase.
2021-07-08 16:33:58 +08:00
Jiaming Yuan
84d359efb8
Support host data in proxy DMatrix. ( #7087 )
2021-07-08 11:35:48 +08:00