Jiaming Yuan
d4274bc556
Fix typo. ( #7433 )
2021-11-15 01:28:11 +08:00
Jiaming Yuan
a7057fa64c
Implement typed storage for tensor. ( #7429 )
...
* Add `Tensor` class.
* Add elementwise kernel for CPU and GPU.
* Add unravel index.
* Move some computation to compile time.
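
For readers unfamiliar with the term, "unravel index" means converting a flat offset into per-dimension coordinates. The sketch below is a plain-Python illustration that assumes row-major layout; the function name and signature are hypothetical and this is not the kernel added in the PR.

```python
from typing import Sequence, Tuple


def unravel_index(flat: int, shape: Sequence[int]) -> Tuple[int, ...]:
    """Convert a flat row-major offset into one coordinate per dimension."""
    size = 1
    for extent in shape:
        size *= extent
    if not 0 <= flat < size:
        raise IndexError(f"offset {flat} is out of range for shape {tuple(shape)}")

    coords = []
    # Peel off the fastest-varying (last) dimension first.
    for extent in reversed(shape):
        flat, remainder = divmod(flat, extent)
        coords.append(remainder)
    return tuple(reversed(coords))


if __name__ == "__main__":
    print(unravel_index(5, (2, 3)))
    print(unravel_index(23, (2, 3, 4)))
```

When the shape is known at compile time, the same division chain can in principle be unrolled ahead of time, which is one plausible reading of the "compile time" bullet above.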
2021-11-14 18:53:13 +08:00
Jiaming Yuan
46726ec176
Expose build info ( #7399 )
2021-11-12 18:22:46 +08:00
Jiaming Yuan
937fa282b5
Extract string view. ( #7416 )
...
* Add equality operators.
* Return a view in substr.
* Add proper iterator types.
2021-11-12 18:22:30 +08:00
Jiaming Yuan
ca6f980932
Check number of trees in inplace predict. ( #7409 )
2021-11-12 18:20:23 +08:00
Jiaming Yuan
d7d1b6e3a6
CPU evaluation for cat data. ( #7393 )
...
* Implementation for one hot based.
* Implementation for partition based. (LightGBM)
2021-11-06 14:41:35 +08:00
Jiaming Yuan
6ede12412c
Update dmlc-core and use data iter for GPU sampling tests. ( #7398 )
...
* Update dmlc-core.
* New parquet parser in dmlc-core.
* Use data iter for GPU sampling tests.
2021-11-06 05:12:49 +08:00
Jiaming Yuan
c968217ca8
[R] Fix global feature importance and predict with 1 sample. ( #7394 )
...
* [R] Fix global feature importance.
* Add implementation for tree index. The parameter is not documented in C API since we should work on porting the model slicing to R instead of supporting more use of tree index.
* Fix the difference between "gain" and "total_gain".
* debug.
* Fix prediction.
2021-11-05 10:07:00 +08:00
Jiaming Yuan
b06040b6d0
Implement a general array view. ( #7365 )
...
* Replace existing matrix and vector view.
This is to prepare for handling higher dimension data and prediction when we support multi-target models.
2021-11-05 04:16:11 +08:00
Jiaming Yuan
4100827971
Pass information about objective to tree methods. ( #7385 )
...
* Define the `ObjInfo` and pass it down to every tree updater.
2021-11-04 01:52:44 +08:00
Jiaming Yuan
ccdabe4512
Support building gradient index with cat data. ( #7371 )
2021-11-03 22:37:37 +08:00
Jiaming Yuan
57a4b4ff64
Handle OMP_THREAD_LIMIT. ( #7390 )
2021-11-03 15:44:38 +08:00
Jiaming Yuan
a55d43ccfd
Add test for invalid categorical data values. ( #7380 )
...
* Add test for invalid categorical data values.
* Add check during sketching.
2021-11-02 18:00:52 +08:00
Jiaming Yuan
32e673d8c4
Support building with CTK11.5. ( #7379 )
...
* Support building with CTK11.5.
* Require system cub installation for CTK11.4+.
* Check thrust version for segmented sort.
2021-11-02 16:22:26 +08:00
Jiaming Yuan
a13321148a
Support multi-class with base margin. ( #7381 )
...
This was already partially supported but never properly tested, so the only way to use it was to call `numpy.ndarray.flatten` on `base_margin` before passing it into XGBoost. This PR adds proper support for most of the data types, along with tests.
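
The flattening workaround described in this commit message can be sketched with NumPy alone. The array sizes and variable names below are made up for illustration, and the snippet only shows what `numpy.ndarray.flatten` does to a per-class margin matrix; it does not call XGBoost, and the layout XGBoost expects is not stated in this log.

```python
import numpy as np

# Hypothetical problem size: 4 samples, 3 classes.
n_samples, n_classes = 4, 3

# One margin value per (sample, class) pair.
base_margin = np.arange(n_samples * n_classes, dtype=np.float32).reshape(
    n_samples, n_classes
)

# flatten() returns a 1-D copy in row-major (C) order by default, so all
# classes of sample 0 come first, then all classes of sample 1, and so on.
flat_margin = base_margin.flatten()

# The value for sample i and class k sits at offset i * n_classes + k.
i, k = 1, 2
assert flat_margin[i * n_classes + k] == base_margin[i, k]
print(flat_margin.shape)
```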
2021-11-02 13:38:00 +08:00
Jiaming Yuan
6295dc3b67
Fix span reverse iterator. ( #7387 )
...
* Fix span reverse iterator.
* Disable `rbegin` on device code to avoid calling host function.
* Add `trbegin` and friends.
2021-11-02 13:35:59 +08:00
Jiaming Yuan
0f7a9b42f1
Use double precision in metric calculation. ( #7364 )
2021-11-02 12:00:32 +08:00
Jiaming Yuan
d05754f558
Avoid OMP reduction in AUC. ( #7362 )
2021-10-28 05:03:52 +08:00
Jiaming Yuan
ac9bfaa4f2
Handle missing values in dataframe with category dtype. ( #7331 )
...
* Replace -1 in pandas initializer.
* Unify `IsValid` functor.
* Mimic pandas data handling in cuDF glue code.
* Check invalid categories.
* Fix DDM sketching.
2021-10-28 03:33:54 +08:00
Jiaming Yuan
d4349426d8
Re-implement PR-AUC. ( #7297 )
...
* Support binary/multi-class classification, ranking.
* Add documents.
* Handle missing data.
2021-10-26 13:07:50 +08:00
Jiaming Yuan
fd61c61071
Avoid omp reduction in rank metric. ( #7349 )
2021-10-22 14:13:34 +08:00
Jiaming Yuan
d1f00fb0b7
Stricter validation for group. ( #7345 )
2021-10-21 12:13:33 +08:00
Jiaming Yuan
8d7c6366d7
Accept histogram cut instead of gradient index in evaluation. ( #7336 )
2021-10-20 18:04:46 +08:00
Jiaming Yuan
fbb0dc4275
Remove auto configuration of seed_per_iteration. ( #7009 )
...
* Remove auto configuration of seed_per_iteration.
This should be related to model recovery from rabit, which has been removed.
* Document.
2021-10-17 15:58:57 +08:00
Jiaming Yuan
fb1a9e6bc5
Avoid omp reduction in coordinate descent and aft metrics. ( #7316 )
...
Aside from the omp issue, parameter configuration for aft metric is simplified.
2021-10-17 15:55:49 +08:00
Jiaming Yuan
8e619010d0
Extract CPUExpandEntry and HistParam. ( #7321 )
...
* Remove kRootNid.
* Check for empty hessian.
2021-10-17 14:22:25 +08:00
Jiaming Yuan
4ddf8d001c
Deterministic result for element-wise/mclass metrics. ( #7303 )
...
Remove openmp reduction.
2021-10-13 14:22:40 +08:00
Jiaming Yuan
a7d0c66457
Remove unused code. ( #7293 )
2021-10-12 15:04:41 +08:00
Jiaming Yuan
298af6f409
Fix weighted samples in multi-class AUC. ( #7300 )
2021-10-11 15:12:29 +08:00
Jiaming Yuan
d8cb395380
Fix gamma neg log likelihood. ( #7275 )
2021-10-05 16:57:08 +08:00
Jiaming Yuan
b3b03200e2
Remove old warning in 1.3 ( #7279 )
2021-10-01 08:05:50 +08:00
Jiaming Yuan
d8a549e6ac
Avoid thread block with sparse data. ( #7255 )
2021-09-25 13:11:34 +08:00
Jiaming Yuan
ca17f8a5fc
Dispatch thrust versions and upgrade rmm. ( #7254 )
...
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2021-09-25 03:43:23 +08:00
ShvetsKS
475fd1abec
Reduce span overhead in objective function calculation ( #7206 )
...
Co-authored-by: fis <jm.yuan@outlook.com>
2021-09-23 04:43:59 +08:00
david-cortes
4f93e5586a
Improve wording for warning ( #7248 )
...
This warning sounds a bit ungrammatical. Additionally, the second part of the warning is not clear. This PR changes the wording to make it clearer.
2021-09-21 10:48:11 +08:00
Jiaming Yuan
c311a8c1d8
Enable compiling with system cub. ( #7232 )
...
- Tested with all CUDA 11.x.
- Work around cub scan by using a discard iterator in AUC.
- Limit the size of Argsort when compiled with CUDA cub.
2021-09-17 14:28:18 +08:00
Jiaming Yuan
22d56cebf1
Encode pandas categorical data automatically. ( #7231 )
2021-09-17 11:09:55 +08:00
Jiaming Yuan
31c1e13f90
Categorical data support in CPU sketching. ( #7221 )
2021-09-17 04:37:09 +08:00
Jiaming Yuan
0ed979b096
Support more input types for categorical data. ( #7220 )
...
* Support more input types for categorical data.
* Shorten the type name from "categorical" to "c".
* Tests for np/cp array and scipy csr/csc/coo.
* Specify the type for feature info.
2021-09-16 20:39:30 +08:00
Jiaming Yuan
2942dc68e4
Fix mixed types in GPU sketching. ( #7228 )
2021-09-16 00:10:25 +08:00
Jiaming Yuan
3515931305
Initial support for external memory in gradient index. ( #7183 )
...
* Add hessian to batch param in preparation of new approx impl.
* Extract a push method for gradient index matrix.
* Use span instead of vector ref for hessian in sketching.
* Create a binary format for gradient index.
2021-09-13 12:40:56 +08:00
Jiaming Yuan
804b2ac60f
Expose DMatrix API for CUDA columnar and array. ( #7217 )
...
* Use JSON encoded configurations.
* Expose them into header file.
2021-09-09 17:55:25 +08:00
Jiaming Yuan
b12e7f7edd
Add noexcept to JSON objects. ( #7205 )
2021-09-07 13:56:48 +08:00
Jiaming Yuan
3a4f51f39f
Avoid calling CUDA code on CPU for linear model. ( #7154 )
2021-09-01 10:45:31 +08:00
Jiaming Yuan
ba69244a94
Restore the custom double atomic add. ( #7198 )
2021-08-28 18:30:42 +08:00
Jiaming Yuan
7a1d67f9cb
[breaking] Use integer atomic for GPU histogram. ( #7180 )
...
On GPU we use a rounding factor to truncate the gradient for deterministic results. This PR changes the gradient representation to a fixed-point number with its exponent aligned with the rounding factor.
[breaking] Drop non-deterministic histogram.
Use fixed point for shared memory.
This PR is to improve the performance of GPU Hist.
Co-authored-by: Andy Adinets <aadinets@nvidia.com>
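
To illustrate why a fixed-point representation gives order-independent sums, here is a small CPU-side sketch. It assumes a hypothetical power-of-two `scale` and uses Python's unbounded integers; the actual GPU code, its integer width, and how the scale is derived from the rounding factor are not described in this log.

```python
import random


def to_fixed(value: float, scale: int) -> int:
    """Quantize a float to an integer count of 1/scale units."""
    return round(value * scale)


def float_sum(values):
    """Plain floating-point accumulation; the result can depend on order."""
    total = 0.0
    for v in values:
        total += v
    return total


def fixed_sum(values, scale: int) -> float:
    """Accumulate in integers, then convert back to float once at the end.

    Integer addition is associative, so any accumulation order (as with
    atomic adds from many threads) produces the same total.
    """
    total = 0
    for v in values:
        total += to_fixed(v, scale)
    return total / scale


if __name__ == "__main__":
    scale = 2 ** 10  # hypothetical scale, chosen only for this example
    extreme = [1e16, 1.0, -1e16]
    reordered = [1e16, -1e16, 1.0]
    print(float_sum(extreme), float_sum(reordered))
    print(fixed_sum(extreme, scale), fixed_sum(reordered, scale))

    rng = random.Random(0)
    grads = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
    shuffled = grads[:]
    rng.shuffle(shuffled)
    print(fixed_sum(grads, scale) == fixed_sum(shuffled, scale))
```

The trade-off is that each value is quantized to a multiple of `1 / scale`, so precision is bounded by the chosen scale.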
2021-08-28 05:17:05 +08:00
Jiaming Yuan
e7d7ab6bc3
Better error message for ncclUnhandledCudaError. ( #7190 )
2021-08-27 10:29:22 +08:00
Jiaming Yuan
ee8d1f5ed8
Fix histogram truncation. ( #7181 )
...
* Fix truncation.
* Lint.
* lint.
2021-08-24 18:34:32 -07:00
Jiaming Yuan
bf562bd33c
Remove unused code. ( #7175 )
2021-08-18 14:02:19 +08:00
Jiaming Yuan
9600ca83f3
Remove synchronization in monitor. ( #7164 )
...
* Remove synchronization in monitor.
Calling rabit functions during destruction is flaky.
* Add xgboost prefix to nvtx marker.
2021-08-11 16:33:53 +08:00