635 Commits

Author SHA1 Message Date
Jiaming Yuan
d7d1b6e3a6
CPU evaluation for cat data. (#7393)
* Implementation for one hot based.
* Implementation for partition based. (LightGBM)
2021-11-06 14:41:35 +08:00
Jiaming Yuan
ccdabe4512
Support building gradient index with cat data. (#7371) 2021-11-03 22:37:37 +08:00
Jiaming Yuan
57a4b4ff64
Handle OMP_THREAD_LIMIT. (#7390) 2021-11-03 15:44:38 +08:00
Jiaming Yuan
a55d43ccfd
Add test for invalid categorical data values. (#7380)
* Add test for invalid categorical data values.

* Add check during sketching.
2021-11-02 18:00:52 +08:00
Jiaming Yuan
32e673d8c4
Support building with CTK11.5. (#7379)
* Support building with CTK11.5.

* Require system cub installation for CTK11.4+.
* Check thrust version for segmented sort.
2021-11-02 16:22:26 +08:00
Jiaming Yuan
6295dc3b67
Fix span reverse iterator. (#7387)
* Fix span reverse iterator.

* Disable `rbegin` on device code to avoid calling host function.
* Add `trbegin` and friends.
2021-11-02 13:35:59 +08:00
Jiaming Yuan
ac9bfaa4f2
Handle missing values in dataframe with category dtype. (#7331)
* Replace -1 in pandas initializer.
* Unify `IsValid` functor.
* Mimic pandas data handling in cuDF glue code.
* Check invalid categories.
* Fix DDM sketching.
2021-10-28 03:33:54 +08:00
Jiaming Yuan
d4349426d8
Re-implement PR-AUC. (#7297)
* Support binary/multi-class classification, ranking.
* Add documents.
* Handle missing data.
2021-10-26 13:07:50 +08:00
Jiaming Yuan
ca17f8a5fc
Dispatch thrust versions and upgrade rmm. (#7254)
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2021-09-25 03:43:23 +08:00
Jiaming Yuan
c311a8c1d8
Enable compiling with system cub. (#7232)
- Tested with all CUDA 11.x.
- Workaround cub scan by using discard iterator in AUC.
- Limit the size of Argsort when compiled with CUDA cub.
2021-09-17 14:28:18 +08:00
Jiaming Yuan
31c1e13f90
Categorical data support in CPU sketching. (#7221) 2021-09-17 04:37:09 +08:00
Jiaming Yuan
2942dc68e4
Fix mixed types in GPU sketching. (#7228) 2021-09-16 00:10:25 +08:00
Jiaming Yuan
3515931305
Initial support for external memory in gradient index. (#7183)
* Add hessian to batch param in preparation of new approx impl.
* Extract a push method for gradient index matrix.
* Use span instead of vector ref for hessian in sketching.
* Create a binary format for gradient index.
2021-09-13 12:40:56 +08:00
Jiaming Yuan
b12e7f7edd
Add noexcept to JSON objects. (#7205) 2021-09-07 13:56:48 +08:00
Jiaming Yuan
ba69244a94
Restore the custom double atomic add. (#7198) 2021-08-28 18:30:42 +08:00
Jiaming Yuan
7a1d67f9cb
[breaking] Use integer atomic for GPU histogram. (#7180)
On GPU we use rouding factor to truncate the gradient for deterministic results. This PR changes the gradient representation to fixed point number with exponent aligned with rounding factor.

    [breaking] Drop non-deterministic histogram.
    Use fixed point for shared memory.

This PR is to improve the performance of GPU Hist. 

Co-authored-by: Andy Adinets <aadinets@nvidia.com>
2021-08-28 05:17:05 +08:00
Jiaming Yuan
e7d7ab6bc3
Better error message for ncclUnhandledCudaError. (#7190) 2021-08-27 10:29:22 +08:00
Jiaming Yuan
9600ca83f3
Remove synchronization in monitor. (#7164)
* Remove synchronization in monitor.

Calling rabit functions during destruction is flaky.

* Add xgboost prefix to nvtx marker.
2021-08-11 16:33:53 +08:00
Jiaming Yuan
149f209af6
Extract histogram builder from CPU Hist. (#7152)
* Extract the CPU histogram builder.
* Fix tests.
* Reduce number of histograms being built.
2021-08-09 21:15:21 +08:00
Robert Maynard
1a75f43304
Allow compilation with nvcc 11.4 (#7131)
* Use type aliases for discard iterators

* update to include host_vector as thrust 1.12 doesn't bring it in as a side-effect

* cub::DispatchRadixSort requires signed offset types
2021-07-27 20:05:33 +08:00
Taewoo Kim
41e882f80b
Check input value is duplicated when quantile queue is full (#7091)
Co-authored-by: Taewoo Kim <taewoo@layer6.com>
2021-07-23 03:07:01 +08:00
farfarawayzyt
e64ee6592f
fix typo in src/common/hist.cc BuildHistKernel (#7116) 2021-07-21 19:53:05 +08:00
farfarawayzyt
d7c14496d2
fix typo in arguments of PartitionBuilder::Init (#7113)
Co-authored-by: Yuntian Zhang <zhangyt@lamda.nju.edu.cn>
2021-07-16 15:46:22 +08:00
Jiaming Yuan
bd1f3a38f0
Rewrite sparse dmatrix using callbacks. (#7092)
- Reduce dependency on dmlc parsers and provide an interface for users to load data by themselves.
- Remove use of threaded iterator and IO queue.
- Remove `page_size`.
- Make sure the number of pages in memory is bounded.
- Make sure the cache can not be violated.
- Provide an interface for internal algorithms to process data asynchronously.
2021-07-16 12:33:31 +08:00
Jiaming Yuan
77f6cf2d13
Support hessian in host sketch container. (#7081)
Prepare for migrating approx onto hist's codebase.
2021-07-08 16:33:58 +08:00
Jiaming Yuan
1cd20efe68
Move GHistIndex into DMatrix. (#7064) 2021-07-01 00:44:49 +08:00
Jiaming Yuan
1c8fdf2218
Remove use of device_idx in dh::LaunchN. (#7063)
It's an unused parameter, removing it can make the CI log more readable.
2021-06-29 11:37:26 +08:00
Jiaming Yuan
8fa32fdda2
Implement categorical data support for SHAP. (#7053)
* Add CPU implementation.
* Update GPUTreeSHAP.
* Add GPU implementation by defining custom split condition.
2021-06-25 19:02:46 +08:00
Jiaming Yuan
86715e4cd4
Support categorical data for dask functional interface and DQM. (#7043)
* Support categorical data for dask functional interface and DQM.

* Implement categorical data support for GPU GK-merge.
* Add support for dask functional interface.
* Add support for DQM.

* Get newer cupy.
2021-06-18 13:06:52 +08:00
ShvetsKS
2567404ab6
Simplify sparse and dense CPU hist kernels (#7029)
* Simplify sparse and dense kernels
* Extract row partitioner.

Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>
2021-06-11 18:26:30 +08:00
TP Boudreau
bd2ca543c4
Fix BinarySearchBin() argument types (#7026) 2021-06-08 19:05:46 +08:00
ShvetsKS
5cdaac00c1
Remove feature grouping (#7018)
Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>
2021-06-03 04:35:26 +08:00
Andrew Ziem
3e7e426b36
Fix spelling in documents (#6948)
* Update roxygen2 doc.

Co-authored-by: fis <jm.yuan@outlook.com>
2021-05-11 20:44:36 +08:00
Jiaming Yuan
a2ecbdaa31
Add an API guard to prevent global variables being changed. (#6891) 2021-04-23 10:27:57 +08:00
Jiaming Yuan
1b26a2a561
Copy output data for argsort. (#6866)
Fix GPU AUC.
2021-04-16 21:05:01 +08:00
Jiaming Yuan
f294c4e023
Use constexpr in dh::CopyIf. (#6828) 2021-04-08 07:37:47 +08:00
Jiaming Yuan
7bcc8b3e5c
Use batched copy if. (#6826) 2021-04-06 10:34:04 +08:00
Jiaming Yuan
3039dd194b
Don't estimate sketch batch size when rmm is used. (#6807) 2021-03-31 15:29:56 +08:00
ShvetsKS
8825670c9c
Memory consumption fix for row-major adapters (#6779)
Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>
Co-authored-by: fis <jm.yuan@outlook.com>
2021-03-26 08:44:30 +08:00
Jiaming Yuan
a7083d3c13
Fix dart inplace prediction with GPU input. (#6777)
* Fix dart inplace predict with data on GPU, which might trigger a fatal check
for device access right.
* Avoid copying data whenever possible.
2021-03-25 12:00:32 +08:00
Jiaming Yuan
bcc0277338
Re-implement ROC-AUC. (#6747)
* Re-implement ROC-AUC.

* Binary
* MultiClass
* LTR
* Add documents.

This PR resolves a few issues:
  - Define a value when the dataset is invalid, which can happen if there's an
  empty dataset, or when the dataset contains only positive or negative values.
  - Define ROC-AUC for multi-class classification.
  - Define weighted average value for distributed setting.
  - A correct implementation for learning to rank task.  Previous
  implementation is just binary classification with averaging across groups,
  which doesn't measure ordered learning to rank.
2021-03-20 16:52:40 +08:00
Philip Hyunsu Cho
4230dcb614
Re-introduce double buffer in UpdatePosition, to fix perf regression in gpu_hist (#6757)
* Revert "gpu_hist performance tweaks (#5707)"

This reverts commit f779980f7ea7f6f07e86229b8e78144e8a74e6b3.

* Address reviewer's comment

* Fix build error
2021-03-18 13:56:10 -07:00
Jiaming Yuan
1a73a28511
Add device argsort. (#6749)
This is part of https://github.com/dmlc/xgboost/pull/6747 .
2021-03-16 16:05:22 +08:00
Philip Hyunsu Cho
366f3cb9d8
Add use_rmm flag to global configuration (#6656)
* Ensure RMM is 0.18 or later

* Add use_rmm flag to global configuration

* Modify XGBCachingDeviceAllocatorImpl to skip CUB when use_rmm=True

* Update the demo

* [CI] Pin NumPy to 1.19.4, since NumPy 1.19.5 doesn't work with latest Shap
2021-03-09 14:53:05 -08:00
Louis Desreumaux
9b530e5697
Improve OpenMP exception handling (#6680) 2021-02-25 13:56:16 +08:00
Jiaming Yuan
1b70a323a7
Improve string view to reduce string allocation. (#6644) 2021-01-27 19:08:52 +08:00
Jiaming Yuan
f2f7dd87b8
Use view for SparsePage exclusively. (#6590) 2021-01-11 18:04:55 +08:00
Philip Hyunsu Cho
bf6cfe3b99
[Breaking] Upgrade cuDF and RMM to 0.18 nightlies; require RMM 0.18+ for RMM plugin (#6510)
* [CI] Upgrade cuDF and RMM to 0.18 nightlies

* Modify RMM plugin to be compatible with RMM 0.18

* Update src/common/device_helpers.cuh

Co-authored-by: Mark Harris <mharris@nvidia.com>

Co-authored-by: Mark Harris <mharris@nvidia.com>
2020-12-16 10:07:52 -08:00
Jiaming Yuan
886486a519
Support categorical data in GPU weighted sketching. (#6508) 2020-12-16 14:23:28 +08:00
ShvetsKS
956beead70
Thread local memory allocation for BuildHist (#6358)
* thread mem locality

* fix apply

* cleanup

* fix lint

* fix tests

* simple try

* fix

* fix

* apply comments

* fix comments

* fix

* apply simple comment

Co-authored-by: ShvetsKS <kirill.shvets@intel.com>
2020-11-25 17:50:12 +03:00