995 Commits

Author SHA1 Message Date
Jiaming Yuan
bed7ae4083
Loop over thrust::reduce. (#6229)
* Check input chunk size of dqdm.
* Add doc for current limitation.
2020-10-14 10:40:56 +13:00
Rory Mitchell
734a911a26
Loop over copy_if (#6201)
* Loop over copy_if

* Catch OOM.

Co-authored-by: fis <jm.yuan@outlook.com>
2020-10-14 10:23:16 +13:00
Jiaming Yuan
b05073bda5
[dask] Test for data initializaton. (#6226) 2020-10-13 11:08:35 +08:00
Jiaming Yuan
70c2039748
Catch all standard exceptions in C API. (#6220)
* `std::bad_alloc` is not guaranteed to be caught.
2020-10-12 14:01:46 +08:00
Jiaming Yuan
2241563f23
Handle duplicated values in sketching. (#6178)
* Accumulate weights in duplicated values.
* Fix device id in iterative dmatrix.
2020-10-10 19:32:44 +08:00
Jiaming Yuan
b5b24354b8
More categorical tests and disable shap sparse test. (#6219)
* Fix tree load with 32 category.
2020-10-10 16:12:37 +08:00
Jiaming Yuan
70ce5216b5
Add high level tests for categorical data. (#6179)
* Fix unique.
2020-10-09 09:27:23 +08:00
vcarpani
6bc9747df5
Reduce compile warnings (#6198)
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-10-08 23:14:59 +08:00
ShvetsKS
a4ce0eae43
CPU predict performance improvement (#6127)
Co-authored-by: ShvetsKS <kirill.shvets@intel.com>
2020-10-08 15:50:21 +03:00
Christian Lorentzen
cf4f019ed6
[Breaking] Change default evaluation metric for classification to logloss / mlogloss (#6183)
* Change DefaultEvalMetric of classification from error to logloss

* Change default binary metric in plugin/example/custom_obj.cc

* Set old error metric in python tests

* Set old error metric in R tests

* Fix missed eval metrics and typos in R tests

* Fix setting eval_metric twice in R tests

* Add warning for empty eval_metric for classification

* Fix Dask tests

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-10-02 12:06:47 -07:00
Jiaming Yuan
f0c63902ff
Use default allocator in sketching. (#6182) 2020-09-30 14:55:59 +08:00
Jiaming Yuan
444131a2e6
Add categorical data support to GPU Hist. (#6164) 2020-09-29 11:27:25 +08:00
Jiaming Yuan
798af22ff4
Add categorical data support to GPU predictor. (#6165) 2020-09-29 11:25:34 +08:00
Jiaming Yuan
52c0b3f100
Fix error message. (#6176) 2020-09-29 11:18:25 +08:00
Rory Mitchell
dda9e1e487
Update GPUTreeshap (#6163)
* Reduce shap test duration

* Test interoperability with shap package

* Add feature interactions

* Update GPUTreeShap
2020-09-28 09:43:47 +13:00
Jiaming Yuan
07355599c2
Option for generating device debug info. (#6168)
* Supply `-G;-src-in-ptx` when `USE_DEVICE_DEBUG` is set and debug mode is selected.
* Refactor CMake script to gather all CUDA configuration.
* Use CMAKE_CUDA_ARCHITECTURES.  Close #6029.
* Add compute 80.  Close #5999
2020-09-27 03:26:56 +08:00
Philip Hyunsu Cho
72ef553550
Fall back to CUB allocator if RMM memory pool is not set up (#6150)
* Fall back to CUB allocator if RMM memory pool is not set up

* Fix build

* Prevent memory leak

* Add note about lack of memory initialisation

* Add check for other fast allocators

* Set use_cub_allocator_ to true when RMM is not enabled

* Fix clang-tidy

* Do not demangle symbol; add check to ensure Linux+Clang/GCC combo
2020-09-24 11:04:50 -07:00
Jiaming Yuan
14afdb4d92
Support categorical data in ellpack. (#6140) 2020-09-24 19:28:57 +08:00
Jiaming Yuan
7065779afa
Improve JSON format for categorical features. (#6128)
* Gather categories for all nodes.
2020-09-21 15:35:05 +08:00
Jiaming Yuan
210c131ce7
Support categorical data in GPU sketching. (#6137) 2020-09-21 13:53:06 +08:00
Jiaming Yuan
e319b63f9e
Merge extract cuts into QuantileContainer. (#6125)
* Use pruning for initial summary construction.
2020-09-18 16:36:39 +08:00
Jiaming Yuan
5384ed85c8
Use caching allocator from RMM, when RMM is enabled (#6131) 2020-09-17 21:51:49 -07:00
Philip Hyunsu Cho
9e955fb9b0
[R] Check warnings explicitly for model compatibility tests (#6114)
* [R] Check warnings explicitly for model compatibility tests

* Address reviewer's feedback
2020-09-15 10:49:48 -07:00
Philip Hyunsu Cho
33577ef5d3
Add MAPE metric (#6119) 2020-09-14 18:45:27 -07:00
Jiaming Yuan
b5f52f0b1b
Validate weights are positive values. (#6115) 2020-09-15 09:03:55 +08:00
Jiaming Yuan
c6f2b8c841
Upgrade gputreeshap. (#6099)
* Upgrade gputreeshap.

Co-authored-by: Rory Mitchell <r.a.mitchell.nz@gmail.com>
2020-09-15 12:57:22 +12:00
Philip Hyunsu Cho
d0ccb13d09
Work around a compiler bug in MacOS AppleClang 11 (#6103)
* Workaround a compiler bug in MacOS AppleClang

* [CI] Run C++ test with MacOS Catalina + AppleClang 11.0.3

* [CI] Migrate cmake_test on MacOS from Travis CI to GitHub Actions

* Install OpenMP runtime

* [CI] Use CMake to locate lz4 lib
2020-09-09 21:21:55 -07:00
Jiaming Yuan
93e9af43bb
Unify set index data. (#6062) 2020-09-08 11:38:41 +08:00
Jiaming Yuan
e5d40b39cd
[Breaking] Don't save leaf child count in JSON. (#6094)
The field is deprecated and not used anywhere in XGBoost.
2020-09-08 11:11:13 +08:00
Jiaming Yuan
5994f3b14c
Don't link imported target. (#6093) 2020-09-07 02:51:09 -07:00
Rory Mitchell
2e907abdb8
Updates to GPUTreeShap (#6087)
* Extract paths on device

* Update GPUTreeShap
2020-09-06 13:39:08 +12:00
Rory Mitchell
9bddecee05
Update GPUTreeShap (#6064)
* Update GPUTreeShap

* Update src/CMakeLists.txt

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-27 12:01:53 -07:00
Jiaming Yuan
2fcc4f2886
Unify evaluation functions. (#6037) 2020-08-26 14:23:27 +08:00
Jiaming Yuan
80c8547147
Make binary bin search reusable. (#6058)
* Move binary search row to hist util.
* Remove dead code.
2020-08-26 05:05:11 +08:00
Jiaming Yuan
20c95be625
Expand categorical node. (#6028)
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-25 18:53:57 +08:00
Rory Mitchell
9a4e8b1d81
GPUTreeShap (#6038) 2020-08-25 12:47:41 +12:00
Jiaming Yuan
a144daf034
Limit tree depth for GPU hist. (#6045) 2020-08-22 19:34:52 +08:00
ShvetsKS
24f2e6c97e
Optimize DMatrix build time. (#5877)
Co-authored-by: SHVETS, KIRILL <kirill.shvets@intel.com>
2020-08-20 01:37:03 +08:00
Jiaming Yuan
29b7fea572
Optimize cpu sketch allreduce for sparse data. (#6009)
* Bypass RABIT serialization reducer and use custom allgather based merging.
2020-08-19 10:03:45 +08:00
Qi Zhang
989ddd036f
Swap byte-order in binary serializer to support big-endian arch (#5813)
* fixed some endian issues

* Use dmlc::ByteSwap() to simplify code

* Fix lint check

* [CI] Add test for s390x

* Download latest CMake on s390x

* Fix a bug in my code

* Save magic number in dmatrix with byteswap on big-endian machine

* Save version in binary with byteswap on big-endian machine

* Load scalar with byteswap in MetaInfo

* Add a debugging message

* Handle arrays correctly when byteswapping

* EOF can also be 255

* Handle magic number in MetaInfo carefully

* Skip Tree.Load test for big-endian, since the test manually builds little-endian binary model

* Handle missing packages in Python tests

* Don't use boto3 in model compatibility tests

* Add s390 Docker file for local testing

* Add model compatibility tests

* Add R compatibility test

* Revert "Add R compatibility test"

This reverts commit c2d2bdcb7dbae133cbb927fcd20f7e83ee2b18a8.

Co-authored-by: Qi Zhang <q.zhang@ibm.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-18 14:47:17 -07:00
Jiaming Yuan
4d99c58a5f
Feature weights (#5962) 2020-08-18 19:55:41 +08:00
Jiaming Yuan
d240463b38
Revert "Remove warning about memset. (#6003)" (#6020)
This reverts commit 12e3fb6a6cb601c58d39ed54253e6e50f1513ccc.
2020-08-17 20:10:15 +08:00
Jiaming Yuan
12e3fb6a6c
Remove warning about memset. (#6003) 2020-08-13 08:25:46 +08:00
Philip Hyunsu Cho
9adb812a0a
RMM integration plugin (#5873)
* [CI] Add RMM as an optional dependency

* Replace caching allocator with pool allocator from RMM

* Revert "Replace caching allocator with pool allocator from RMM"

This reverts commit e15845d4e72e890c2babe31a988b26503a7d9038.

* Use rmm::mr::get_default_resource()

* Try setting default resource (doesn't work yet)

* Allocate pool_mr in the heap

* Prevent leaking pool_mr handle

* Separate EXPECT_DEATH() in separate test suite suffixed DeathTest

* Turn off death tests for RMM

* Address reviewer's feedback

* Prevent leaking of cuda_mr

* Fix Jenkinsfile syntax

* Remove unnecessary function in Jenkinsfile

* [CI] Install NCCL into RMM container

* Run Python tests

* Try building with RMM, CUDA 10.0

* Do not use RMM for CUDA 10.0 target

* Actually test for test_rmm flag

* Fix TestPythonGPU

* Use CNMeM allocator, since pool allocator doesn't yet support multiGPU

* Use 10.0 container to build RMM-enabled XGBoost

* Revert "Use 10.0 container to build RMM-enabled XGBoost"

This reverts commit 789021fa31112e25b683aef39fff375403060141.

* Fix Jenkinsfile

* [CI] Assign larger /dev/shm to NCCL

* Use 10.2 artifact to run multi-GPU Python tests

* Add CUDA 10.0 -> 11.0 cross-version test; remove CUDA 10.0 target

* Rename Conda env rmm_test -> gpu_test

* Use env var to opt into CNMeM pool for C++ tests

* Use identical CUDA version for RMM builds and tests

* Use Pytest fixtures to enable RMM pool in Python tests

* Move RMM to plugin/CMakeLists.txt; use PLUGIN_RMM

* Use per-device MR; use command arg in gtest

* Set CMake prefix path to use Conda env

* Use 0.15 nightly version of RMM

* Remove unnecessary header

* Fix a unit test when cudf is missing

* Add RMM demos

* Remove print()

* Use HostDeviceVector in GPU predictor

* Simplify pytest setup; use LocalCUDACluster fixture

* Address reviewers' commments

Co-authored-by: Hyunsu Cho <chohyu01@cs.wasshington.edu>
2020-08-12 01:26:02 -07:00
Jiaming Yuan
ee70a2380b
Unify CPU hist sketching (#5880) 2020-08-12 01:33:06 +08:00
Jiaming Yuan
6f7112a848
Move warning about empty dataset. (#5998) 2020-08-11 14:10:51 +08:00
Jiaming Yuan
0b2a26fa74
Remove skmaker. (#5971) 2020-08-09 15:23:31 +08:00
Vladislav Epifanov
388f975cf5
Introducing DPC++-based plugin (predictor, objective function) supporting oneAPI programming model (#5825)
* Added plugin with DPC++-based predictor and objective function

* Update CMakeLists.txt

* Update regression_obj_oneapi.cc

* Added README.md for OneAPI plugin

* Added OneAPI predictor support to gbtree

* Update README.md

* Merged kernels in gradient computation. Enabled multiple loss functions with DPC++ backend

* Aligned plugin CMake files with latest master changes. Fixed whitespace typos

* Removed debug output

* [CI] Make oneapi_plugin a CMake target

* Added tests for OneAPI plugin for predictor and obj. functions

* Temporarily switched to default selector for device dispacthing in OneAPI plugin to enable execution in environments without gpus

* Updated readme file.

* Fixed USM usage in predictor

* Removed workaround with explicit templated names for DPC++ kernels

* Fixed warnings in plugin tests

* Fix CMake build of gtest

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-08 18:40:40 -07:00
Jiaming Yuan
9c6e791e64
Enforce tree order in JSON. (#5974)
* Make JSON model IO more future proof by using tree id in model loading.
2020-08-05 16:44:52 +08:00
Philip Hyunsu Cho
3fcfaad577
Add CMake flag to log C API invocations, to aid debugging (#5925)
* Add CMake flag to log C API invocations, to aid debugging

* Remove unnecessary parentheses
2020-07-30 19:24:28 -07:00