638 Commits

Author SHA1 Message Date
Jiaming Yuan
c92d751ad1
Enable building rabit on Windows (#6105) 2020-09-11 11:54:46 +08:00
Philip Hyunsu Cho
d0ccb13d09
Work around a compiler bug in MacOS AppleClang 11 (#6103)
* Workaround a compiler bug in MacOS AppleClang

* [CI] Run C++ test with MacOS Catalina + AppleClang 11.0.3

* [CI] Migrate cmake_test on MacOS from Travis CI to GitHub Actions

* Install OpenMP runtime

* [CI] Use CMake to locate lz4 lib
2020-09-09 21:21:55 -07:00
Jiaming Yuan
3dcd85fab5
Refactor rabit tests (#6096)
* Merge rabit tests into XGBoost.
* Run them On CI.
* Simplification for CMake scripts.
2020-09-09 12:30:29 +08:00
ShvetsKS
c1ca872d1e
Modin DF support (#6055)
* Modin DF support

* mode change

* tests were added, ci env was extended

* mode change

* Remove redundant installation of modin

* Add a pytest skip marker for modin

* Install Modin[ray] from PyPI

* fix interfering

* avoid extra conversion

* delete cv test for modin

* revert cv function

Co-authored-by: ShvetsKS <kirill.shvets@intel.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-29 22:33:30 +03:00
Jiaming Yuan
2fcc4f2886
Unify evaluation functions. (#6037) 2020-08-26 14:23:27 +08:00
Jiaming Yuan
20c95be625
Expand categorical node. (#6028)
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-25 18:53:57 +08:00
Rory Mitchell
9a4e8b1d81
GPUTreeShap (#6038) 2020-08-25 12:47:41 +12:00
Philip Hyunsu Cho
cfced58c1c
[CI] Port CI fixes from the 1.2.0 branch (#6050)
* Fix a unit test on CLI, to handle RC versions

* [CI] Use mgpu machine to run gpu hist unit tests

* [CI] Build GPU-enabled JAR artifact and deploy to xgboost-maven-repo
2020-08-22 23:24:46 -07:00
Jiaming Yuan
a144daf034
Limit tree depth for GPU hist. (#6045) 2020-08-22 19:34:52 +08:00
Jiaming Yuan
b9ebbffc57
Fix plotting test. (#6040)
Previously the test loads a model generated by `test_basic.py`, now we generate
the model explicitly.

* Cleanup saved files for basic tests.
2020-08-22 13:18:48 +08:00
Philip Hyunsu Cho
1fd29edf66
[CI] Migrate linters to GitHub Actions (#6035)
* [CI] Move lint to GitHub Actions

* [CI] Move Doxygen to GitHub Actions

* [CI] Move Sphinx build test to GitHub Actions

* [CI] Reduce workload for Windows R tests

* [CI] Move clang-tidy to Build stage
2020-08-19 12:33:51 -07:00
Jiaming Yuan
29b7fea572
Optimize cpu sketch allreduce for sparse data. (#6009)
* Bypass RABIT serialization reducer and use custom allgather based merging.
2020-08-19 10:03:45 +08:00
Jiaming Yuan
90355b4f00
Make JSON the default full serialization format. (#6027) 2020-08-19 09:57:43 +08:00
Qi Zhang
989ddd036f
Swap byte-order in binary serializer to support big-endian arch (#5813)
* fixed some endian issues

* Use dmlc::ByteSwap() to simplify code

* Fix lint check

* [CI] Add test for s390x

* Download latest CMake on s390x

* Fix a bug in my code

* Save magic number in dmatrix with byteswap on big-endian machine

* Save version in binary with byteswap on big-endian machine

* Load scalar with byteswap in MetaInfo

* Add a debugging message

* Handle arrays correctly when byteswapping

* EOF can also be 255

* Handle magic number in MetaInfo carefully

* Skip Tree.Load test for big-endian, since the test manually builds little-endian binary model

* Handle missing packages in Python tests

* Don't use boto3 in model compatibility tests

* Add s390 Docker file for local testing

* Add model compatibility tests

* Add R compatibility test

* Revert "Add R compatibility test"

This reverts commit c2d2bdcb7dbae133cbb927fcd20f7e83ee2b18a8.

Co-authored-by: Qi Zhang <q.zhang@ibm.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-18 14:47:17 -07:00
Jiaming Yuan
4d99c58a5f
Feature weights (#5962) 2020-08-18 19:55:41 +08:00
Philip Hyunsu Cho
14d5ce712c
[CI] Fix Dask Pytest fixture (#6024) 2020-08-17 16:45:22 -07:00
Jiaming Yuan
674c409e9d
Remove rabit dependency on public headers. (#6005) 2020-08-13 08:26:20 +08:00
Philip Hyunsu Cho
9adb812a0a
RMM integration plugin (#5873)
* [CI] Add RMM as an optional dependency

* Replace caching allocator with pool allocator from RMM

* Revert "Replace caching allocator with pool allocator from RMM"

This reverts commit e15845d4e72e890c2babe31a988b26503a7d9038.

* Use rmm::mr::get_default_resource()

* Try setting default resource (doesn't work yet)

* Allocate pool_mr in the heap

* Prevent leaking pool_mr handle

* Separate EXPECT_DEATH() in separate test suite suffixed DeathTest

* Turn off death tests for RMM

* Address reviewer's feedback

* Prevent leaking of cuda_mr

* Fix Jenkinsfile syntax

* Remove unnecessary function in Jenkinsfile

* [CI] Install NCCL into RMM container

* Run Python tests

* Try building with RMM, CUDA 10.0

* Do not use RMM for CUDA 10.0 target

* Actually test for test_rmm flag

* Fix TestPythonGPU

* Use CNMeM allocator, since pool allocator doesn't yet support multiGPU

* Use 10.0 container to build RMM-enabled XGBoost

* Revert "Use 10.0 container to build RMM-enabled XGBoost"

This reverts commit 789021fa31112e25b683aef39fff375403060141.

* Fix Jenkinsfile

* [CI] Assign larger /dev/shm to NCCL

* Use 10.2 artifact to run multi-GPU Python tests

* Add CUDA 10.0 -> 11.0 cross-version test; remove CUDA 10.0 target

* Rename Conda env rmm_test -> gpu_test

* Use env var to opt into CNMeM pool for C++ tests

* Use identical CUDA version for RMM builds and tests

* Use Pytest fixtures to enable RMM pool in Python tests

* Move RMM to plugin/CMakeLists.txt; use PLUGIN_RMM

* Use per-device MR; use command arg in gtest

* Set CMake prefix path to use Conda env

* Use 0.15 nightly version of RMM

* Remove unnecessary header

* Fix a unit test when cudf is missing

* Add RMM demos

* Remove print()

* Use HostDeviceVector in GPU predictor

* Simplify pytest setup; use LocalCUDACluster fixture

* Address reviewers' commments

Co-authored-by: Hyunsu Cho <chohyu01@cs.wasshington.edu>
2020-08-12 01:26:02 -07:00
Jiaming Yuan
ee70a2380b
Unify CPU hist sketching (#5880) 2020-08-12 01:33:06 +08:00
jameskrach
bd6b7f4aa7
[Breaking] Fix .predict() method and add .predict_proba() in xgboost.dask.DaskXGBClassifier (#5986) 2020-08-11 16:11:28 +08:00
Vladislav Epifanov
388f975cf5
Introducing DPC++-based plugin (predictor, objective function) supporting oneAPI programming model (#5825)
* Added plugin with DPC++-based predictor and objective function

* Update CMakeLists.txt

* Update regression_obj_oneapi.cc

* Added README.md for OneAPI plugin

* Added OneAPI predictor support to gbtree

* Update README.md

* Merged kernels in gradient computation. Enabled multiple loss functions with DPC++ backend

* Aligned plugin CMake files with latest master changes. Fixed whitespace typos

* Removed debug output

* [CI] Make oneapi_plugin a CMake target

* Added tests for OneAPI plugin for predictor and obj. functions

* Temporarily switched to default selector for device dispacthing in OneAPI plugin to enable execution in environments without gpus

* Updated readme file.

* Fixed USM usage in predictor

* Removed workaround with explicit templated names for DPC++ kernels

* Fixed warnings in plugin tests

* Fix CMake build of gtest

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-08 18:40:40 -07:00
Jiaming Yuan
801e6b6800
Fix dask predict shape infer. (#5989) 2020-08-08 14:29:22 +08:00
Jiaming Yuan
9c6e791e64
Enforce tree order in JSON. (#5974)
* Make JSON model IO more future proof by using tree id in model loading.
2020-08-05 16:44:52 +08:00
Jiaming Yuan
dde9c5aaff
Fix missing data warning. (#5969)
* Fix data warning.

* Add numpy/scipy test.
2020-08-05 16:19:12 +08:00
Jiaming Yuan
8599f87597
Update JSON schema. (#5982)
* Update JSON schema for pseudo huber.
* Update JSON model schema.
2020-08-05 15:21:11 +08:00
Jiaming Yuan
9c93531709
Update Python custom objective demo. (#5981) 2020-08-05 12:27:19 +08:00
Philip Hyunsu Cho
bf2990e773
Add missing Pytest marks to AsyncIO unit test (#5968) 2020-08-01 10:56:24 +08:00
boxdot
d268a2a463
Thread-safe prediction by making the prediction cache thread-local. (#5853)
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2020-07-30 12:33:50 +08:00
Jiaming Yuan
fa3715f584
[Dask] Asyncio support. (#5862) 2020-07-30 06:23:58 +08:00
Philip Hyunsu Cho
071e10c1d1
[CI] Fix broken Docker container 'cpu' (#5956) 2020-07-29 04:29:57 -07:00
Jiaming Yuan
f5fdcbe194
Disable feature validation on sklearn predict prob. (#5953)
* Fix issue when scikit learn interface receives transformed inputs.
2020-07-29 19:26:44 +08:00
Jiaming Yuan
18349a7ccf
[Breaking] Fix custom metric for multi output. (#5954)
* Set output margin to true for custom metric.  This fixes only R and Python.
2020-07-29 19:25:27 +08:00
Jiaming Yuan
75b8c22b0b
Fix prediction heuristic (#5955)
* Relax check for prediction.
* Relax test in spark test.
* Add tests in C++.
2020-07-29 19:24:07 +08:00
Philip Hyunsu Cho
5879acde9a
[CI] Improve R linter script (#5944)
* [CI] Move lint to a separate script

* [CI] Improved lintr launcher

* Add lintr as a separate action

* Add custom parsing logic to print out logs

* Fix lintr issues in demos

* Run R demos

* Fix CRAN checks

* Install XGBoost into R env before running lintr

* Install devtools (needed to run demos)
2020-07-27 00:55:35 -07:00
Bobby Wang
8943eb4314
[BLOCKING] [jvm-packages] add gpu_hist and enable gpu scheduling (#5171)
* [jvm-packages] add gpu_hist tree method

* change updater hist to grow_quantile_histmaker

* add gpu scheduling

* pass correct parameters to xgboost library

* remove debug info

* add use.cuda for pom

* add CI for gpu_hist for jvm

* add gpu unit tests

* use gpu node to build jvm

* use nvidia-docker

* Add CLI interface to create_jni.py using argparse

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-07-26 21:53:24 -07:00
Jiaming Yuan
40361043ae
[BLOCKING] Remove to_string. (#5934) 2020-07-26 10:21:26 +08:00
Philip Hyunsu Cho
12110c900e
[CI] Make Python model compatibility test runnable locally (#5941) 2020-07-25 16:58:02 -07:00
Philip Hyunsu Cho
487ab0ce73
[BLOCKING] Handle empty rows in data iterators correctly (#5929)
* [jvm-packages] Handle empty rows in data iterators correctly

* Fix clang-tidy error

* last empty row

* Add comments [skip ci]

Co-authored-by: Nan Zhu <nanzhu@uber.com>
2020-07-25 13:46:19 -07:00
Jiaming Yuan
bc1d3ee230
Fix r early stop with custom objective. (#5923)
* Specify `ntreelimit`.
2020-07-23 03:28:17 +08:00
Jiaming Yuan
66cc1e02aa
Setup github action. (#5917) 2020-07-22 15:05:25 +08:00
Philip Hyunsu Cho
627cf41a60
Add option to enable all compiler warnings in GCC/Clang (#5897)
* Add option to enable all compiler warnings in GCC/Clang

* Fix -Wall for CUDA sources

* Make -Wall private req for xgboost-r
2020-07-21 23:34:03 -07:00
Jiaming Yuan
9b688aca3b
Fix mingw build with R. (#5918) 2020-07-22 02:56:49 +08:00
Andy Adinets
b3d2e7644a
Support building XGBoost with CUDA 11 (#5808)
* Change serialization test.
* Add CUDA 11 tests on Linux CI.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-07-20 07:58:41 +08:00
Philip Hyunsu Cho
ac9136ee49
Further improvements and savings in Jenkins pipeline (#5904)
* Publish artifacts only on the master and release branches

* Build CUDA only for Compute Capability 7.5 when building PRs

* Run all Windows jobs in a single worker image

* Build nightly XGBoost4J SNAPSHOT JARs with Scala 2.12 only

* Show skipped Python tests on Windows

* Make Graphviz optional for Python tests

* Add back C++ tests

* Unstash xgboost_cpp_tests

* Fix label to CUDA 10.1

* Install cuPy for CUDA 10.1

* Install jsonschema

* Address reviewer's feedback
2020-07-18 03:30:40 -07:00
Philip Hyunsu Cho
71b0528a2f
GPU implementation of AFT survival objective and metric (#5714)
* Add interval accuracy

* De-virtualize AFT functions

* Lint

* Refactor AFT metric using GPU-CPU reducer

* Fix R build

* Fix build on Windows

* Fix copyright header

* Clang-tidy

* Fix crashing demo

* Fix typos in comment; explain GPU ID

* Remove unnecessary #include

* Add C++ test for interval accuracy

* Fix a bug in accuracy metric: use log pred

* Refactor AFT objective using GPU-CPU Transform

* Lint

* Fix lint

* Use Ninja to speed up build

* Use time, not /usr/bin/time

* Add cpu_build worker class, with concurrency = 1

* Use concurrency = 1 only for CUDA build

* concurrency = 1 for clang-tidy

* Address reviewer's feedback

* Update link to AFT paper
2020-07-17 01:18:13 -07:00
Jiaming Yuan
7c2686146e
Dask device dmatrix (#5901)
* Fix softprob with empty dmatrix.
2020-07-17 13:17:43 +08:00
Jiaming Yuan
e471056ec4
Fix sketch size calculation. (#5898) 2020-07-17 08:33:16 +08:00
Bobby Wang
730866a7bc
[CI] update spark version to 3.0.0 (#5890)
* [CI] update spark version to 3.0.0

* Update Dockerfile.jvm_cross

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-07-16 00:23:44 -07:00
Jiaming Yuan
029a8b533f
Simplify the data backends. (#5893) 2020-07-16 15:17:31 +08:00
Alexander Gugel
970b4b3fa2
Add XGBoosterGetNumFeature (#5856)
- add GetNumFeature to Learner
- add XGBoosterGetNumFeature to C API
- update c-api-demo accordingly
2020-07-13 23:25:17 -07:00