604 Commits

Author SHA1 Message Date
Bobby Wang
8943eb4314
[BLOCKING] [jvm-packages] add gpu_hist and enable gpu scheduling (#5171)
* [jvm-packages] add gpu_hist tree method

* change updater hist to grow_quantile_histmaker

* add gpu scheduling

* pass correct parameters to xgboost library

* remove debug info

* add use.cuda for pom

* add CI for gpu_hist for jvm

* add gpu unit tests

* use gpu node to build jvm

* use nvidia-docker

* Add CLI interface to create_jni.py using argparse

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-07-26 21:53:24 -07:00
Jiaming Yuan
40361043ae
[BLOCKING] Remove to_string. (#5934) 2020-07-26 10:21:26 +08:00
Philip Hyunsu Cho
12110c900e
[CI] Make Python model compatibility test runnable locally (#5941) 2020-07-25 16:58:02 -07:00
Philip Hyunsu Cho
487ab0ce73
[BLOCKING] Handle empty rows in data iterators correctly (#5929)
* [jvm-packages] Handle empty rows in data iterators correctly

* Fix clang-tidy error

* last empty row

* Add comments [skip ci]

Co-authored-by: Nan Zhu <nanzhu@uber.com>
2020-07-25 13:46:19 -07:00
Jiaming Yuan
bc1d3ee230
Fix r early stop with custom objective. (#5923)
* Specify `ntreelimit`.
2020-07-23 03:28:17 +08:00
Jiaming Yuan
66cc1e02aa
Setup github action. (#5917) 2020-07-22 15:05:25 +08:00
Philip Hyunsu Cho
627cf41a60
Add option to enable all compiler warnings in GCC/Clang (#5897)
* Add option to enable all compiler warnings in GCC/Clang

* Fix -Wall for CUDA sources

* Make -Wall private req for xgboost-r
2020-07-21 23:34:03 -07:00
Jiaming Yuan
9b688aca3b
Fix mingw build with R. (#5918) 2020-07-22 02:56:49 +08:00
Andy Adinets
b3d2e7644a
Support building XGBoost with CUDA 11 (#5808)
* Change serialization test.
* Add CUDA 11 tests on Linux CI.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-07-20 07:58:41 +08:00
Philip Hyunsu Cho
ac9136ee49
Further improvements and savings in Jenkins pipeline (#5904)
* Publish artifacts only on the master and release branches

* Build CUDA only for Compute Capability 7.5 when building PRs

* Run all Windows jobs in a single worker image

* Build nightly XGBoost4J SNAPSHOT JARs with Scala 2.12 only

* Show skipped Python tests on Windows

* Make Graphviz optional for Python tests

* Add back C++ tests

* Unstash xgboost_cpp_tests

* Fix label to CUDA 10.1

* Install cuPy for CUDA 10.1

* Install jsonschema

* Address reviewer's feedback
2020-07-18 03:30:40 -07:00
Philip Hyunsu Cho
71b0528a2f
GPU implementation of AFT survival objective and metric (#5714)
* Add interval accuracy

* De-virtualize AFT functions

* Lint

* Refactor AFT metric using GPU-CPU reducer

* Fix R build

* Fix build on Windows

* Fix copyright header

* Clang-tidy

* Fix crashing demo

* Fix typos in comment; explain GPU ID

* Remove unnecessary #include

* Add C++ test for interval accuracy

* Fix a bug in accuracy metric: use log pred

* Refactor AFT objective using GPU-CPU Transform

* Lint

* Fix lint

* Use Ninja to speed up build

* Use time, not /usr/bin/time

* Add cpu_build worker class, with concurrency = 1

* Use concurrency = 1 only for CUDA build

* concurrency = 1 for clang-tidy

* Address reviewer's feedback

* Update link to AFT paper
2020-07-17 01:18:13 -07:00
Jiaming Yuan
7c2686146e
Dask device dmatrix (#5901)
* Fix softprob with empty dmatrix.
2020-07-17 13:17:43 +08:00
Jiaming Yuan
e471056ec4
Fix sketch size calculation. (#5898) 2020-07-17 08:33:16 +08:00
Bobby Wang
730866a7bc
[CI] update spark version to 3.0.0 (#5890)
* [CI] update spark version to 3.0.0

* Update Dockerfile.jvm_cross

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-07-16 00:23:44 -07:00
Jiaming Yuan
029a8b533f
Simplify the data backends. (#5893) 2020-07-16 15:17:31 +08:00
Alexander Gugel
970b4b3fa2
Add XGBoosterGetNumFeature (#5856)
- add GetNumFeature to Learner
- add XGBoosterGetNumFeature to C API
- update c-api-demo accordingly
2020-07-13 23:25:17 -07:00
Philip Hyunsu Cho
e0c179c7cc
[CI] Enforce daily budget in Jenkins CI (#5884)
* [CI] Throttle Jenkins CI

* Don't use Jenkins master instance
2020-07-13 21:51:11 -07:00
Jiaming Yuan
dd445af56e
Cleanup on device sketch. (#5874)
* Remove old functions.

* Merge weighted and un-weighted into a common interface.
2020-07-14 10:15:54 +08:00
Philip Hyunsu Cho
0d411b0397
[CI] Simplify CMake build with modern CMake techniques (#5871)
* [CI] Simplify CMake build

* Make sure that plugins can be built

* [CI] Install lz4 on Mac
2020-07-08 04:23:24 -07:00
Rong Ou
06320729d4
fix device sketch with weights in external memory mode (#5870) 2020-07-08 08:44:07 +08:00
Jiaming Yuan
a3ec964346
Accept iterator in device dmatrix. (#5783)
* Remove Device DMatrix.
2020-07-07 21:44:48 +08:00
Jiaming Yuan
048d969be4
Implement GK sketching on GPU. (#5846)
* Implement GK sketching on GPU.
* Strong tests on quantile building.
* Handle sparse dataset by binary searching the column index.
* Hypothesis test on dask.
2020-07-07 12:16:21 +08:00
Andy Adinets
ac3f0e78dc
Split Features into Groups to Compute Histograms in Shared Memory (#5795) 2020-07-07 15:04:35 +12:00
Jiaming Yuan
93c44a9a64
Move feature names and types of DMatrix from Python to C++. (#5858)
* Add thread local return entry for DMatrix.
* Save feature name and feature type in binary file.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-07-07 09:40:13 +08:00
Jiaming Yuan
4b0852ee41
Use dmlc stream when URI protocol is not local file. (#5857) 2020-07-07 03:07:12 +08:00
Jiaming Yuan
1a0801238e
Implement iterative DMatrix. (#5837) 2020-07-03 11:44:52 +08:00
Jiaming Yuan
4d277d750d
Relax linear test. (#5849)
* Increased error in coordinate is mostly due to floating point error.
* Shotgun uses Hogwild!, which is non-deterministic and can have even greater
floating point error.
2020-07-03 07:49:53 +08:00
Jiaming Yuan
eb067c1c34
Relax test for shotgun. (#5835) 2020-07-01 19:20:29 +08:00
Jiaming Yuan
90a9c68874
Implement a DMatrix Proxy. (#5803) 2020-06-29 15:03:10 +08:00
Jiaming Yuan
47c89775d6
Accept string for ArrayInterface constructor. (#5799) 2020-06-27 00:06:54 +08:00
Jiaming Yuan
c4d721200a
Implement extend method for meta info. (#5800)
* Implement extend for host device vector.
2020-06-20 03:32:03 +08:00
Philip Hyunsu Cho
a6d9a06b7b
[CI] Fix cuDF install; merge 'gpu' and 'cudf' test suite (#5814) 2020-06-19 16:42:57 +08:00
Philip Hyunsu Cho
a67bc64819
Add an option to run brute-force test for JSON round-trip (#5804)
* Add an option to run brute-force test for JSON round-trip

* Apply reviewer's feedback

* Remove unneeded objects

* Parallel run.

* Max.

* Use signed 64-bit loop var, to support MSVC

* Add exhaustive test to CI

* Run JSON test in Win build worker

* Revert "Run JSON test in Win build worker"

This reverts commit c97b2c7dda37b3585b445d36961605b79552ca89.

* Revert "Add exhaustive test to CI"

This reverts commit c149c2ce9971a07a7289f9b9bc247818afd5a667.

Co-authored-by: fis <jm.yuan@outlook.com>
2020-06-17 23:46:02 -07:00
Rory Mitchell
abdf894fcf
Add cupy to Windows CI (#5797)
* Add cupy to Windows CI

* Update Jenkinsfile-win64

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>

* Update Jenkinsfile-win64

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>

* Update tests/python-gpu/test_gpu_prediction.py

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-06-17 21:55:09 -07:00
Jiaming Yuan
38ee514787
Implement fast number serialization routines. (#5772)
* Implement ryu algorithm.
* Implement integer printing.
* Full coverage roundtrip test.
2020-06-17 12:39:23 +08:00
fis
7c3a168ffd Revert "Accept string for ArrayInterface constructor."
This reverts commit e8ecafb8dc628f45b75b4c2844a236d27e0a6d98.
2020-06-16 20:02:35 +08:00
fis
e8ecafb8dc Accept string for ArrayInterface constructor. 2020-06-16 20:00:24 +08:00
Rory Mitchell
b47b5ac771
Use hypothesis (#5759)
* Use hypothesis

* Allow int64 array interface for groups

* Add packages to Windows CI

* Add to travis

* Make sure device index is set correctly

* Fix dask-cudf test

* appveyor
2020-06-16 12:45:59 +12:00
Alex
ae18a094b0
Add new skl model attribute for number of features (#5780) 2020-06-15 18:01:59 +08:00
Jiaming Yuan
1fa84b61c1
Implement Empty method for host device vector. (#5781)
* Fix accessing nullptr.
2020-06-13 19:02:26 +08:00
Jiaming Yuan
306e38ff31
Avoid including c_api.h in header files. (#5782) 2020-06-12 16:24:24 +08:00
Jiaming Yuan
3028fa6b42
Implement weighted sketching for adapter. (#5760)
* Bounded memory tests.
* Fixed memory estimation.
2020-06-12 06:20:39 +08:00
James Lamb
c96e1ef283
[python-package] remove unused imports (#5776) 2020-06-11 16:50:27 +08:00
Jiaming Yuan
cacff9232a
Remove column major specialization. (#5755)
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-06-05 16:19:14 +08:00
Jiaming Yuan
bd9d57f579
Add helper for generating batches of data. (#5756)
* Add helper for generating batches of data.

* VC keyword clash.

* Another clash.
2020-06-05 09:53:56 +08:00
Rory Mitchell
359023c0fa
Speed up python test (#5752)
* Speed up tests

* Prevent DeviceQuantileDMatrix initialisation with numpy

* Use joblib.memory

* Use RandomState
2020-06-05 11:39:24 +12:00
ShvetsKS
cd3d14ad0e
Add float32 histogram (#5624)
* new single_precision_histogram param was added.

Co-authored-by: SHVETS, KIRILL <kirill.shvets@intel.com>
Co-authored-by: fis <jm.yuan@outlook.com>
2020-06-03 11:24:53 +08:00
Jiaming Yuan
e49607af19
Add Python binding for rabit ops. (#5743) 2020-06-02 19:47:23 +08:00
Jiaming Yuan
e533908922
Expose device sketching in header. (#5747) 2020-06-02 13:02:53 +08:00
Jiaming Yuan
9e1b29944e
Fix loading old model. (#5724)
* Add test.
2020-05-31 14:55:32 +08:00