511 Commits

Author SHA1 Message Date
Rory Mitchell
443ff746e9
Fix logic in GPU predictor cache lookup (#3217)
* Fix logic in GPU predictor cache lookup

* Add sklearn test for GPU prediction
2018-04-04 15:08:22 +12:00
Rory Mitchell
a1ec7b1716
Change reduce operation from thrust to cub. Fix for cuda 9.1 error (#3218)
* Change reduce operation from thrust to cub. Fix for cuda 9.1 runtime error

* Unit test sum reduce
2018-04-04 14:21:48 +12:00
Arjan van der Velde
04221a7469 rank_metric: add AUC-PR (#3172)
* rank_metric: add AUC-PR

Implementation of the AUC-PR calculation for weighted data, proposed by Keilwagen, Grosse and Grau (https://doi.org/10.1371/journal.pone.0092209)

* rank_metric: fix lint warnings

* Implement tests for AUC-PR and fix implementation

* add aucpr to documentation for other languages
2018-03-23 10:43:47 -04:00
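For readers unfamiliar with the metric added in #3172, here is a minimal sketch of a weighted area under the precision-recall curve. It is not the Keilwagen, Grosse and Grau interpolation that the commit cites, and it is not XGBoost's code: it uses a simpler step-wise sum, and the function name `weighted_aucpr` and its signature are hypothetical. It assumes binary labels (0 or 1) and strictly positive weights.

```python
def weighted_aucpr(labels, scores, weights=None):
    """Step-wise area under the precision-recall curve for weighted data.

    Illustrative only: a plain rectangle sum, not the interpolation
    described by Keilwagen, Grosse and Grau.
    """
    if weights is None:
        weights = [1.0] * len(labels)
    total_pos = sum(w for y, w in zip(labels, weights) if y == 1)
    if total_pos == 0:
        raise ValueError("at least one positive instance is required")

    # Visit instances from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0.0          # weighted true/false positives so far
    prev_recall = 0.0
    area = 0.0
    i = 0
    while i < len(order):
        # Instances with tied scores enter the curve together.
        j = i
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            k = order[j]
            if labels[k] == 1:
                tp += weights[k]
            else:
                fp += weights[k]
            j += 1
        recall = tp / total_pos
        precision = tp / (tp + fp)
        area += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return area
```

A perfect ranking gives an area of 1.0, and multiplying every weight by the same constant leaves the result unchanged, because both precision and recall are ratios of weighted counts.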
Will Storey
00d9728e4b Fix memory leak in XGDMatrixCreateFromMat_omp() (#3182)
* Fix memory leak in XGDMatrixCreateFromMat_omp()

This replaces the array allocated by new with a std::vector.

Fixes #3161
2018-03-18 15:03:27 +13:00
Rory Mitchell
9fa45d3a9c
Fix bug with gpu_predictor caching behaviour (#3177)
* Fixes #3162
2018-03-18 10:35:10 +13:00
Ray Kim
cdc036b752 Fixed performance bug (#3171)
Minor performance improvements to gpu predictor
2018-03-15 09:40:24 +13:00
Rory Mitchell
7a81c87dfa Fix incorrect minimum value in quantile generation (#3167) 2018-03-14 08:21:18 -07:00
Vadim Khotilovich
706be4e5d4
Additional improvements for gblinear (#3134)
* fix rebase conflict

* [core] additional gblinear improvements

* [R] callback for gblinear coefficients history

* force eta=1 for gblinear python tests

* add top_k to GreedyFeatureSelector

* set eta=1 in shotgun test

* [core] fix SparsePage processing in gblinear; col-wise multithreading in greedy updater

* set sorted flag within TryInitColData

* gblinear tests: use scale, add external memory test

* fix multiclass for greedy updater

* fix whitespace

* fix typo
2018-03-13 01:27:13 -05:00
Andrew V. Adinetz
a1b48afa41 Added back UpdatePredictionCache() in updater_gpu_hist.cu. (#3120)
* Added back UpdatePredictionCache() in updater_gpu_hist.cu.

- it had been there before, but wasn't ported to the new version
  of updater_gpu_hist.cu
2018-03-09 15:06:45 +13:00
redditur
d5f1b74ef5 'hist': Monotonic Constraints (#3085)
* Extended monotonic constraints support to 'hist' tree method.

* Added monotonic constraints tests.

* Fix the signature of NoConstraint::CalcSplitGain()

* Document monotonic constraint support in 'hist'

* Update signature of Update to account for latest refactor
2018-03-05 16:45:49 -08:00
Andrew V. Adinetz
d5992dd881 Replaced std::vector-based interfaces with HostDeviceVector-based interfaces. (#3116)
* Replaced std::vector-based interfaces with HostDeviceVector-based interfaces.

- replacement was performed in the learner, boosters, predictors,
  updaters, and objective functions
- only interfaces used in training were replaced;
  interfaces like PredictInstance() still use std::vector
- refactoring necessary for replacement of interfaces was also performed,
  such as using HostDeviceVector in prediction cache

* HostDeviceVector-based interfaces for custom objective function example plugin.
2018-02-28 13:00:04 +13:00
Rory Mitchell
dd82b28e20
Update GPU code with dmatrix changes (#3117) 2018-02-17 12:11:48 +13:00
Rory Mitchell
10eb05a63a
Refactor linear modelling and add new coordinate descent updater (#3103)
* Refactor linear modelling and add new coordinate descent updater

* Allow unsorted column iterator

* Add prediction caching to gblinear
2018-02-17 09:17:01 +13:00
Vadim Khotilovich
9ffe8596f2
[core] fix slow predict-caching with many classes (#3109)
* fix prediction caching inefficiency for multiclass

* silence some warnings

* redundant if

* workaround for R v3.4.3 bug; fixes #3081
2018-02-15 18:31:42 -06:00
Abraham Zhan
874525c152 c_api.cc: variable declared inappropriately (#3044)
In line 461, "size_t offset = 0;" should be declared before any calculation; otherwise it causes a compilation error.

```
I:\Libraries\xgboost\src\c_api\c_api.cc(416): error C2146: Missing ";" before "offset" [I:\Libraries\xgboost\build\objxgboost.vcxproj]
```
2018-02-09 01:32:01 -08:00
Scott Lundberg
d878c36c84 Add SHAP interaction effects, fix minor bug, and add cox loss (#3043)
* Add interaction effects and cox loss

* Minimize whitespace changes

* Cox loss now no longer needs a pre-sorted dataset.

* Address code review comments

* Remove mem check, rename to pred_interactions, include bias

* Make lint happy

* More lint fixes

* Fix cox loss indexing

* Fix main effects and tests

* Fix lint

* Use half interaction values on the off-diagonals

* Fix lint again
2018-02-07 20:38:01 -06:00
Vadim Khotilovich
94e655329f
Replacing cout with LOG (#3076)
* change cout to LOG

* lint fix
2018-02-06 02:00:34 -06:00
Andrew V. Adinetz
24c2e41287 Fixed the bug with illegal memory access in test_large_sizes.py with 4 GPUs. (#3068)
- thrust::copy() called from dvec::copy() for gpairs invoked a GPU kernel instead of
  cudaMemcpy()
- this resulted in illegal memory access if the GPU running the kernel could not access
  the data being copied
- new version of dvec::copy() for thrust::device_ptr iterators calls cudaMemcpy(),
  avoiding the problem.
2018-02-01 16:54:46 +13:00
Rory Mitchell
f87802f00c
Fix GPU bugs (#3051)
* Change uint to unsigned int

* Fix no root predictions bug

* Remove redundant splitting due to numerical instability
2018-01-23 13:14:15 +13:00
Thejaswi
84ab74f3a5 Objective function evaluation on GPU with minimal PCIe transfers (#2935)
* Added GPU objective function and no-copy interface.

- xgboost::HostDeviceVector<T> syncs automatically between host and device
- no-copy interfaces have been added
- default implementations just sync the data to host
  and call the implementations with std::vector
- GPU objective function, predictor, histogram updater process data
  directly on GPU
2018-01-12 21:33:39 +13:00
PSEUDOTENSOR / Jonathan McKinney
4d36036fe6 Avoid repeated cuda API call in GPU predictor and only synchronize used GPUs (#2936) 2017-12-09 16:00:42 +13:00
Rory Mitchell
1b77903eeb
Fix several GPU bugs (#2916)
* Fix #2905

* Fix gpu_exact test failures

* Fix bug in GPU prediction where multiple calls to batch prediction can produce incorrect results

* Fix GPU documentation formatting
2017-12-04 08:27:49 +13:00
Rory Mitchell
c51adb49b6
Monotone constraints for gpu_hist (#2904) 2017-11-30 10:26:19 +13:00
EvanChong
790da458e7 Sync number of features after loaded matrix in different workers. (#2722) 2017-11-29 11:19:12 -08:00
Rory Mitchell
c55f14668e
Update gpu_hist algorithm (#2901) 2017-11-27 13:44:24 +13:00
Rory Mitchell
24f527a1c0
AVX gradients (#2878)
* AVX gradients

* Add google test for AVX

* Create fallback implementation, remove fma instruction

* Improved accuracy of AVX exp function
2017-11-27 08:56:01 +13:00
Rory Mitchell
40c6e2f0c8
Improved gpu_hist_experimental algorithm (#2866)
- Implement colsampling, subsampling for gpu_hist_experimental

 - Optimised multi-GPU implementation for gpu_hist_experimental

 - Make nccl optional

 - Add Volta architecture flag

 - Optimise RegLossObj

 - Add timing utilities for debug verbose mode

 - Bump required cuda version to 8.0
2017-11-11 13:58:40 +13:00
Rory Mitchell
d9d5293cdb Add warnings for large labels when using GPU histogram algorithms (#2834) 2017-10-26 17:31:10 +13:00
Rory Mitchell
13e7a2cff0 Various bug fixes (#2825)
* Fatal error if GPU algorithm selected without GPU support compiled

* Resolve type conversion warnings

* Fix gpu unit test failure

* Fix compressed iterator edge case

* Fix python unit test failures due to flake8 update on pip
2017-10-25 14:45:01 +13:00
Philip Cho
452063c32d Fix issue #2800 (#2817)
Problem:
Fast histogram updater crashes whenever subsampling picks zero rows

Diagnosis:
Row set data structure uses "nullptr" internally to indicate a non-existent
row set. Since you cannot take the address of the first element of an empty
vector, a valid row set ends up getting "nullptr" as well.

Fix:
Use an arbitrary value (not equal to "nullptr") to bypass nullptr check.
2017-10-23 10:46:25 -05:00
Qiang Luo
c09ad421a8 fix bug in loading config for pred task (#2795) 2017-10-20 00:10:14 -05:00
Scott Lundberg
78c4188cec SHAP values for feature contributions (#2438)
* SHAP values for feature contributions

* Fix commenting error

* New polynomial time SHAP value estimation algorithm

* Update API to support SHAP values

* Fix merge conflicts with updates in master

* Correct submodule hashes

* Fix variable sized stack allocation

* Make lint happy

* Add docs

* Fix typo

* Adjust tolerances

* Remove unneeded def

* Fixed cpp test setup

* Updated R API and cleaned up

* Fixed test typo
2017-10-12 12:35:51 -07:00
Rory Mitchell
4cb2f7598b -Add experimental GPU algorithm for lossguided mode (#2755)
-Improved GPU algorithm unit tests
-Removed some thrust code to improve compile times
2017-10-01 00:18:35 +13:00
Vadim Khotilovich
74db9757b3 [R package] GPU support (#2732)
* [R] MSVC compatibility

* [GPU] allow seed in BernoulliRng up to size_t and scale to uint32_t

* R package build with cmake and CUDA

* R package CUDA build fixes and cleanups

* always export the R package native initialization routine on windows

* update the install instructions doc

* fix lint

* use static_cast directly to set BernoulliRng seed

* [R] demo for GPU accelerated algorithm

* tidy up the R package cmake stuff

* R pack cmake: installs main dependency packages if needed

* [R] version bump in DESCRIPTION

* update NEWS

* added short missing/sparse values explanations to FAQ
2017-09-28 18:15:28 -05:00
Rory Mitchell
e6a9063344 Integer gradient summation for GPU histogram algorithm. (#2681) 2017-09-08 15:07:29 +12:00
Rory Mitchell
15267eedf2 [GPU-Plugin] Major refactor 2 (#2664)
* Change cmake option

* Move source files

* Move google tests

* Move python tests

* Move benchmarks

* Move documentation

* Remove makefile support

* Fix test run

* Move GPU tests
2017-09-08 09:57:16 +12:00
Rory Mitchell
19a53814ce [GPU-Plugin] Major refactor (#2644)
* Removal of redundant code/files.
* Removal of exact namespace in GPU plugin
* Revert double precision histograms to single precision for performance on Maxwell/Kepler
2017-08-30 10:53:52 +12:00
Rory Mitchell
ef23e424f1 [GPU-Plugin] Add GPU accelerated prediction (#2593)
* [GPU-Plugin] Add GPU accelerated prediction

* Improve allocation message

* Update documentation

* Resolve linker error for predictor

* Add unit tests
2017-08-16 12:31:59 +12:00
Vadim Khotilovich
2b3a4318c5 Several fixes (#2572)
* repaired serialization after update process; fixes #2545

* non-stratified folds in python could omit some data instances

* Makefile: fixes for older makes on windows; clean R-package too

* make cub a shallow submodule

* improve $(MAKE) recovery
2017-08-06 13:03:50 -05:00
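The entry above (#2572) mentions that non-stratified folds in Python could omit some data instances, without saying how. One common way this happens, assumed here purely for illustration, is sizing every fold as `n_rows // n_folds` and dropping the remainder. The sketch below shows a split that assigns every row to exactly one fold; `make_folds` is a hypothetical helper, not the function XGBoost uses.

```python
import random


def make_folds(n_rows, n_folds, seed=0):
    """Split row indices into n_folds disjoint folds that cover every row."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Striding spreads the remainder: the first n_rows % n_folds folds
    # receive one extra row, so no index is left out.
    return [indices[k::n_folds] for k in range(n_folds)]
```

With 10 rows and 3 folds this produces fold sizes of 4, 3 and 3, whereas three folds of `10 // 3` rows each would silently drop one row.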
Rory Mitchell
eda9e180f0 [GPU-Plugin] Various fixes (#2579)
* Fix test large

* Add check for max_depth 0

* Update readme

* Add LBS specialisation for dense data

* Add bst_gpair_precise

* Temporarily disable accuracy tests on test_large.py

* Solve unused variable compiler warning

* Fix max_bin > 1024 error
2017-08-05 22:16:23 +12:00
Rory Mitchell
0e06d1805d [WIP] Extract prediction into separate interface (#2531)
* [WIP] Extract prediction into separate interface

* Add copyright, fix linter errors

* Add predictor to amalgamation

* Fix documentation

* Move prediction cache into predictor, add GBTreeModel

* Updated predictor doc comments
2017-07-28 17:01:03 -07:00
Vadim Khotilovich
00eda28b3c MinGW: shared library prefix and appveyor CI (#2539)
* for MinGW, drop the 'lib' prefix from shared library name

* fix defines for 'g++ 4.8 or higher' to include g++ >= 5

* fix compile warnings

* [Appveyor] add MinGW with python; remove redundant jobs

* [Appveyor] also do python build for one of msvc jobs
2017-07-25 01:06:47 -05:00
PSEUDOTENSOR / Jonathan McKinney
6b375f6ad8 Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation (#2530)
* Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation from numpy arrays for python interface.
2017-07-21 14:43:17 +12:00
PSEUDOTENSOR / Jonathan McKinney
ca7fc9fda3 [GPU-Plugin] Fix gpu_hist to allow matrices with more than just 2^{32} elements. Also fixed CPU hist algorithm. (#2518) 2017-07-18 11:19:27 +12:00
Rory Mitchell
530f01e21c [GPU-Plugin] Add load balancing search to gpu_hist. Add compressed iterator. (#2504) 2017-07-11 22:36:39 +12:00
Philip Cho
64c8f6fa6d Use old parallel algorithm for histogram construction by default (#2501)
It has been reported that the new parallel algorithm (#2493) results in excessive
message usage (see issue #2326). Until the issues are resolved, XGBoost should use
the old parallel algorithm by default. Users have to specify
`enable_feature_grouping=1` manually to enable the new algorithm.
2017-07-10 09:35:48 -07:00
Vadim Khotilovich
7350085955 Fix broken make on windows (#2499)
* fix Makefile for make on windows

* clean up compilation warnings

* fix for `no file name for include` make warning
2017-07-08 09:17:31 -07:00
Philip Cho
ba820847f9 Patch to improve multithreaded performance scaling (#2493)
* Patch to improve multithreaded performance scaling

Change the parallel strategy for histogram construction.
Instead of partitioning data rows among multiple threads, partition feature
columns. Useful heuristics for assigning partitions have been adopted
from the LightGBM project.

* Add missing header to satisfy MSVC

* Restore max_bin and related parameters to TrainParam

* Fix lint error

* inline functions do not require static keyword

* Feature grouping algorithm accepting FastHistParam

The feature grouping algorithm accepts many parameters (3+), and it gets annoying to
pass them one by one. Instead, simply pass a reference to FastHistParam. The
definition of FastHistParam has been moved to a separate header file to
accommodate this change.
2017-07-07 08:25:07 -07:00
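The change of parallel strategy described in #2493 (partition feature columns among threads instead of data rows) can be sketched as follows. This is an illustration of the idea only, not XGBoost's C++ implementation: the function names are hypothetical, the input is assumed to be pre-binned column by column, and the LightGBM-derived heuristics for grouping features are not reproduced. Python threads are used just to show the ownership pattern, not to demonstrate a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def build_feature_histogram(bin_column, grads, n_bins):
    """Sum gradients per bin for a single feature column."""
    hist = [0.0] * n_bins
    for bin_index, grad in zip(bin_column, grads):
        hist[bin_index] += grad
    return hist


def build_histograms_by_feature(binned_columns, grads, n_bins, n_threads=2):
    """Build one histogram per feature, with each task owning a whole column.

    Because a task writes only to the histogram of its own feature, no two
    workers update the same bins and no per-thread partial histograms have
    to be merged afterwards.
    """
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(
            pool.map(
                lambda column: build_feature_histogram(column, grads, n_bins),
                binned_columns,
            )
        )
```

By contrast, a row-partitioned scheme lets every thread touch every feature's histogram, which requires either synchronization or a reduction step over per-thread copies.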
Rory Mitchell
5f1b0bb386 [GPU-Plugin] Unify gpu_gpair/bst_gpair. Refactor. (#2477) 2017-07-01 17:31:13 +12:00
PSEUDOTENSOR / Jonathan McKinney
6b287177c8 [GPU-Plugin] Multi-GPU gpu_id bug fixes for grow_gpu_hist and grow_gpu methods, and additional documentation for the gpu plugin. (#2463) 2017-06-30 20:04:17 +12:00