xgboost

Author	SHA1	Message	Date
Yun Ni	30d10ab035	Convert handle == nullptr from SegFault to user-friendly error. (#3021 ) * Convert SegFault to user-friendly error. * Apply the change to DMatrix API as well	2018-06-29 06:30:26 +00:00
pdesahb	12e34f32e2	Fix tweedie handling of base_score (#3295 ) * fix tweedie margin calculations * add entry to contributors	2018-06-28 15:43:05 +00:00
Henry Gouk	64b8cffde3	Refactor of FastHistMaker to allow for custom regularisation methods (#3335 ) * Refactor to allow for custom regularisation methods * Implement compositional SplitEvaluator framework * Fixed segfault when no monotone_constraints are supplied. * Change pid to parentID * test_monotone_constraints.py now passes * Refactor ColMaker and DistColMaker to use SplitEvaluator * Performance optimisation when no monotone_constraints specified * Fix linter messages * Fix a few more linter errors * Update the amalgamation * Add bounds check * Add check for leaf node * Fix linter error in param.h * Fix clang-tidy errors on CI * Fix incorrect function name * Fix clang-tidy error in updater_fast_hist.cc * Enable SSE2 for Win32 R MinGW Addresses https://github.com/dmlc/xgboost/pull/3335#issuecomment-400535752 * Add contributor	2018-06-28 07:37:25 +00:00
ngoyal2707	5cd851ccef	added code for instance based weighing for rank objectives (#3379 ) * added code for instance based weighing for rank objectives * Fix lint	2018-06-22 15:10:59 -07:00
PSEUDOTENSOR / Jonathan McKinney	9ac163d0bb	Allow import via python datatable. (#3272 ) * Allow import via python datatable. * Write unit tests * Refactor dt API functions * Refactor python code * Lint fixes * Address review comments	2018-06-20 13:16:18 -07:00
Thejaswi	0e78034607	Shared memory atomics while building histogram (#3384 ) * Use shared memory atomics for building histograms, whenever possible	2018-06-19 16:03:09 +12:00
Rory Mitchell	a96039141a	Dmatrix refactor stage 1 (#3301 ) * Use sparse page as singular CSR matrix representation * Simplify dmatrix methods * Reduce statefulness of batch iterators * BREAKING CHANGE: Remove prob_buffer_row parameter. Users are instead recommended to sample their dataset as a preprocessing step before using XGBoost.	2018-06-07 10:25:58 +12:00
Andy Adinets	286dccb8e8	GPU binning and compression. (#3319 ) * GPU binning and compression. - binning and index compression are done inside the DeviceShard constructor - in case of a DMatrix with multiple row batches, it is first converted into a single row batch	2018-06-05 17:15:13 +12:00
Philip Hyunsu Cho	bd01acdfbc	Save outputs in high precision in CLI prediction (#3356 ) Currently, `CLIPredict()` saves prediction results in default 6-digit precision which causes precision loss. This PR sets precision to a level so that the conversion back to `bst_float` is lossless. Related: #3298.	2018-06-03 14:15:47 -07:00
Thejaswi	d367e4fc6b	Fix for issue 3306. (#3324 )	2018-05-23 13:42:20 +12:00
Samuel O. Ronsin	cc79a65ab9	Increase precision of bst_float values in tree dumps (#3298 ) * Increase precision of bst_float values in tree dumps * Increase precision of bst_float values in tree dumps * Fix lint error and switch precision to right float variable * Fix clang-tidy error	2018-05-09 14:12:21 -07:00
Rory Mitchell	088bb4b27c	Prevent multiclass Hessian approaching 0 (#3304 ) * Prevent Hessian in multiclass objective becoming zero * Set default learning rate to 0.5 for "coord_descent" linear updater	2018-05-09 20:25:51 +12:00
Andrew V. Adinetz	b8a0d66fe6	Multi-GPU HostDeviceVector. (#3287 ) * Multi-GPU HostDeviceVector. - HostDeviceVector instances can now span multiple devices, defined by GPUSet struct - the interface of HostDeviceVector has been modified accordingly - GPU objective functions are now multi-GPU - GPU predicting from cache is now multi-GPU - avoiding omp_set_num_threads() calls - other minor changes	2018-05-05 08:00:05 +12:00
Thejaswi	c80d51ccb3	Fix issue #3264 , accuracy issues on k80 GPUs. (#3293 )	2018-05-04 13:14:08 +12:00
Rory Mitchell	a185ddfe03	Implement GPU accelerated coordinate descent algorithm (#3178 ) * Implement GPU accelerated coordinate descent algorithm. * Exclude external memory tests for GPU	2018-04-20 14:56:35 +12:00
Rory Mitchell	ccf80703ef	Clang-tidy static analysis (#3222 ) * Clang-tidy static analysis * Modernise checks * Google coding standard checks * Identifier renaming according to Google style	2018-04-19 18:57:13 +12:00
Rory Mitchell	443ff746e9	Fix logic in GPU predictor cache lookup (#3217 ) * Fix logic in GPU predictor cache lookup * Add sklearn test for GPU prediction	2018-04-04 15:08:22 +12:00
Rory Mitchell	a1ec7b1716	Change reduce operation from thrust to cub. Fix for cuda 9.1 error (#3218 ) * Change reduce operation from thrust to cub. Fix for cuda 9.1 runtime error * Unit test sum reduce	2018-04-04 14:21:48 +12:00
Arjan van der Velde	04221a7469	rank_metric: add AUC-PR (#3172 ) * rank_metric: add AUC-PR Implementation of the AUC-PR calculation for weighted data, proposed by Keilwagen, Grosse and Grau (https://doi.org/10.1371/journal.pone.0092209) * rank_metric: fix lint warnings * Implement tests for AUC-PR and fix implementation * add aucpr to documentation for other languages	2018-03-23 10:43:47 -04:00
Will Storey	00d9728e4b	Fix memory leak in XGDMatrixCreateFromMat_omp() (#3182 ) * Fix memory leak in XGDMatrixCreateFromMat_omp() This replaces the array allocated by new with a std::vector. Fixes #3161	2018-03-18 15:03:27 +13:00
Rory Mitchell	9fa45d3a9c	Fix bug with gpu_predictor caching behaviour (#3177 ) * Fixes #3162	2018-03-18 10:35:10 +13:00
Ray Kim	cdc036b752	Fixed performance bug (#3171 ) Minor performance improvements to gpu predictor	2018-03-15 09:40:24 +13:00
Rory Mitchell	7a81c87dfa	Fix incorrect minimum value in quantile generation (#3167 )	2018-03-14 08:21:18 -07:00
Vadim Khotilovich	706be4e5d4	Additional improvements for gblinear (#3134 ) * fix rebase conflict * [core] additional gblinear improvements * [R] callback for gblinear coefficients history * force eta=1 for gblinear python tests * add top_k to GreedyFeatureSelector * set eta=1 in shotgun test * [core] fix SparsePage processing in gblinear; col-wise multithreading in greedy updater * set sorted flag within TryInitColData * gblinear tests: use scale, add external memory test * fix multiclass for greedy updater * fix whitespace * fix typo	2018-03-13 01:27:13 -05:00
Andrew V. Adinetz	a1b48afa41	Added back UpdatePredictionCache() in updater_gpu_hist.cu. (#3120 ) * Added back UpdatePredictionCache() in updater_gpu_hist.cu. - it had been there before, but wasn't ported to the new version of updater_gpu_hist.cu	2018-03-09 15:06:45 +13:00
redditur	d5f1b74ef5	'hist': Monotonic Constraints (#3085 ) * Extended monotonic constraints support to 'hist' tree method. * Added monotonic constraints tests. * Fix the signature of NoConstraint::CalcSplitGain() * Document monotonic constraint support in 'hist' * Update signature of Update to account for latest refactor	2018-03-05 16:45:49 -08:00
Andrew V. Adinetz	d5992dd881	Replaced std::vector-based interfaces with HostDeviceVector-based interfaces. (#3116 ) * Replaced std::vector-based interfaces with HostDeviceVector-based interfaces. - replacement was performed in the learner, boosters, predictors, updaters, and objective functions - only interfaces used in training were replaced; interfaces like PredictInstance() still use std::vector - refactoring necessary for replacement of interfaces was also performed, such as using HostDeviceVector in prediction cache * HostDeviceVector-based interfaces for custom objective function example plugin.	2018-02-28 13:00:04 +13:00
Rory Mitchell	dd82b28e20	Update GPU code with dmatrix changes (#3117 )	2018-02-17 12:11:48 +13:00
Rory Mitchell	10eb05a63a	Refactor linear modelling and add new coordinate descent updater (#3103 ) * Refactor linear modelling and add new coordinate descent updater * Allow unsorted column iterator * Add prediction caching to gblinear	2018-02-17 09:17:01 +13:00
Vadim Khotilovich	9ffe8596f2	[core] fix slow predict-caching with many classes (#3109 ) * fix prediction caching inefficiency for multiclass * silence some warnings * redundant if * workaround for R v3.4.3 bug; fixes #3081	2018-02-15 18:31:42 -06:00
Abraham Zhan	874525c152	c_api.cc variable declared inappropriately (#3044 ) In line 461, the "size_t offset = 0;" should be declared before any calculation; otherwise it will cause a compilation error. ``` I:\Libraries\xgboost\src\c_api\c_api.cc(416): error C2146: Missing ";" before "offset" [I:\Libraries\xgboost\build\objxgboost.vcxproj] ```	2018-02-09 01:32:01 -08:00
Scott Lundberg	d878c36c84	Add SHAP interaction effects, fix minor bug, and add cox loss (#3043 ) * Add interaction effects and cox loss * Minimize whitespace changes * Cox loss now no longer needs a pre-sorted dataset. * Address code review comments * Remove mem check, rename to pred_interactions, include bias * Make lint happy * More lint fixes * Fix cox loss indexing * Fix main effects and tests * Fix lint * Use half interaction values on the off-diagonals * Fix lint again	2018-02-07 20:38:01 -06:00
Vadim Khotilovich	94e655329f	Replacing cout with LOG (#3076 ) * change cout to LOG * lint fix	2018-02-06 02:00:34 -06:00
Andrew V. Adinetz	24c2e41287	Fixed the bug with illegal memory access in test_large_sizes.py with 4 GPUs. (#3068 ) - thrust::copy() called from dvec::copy() for gpairs invoked a GPU kernel instead of cudaMemcpy() - this resulted in illegal memory access if the GPU running the kernel could not access the data being copied - new version of dvec::copy() for thrust::device_ptr iterators calls cudaMemcpy(), avoiding the problem.	2018-02-01 16:54:46 +13:00
Rory Mitchell	f87802f00c	Fix GPU bugs (#3051 ) * Change uint to unsigned int * Fix no root predictions bug * Remove redundant splitting due to numerical instability	2018-01-23 13:14:15 +13:00
Thejaswi	84ab74f3a5	Objective function evaluation on GPU with minimal PCIe transfers (#2935 ) * Added GPU objective function and no-copy interface. - xgboost::HostDeviceVector<T> syncs automatically between host and device - no-copy interfaces have been added - default implementations just sync the data to host and call the implementations with std::vector - GPU objective function, predictor, histogram updater process data directly on GPU	2018-01-12 21:33:39 +13:00
PSEUDOTENSOR / Jonathan McKinney	4d36036fe6	Avoid repeated cuda API call in GPU predictor and only synchronize used GPUs (#2936 )	2017-12-09 16:00:42 +13:00
Rory Mitchell	1b77903eeb	Fix several GPU bugs (#2916 ) * Fix #2905 * Fix gpu_exact test failures * Fix bug in GPU prediction where multiple calls to batch prediction can produce incorrect results * Fix GPU documentation formatting	2017-12-04 08:27:49 +13:00
Rory Mitchell	c51adb49b6	Monotone constraints for gpu_hist (#2904 )	2017-11-30 10:26:19 +13:00
EvanChong	790da458e7	Sync number of features after loaded matrix in different workers. (#2722 )	2017-11-29 11:19:12 -08:00
Rory Mitchell	c55f14668e	Update gpu_hist algorithm (#2901 )	2017-11-27 13:44:24 +13:00
Rory Mitchell	24f527a1c0	AVX gradients (#2878 ) * AVX gradients * Add google test for AVX * Create fallback implementation, remove fma instruction * Improved accuracy of AVX exp function	2017-11-27 08:56:01 +13:00
Rory Mitchell	40c6e2f0c8	Improved gpu_hist_experimental algorithm (#2866 ) - Implement colsampling, subsampling for gpu_hist_experimental - Optimised multi-GPU implementation for gpu_hist_experimental - Make nccl optional - Add Volta architecture flag - Optimise RegLossObj - Add timing utilities for debug verbose mode - Bump required cuda version to 8.0	2017-11-11 13:58:40 +13:00
Rory Mitchell	d9d5293cdb	Add warnings for large labels when using GPU histogram algorithms (#2834 )	2017-10-26 17:31:10 +13:00
Rory Mitchell	13e7a2cff0	Various bug fixes (#2825 ) * Fatal error if GPU algorithm selected without GPU support compiled * Resolve type conversion warnings * Fix gpu unit test failure * Fix compressed iterator edge case * Fix python unit test failures due to flake8 update on pip	2017-10-25 14:45:01 +13:00
Philip Cho	452063c32d	Fix issue #2800 (#2817 ) Problem: Fast histogram updater crashes whenever subsampling picks zero rows Diagnosis: Row set data structure uses "nullptr" internally to indicate a non-existent row set. Since you cannot take the address of the first element of an empty vector, a valid row set ends up getting "nullptr" as well. Fix: Use an arbitrary value (not equal to "nullptr") to bypass nullptr check.	2017-10-23 10:46:25 -05:00
Qiang Luo	c09ad421a8	fix bug in loading config for pred task (#2795 )	2017-10-20 00:10:14 -05:00
Scott Lundberg	78c4188cec	SHAP values for feature contributions (#2438 ) * SHAP values for feature contributions * Fix commenting error * New polynomial time SHAP value estimation algorithm * Update API to support SHAP values * Fix merge conflicts with updates in master * Correct submodule hashes * Fix variable sized stack allocation * Make lint happy * Add docs * Fix typo * Adjust tolerances * Remove unneeded def * Fixed cpp test setup * Updated R API and cleaned up * Fixed test typo	2017-10-12 12:35:51 -07:00
Rory Mitchell	4cb2f7598b	-Add experimental GPU algorithm for lossguided mode (#2755 ) -Improved GPU algorithm unit tests -Removed some thrust code to improve compile times	2017-10-01 00:18:35 +13:00
Vadim Khotilovich	74db9757b3	[R package] GPU support (#2732 ) * [R] MSVC compatibility * [GPU] allow seed in BernoulliRng up to size_t and scale to uint32_t * R package build with cmake and CUDA * R package CUDA build fixes and cleanups * always export the R package native initialization routine on windows * update the install instructions doc * fix lint * use static_cast directly to set BernoulliRng seed * [R] demo for GPU accelerated algorithm * tidy up the R package cmake stuff * R pack cmake: installs main dependency packages if needed * [R] version bump in DESCRIPTION * update NEWS * added short missing/sparse values explanations to FAQ	2017-09-28 18:15:28 -05:00

577 Commits