xgboost

Author	SHA1	Message	Date
Rory Mitchell	78bea0d204	Add google test for a column sampling, restore metainfo tests (#3637 ) * Add google test for a column sampling, restore metainfo tests * Update metainfo test for visual studio * Fix multi-GPU bug introduced in #3635	2018-08-28 16:10:26 +12:00
trivialfis	60787ecebc	Merge generic device helper functions into gpu set. (#3626 ) * Remove the use of old NDevices* functions. * Use GPUSet in timer.h.	2018-08-26 18:14:23 +12:00
trivialfis	cf2d86a4f6	Add travis sanitizers tests. (#3557 ) * Add travis sanitizers tests. * Add gcc-7 in Travis. * Add SANITIZER_PATH for CMake. * Enable sanitizer tests in Travis. * Fix memory leaks in tests. * Fix all memory leaks reported by Address Sanitizer. * tests/cpp/helpers.h/CreateDMatrix now returns raw pointer.	2018-08-19 16:40:30 +12:00
trivialfis	2c502784ff	Span class. (#3548 ) * Add basic Span class based on ISO++20. * Use Span<Entry const> instead of Inst in SparsePage. * Add DeviceSpan in HostDeviceVector, use it in regression obj.	2018-08-14 17:58:11 +12:00
Rory Mitchell	bbb771f32e	Refactor parts of fast histogram utilities (#3564 ) * Refactor parts of fast histogram utilities * Removed byte packing from column matrix	2018-08-09 17:59:57 +12:00
Henry Gouk	69454d9487	Implementation of hinge loss for binary classification (#3477 )	2018-08-07 10:06:42 +12:00
Andy Adinets	cc6a5a3666	Added finding quantiles on GPU. (#3393 ) * Added finding quantiles on GPU. - this includes datasets where weights are assigned to data rows - as the quantiles found by the new algorithm are not the same as those found by the old one, test thresholds in tests/python-gpu/test_gpu_updaters.py have been adjusted. * Adjustments and improved testing for finding quantiles on the GPU. - added C++ tests for the DeviceSketch() function - reduced one of the thresholds in test_gpu_updaters.py - adjusted the cuts found by the find_cuts_k kernel	2018-07-27 14:03:16 +12:00
liuliang01	0cf88d036f	Add qid like ranklib format (#2749 ) * add qid for https://github.com/dmlc/xgboost/issues/2748 * change names * change spaces * change qid to bst_uint type * change qid type to size_t * change qid first to SIZE_MAX * change qid type from size_t to uint64_t * update dmlc-core * fix qids name error * fix group_ptr_ error * Style fix * Add qid handling logic to SparsePage * New MetaInfo format + backward compatibility fix Old MetaInfo format (1.0) doesn't contain qid field. We still want to be able to read from MetaInfo files saved in old format. Also, define a new format (2.0) that contains the qid field. This way, we can distinguish files that contain qid and those that do not. * Update MetaInfo test * Simply group assignment logic * Explicitly set qid=nullptr in NativeDataIter NativeDataIter's callback does not support qid field. Users of NativeDataIter will need to call setGroup() function separately to set group information. * Save qids_ in SaveBinary() * Upgrade dmlc-core submodule * Add a test for reading qid * Add contributor * Check the size of qids_ * Document qid format	2018-06-30 20:24:03 +00:00
pdesahb	12e34f32e2	Fix tweedie handling of base_score (#3295 ) * fix tweedie margin calculations * add entry to contributors	2018-06-28 15:43:05 +00:00
ngoyal2707	5cd851ccef	added code for instance based weighing for rank objectives (#3379 ) * added code for instance based weighing for rank objectives * Fix lint	2018-06-22 15:10:59 -07:00
PSEUDOTENSOR / Jonathan McKinney	9ac163d0bb	Allow import via python datatable. (#3272 ) * Allow import via python datatable. * Write unit tests * Refactor dt API functions * Refactor python code * Lint fixes * Address review comments	2018-06-20 13:16:18 -07:00
Rory Mitchell	a96039141a	Dmatrix refactor stage 1 (#3301 ) * Use sparse page as singular CSR matrix representation * Simplify dmatrix methods * Reduce statefullness of batch iterators * BREAKING CHANGE: Remove prob_buffer_row parameter. Users are instead recommended to sample their dataset as a preprocessing step before using XGBoost.	2018-06-07 10:25:58 +12:00
Andy Adinets	286dccb8e8	GPU binning and compression. (#3319 ) * GPU binning and compression. - binning and index compression are done inside the DeviceShard constructor - in case of a DMatrix with multiple row batches, it is first converted into a single row batch	2018-06-05 17:15:13 +12:00
trivialfis	34aeee2961	Fix test_param.cc header path (#3317 )	2018-05-28 10:26:29 -07:00
Rory Mitchell	3ee725e3bb	Add cuda forwards compatibility (#3316 )	2018-05-17 10:59:22 +12:00
Rory Mitchell	088bb4b27c	Prevent multiclass Hessian approaching 0 (#3304 ) * Prevent Hessian in multiclass objective becoming zero * Set default learning rate to 0.5 for "coord_descent" linear updater	2018-05-09 20:25:51 +12:00
Rory Mitchell	a185ddfe03	Implement GPU accelerated coordinate descent algorithm (#3178 ) * Implement GPU accelerated coordinate descent algorithm. * Exclude external memory tests for GPU	2018-04-20 14:56:35 +12:00
Rory Mitchell	ccf80703ef	Clang-tidy static analysis (#3222 ) * Clang-tidy static analysis * Modernise checks * Google coding standard checks * Identifier renaming according to Google style	2018-04-19 18:57:13 +12:00
Rory Mitchell	a1ec7b1716	Change reduce operation from thrust to cub. Fix for cuda 9.1 error (#3218 ) * Change reduce operation from thrust to cub. Fix for cuda 9.1 runtime error * Unit test sum reduce	2018-04-04 14:21:48 +12:00
Arjan van der Velde	04221a7469	rank_metric: add AUC-PR (#3172 ) * rank_metric: add AUC-PR Implementation of the AUC-PR calculation for weighted data, proposed by Keilwagen, Grosse and Grau (https://doi.org/10.1371/journal.pone.0092209) * rank_metric: fix lint warnings * Implement tests for AUC-PR and fix implementation * add aucpr to documentation for other languages	2018-03-23 10:43:47 -04:00
Vadim Khotilovich	706be4e5d4	Additional improvements for gblinear (#3134 ) * fix rebase conflict * [core] additional gblinear improvements * [R] callback for gblinear coefficients history * force eta=1 for gblinear python tests * add top_k to GreedyFeatureSelector * set eta=1 in shotgun test * [core] fix SparsePage processing in gblinear; col-wise multithreading in greedy updater * set sorted flag within TryInitColData * gblinear tests: use scale, add external memory test * fix multiclass for greedy updater * fix whitespace * fix typo	2018-03-13 01:27:13 -05:00
Andrew V. Adinetz	d5992dd881	Replaced std::vector-based interfaces with HostDeviceVector-based interfaces. (#3116 ) * Replaced std::vector-based interfaces with HostDeviceVector-based interfaces. - replacement was performed in the learner, boosters, predictors, updaters, and objective functions - only interfaces used in training were replaced; interfaces like PredictInstance() still use std::vector - refactoring necessary for replacement of interfaces was also performed, such as using HostDeviceVector in prediction cache * HostDeviceVector-based interfaces for custom objective function example plugin.	2018-02-28 13:00:04 +13:00
Rory Mitchell	10eb05a63a	Refactor linear modelling and add new coordinate descent updater (#3103 ) * Refactor linear modelling and add new coordinate descent updater * Allow unsorted column iterator * Add prediction cacheing to gblinear	2018-02-17 09:17:01 +13:00
Scott Lundberg	d878c36c84	Add SHAP interaction effects, fix minor bug, and add cox loss (#3043 ) * Add interaction effects and cox loss * Minimize whitespace changes * Cox loss now no longer needs a pre-sorted dataset. * Address code review comments * Remove mem check, rename to pred_interactions, include bias * Make lint happy * More lint fixes * Fix cox loss indexing * Fix main effects and tests * Fix lint * Use half interaction values on the off-diagonals * Fix lint again	2018-02-07 20:38:01 -06:00
Thejaswi	84ab74f3a5	Objective function evaluation on GPU with minimal PCIe transfers (#2935 ) * Added GPU objective function and no-copy interface. - xgboost::HostDeviceVector<T> syncs automatically between host and device - no-copy interfaces have been added - default implementations just sync the data to host and call the implementations with std::vector - GPU objective function, predictor, histogram updater process data directly on GPU	2018-01-12 21:33:39 +13:00
Rory Mitchell	7759ab99ee	Fix Google test warnings and error (#2957 )	2017-12-20 00:13:56 +13:00
Rory Mitchell	c55f14668e	Update gpu_hist algorithm (#2901 )	2017-11-27 13:44:24 +13:00
Rory Mitchell	40c6e2f0c8	Improved gpu_hist_experimental algorithm (#2866 ) - Implement colsampling, subsampling for gpu_hist_experimental - Optimised multi-GPU implementation for gpu_hist_experimental - Make nccl optional - Add Volta architecture flag - Optimise RegLossObj - Add timing utilities for debug verbose mode - Bump required cuda version to 8.0	2017-11-11 13:58:40 +13:00
Rory Mitchell	13e7a2cff0	Various bug fixes (#2825 ) * Fatal error if GPU algorithm selected without GPU support compiled * Resolve type conversion warnings * Fix gpu unit test failure * Fix compressed iterator edge case * Fix python unit test failures due to flake8 update on pip	2017-10-25 14:45:01 +13:00
Scott Lundberg	78c4188cec	SHAP values for feature contributions (#2438 ) * SHAP values for feature contributions * Fix commenting error * New polynomial time SHAP value estimation algorithm * Update API to support SHAP values * Fix merge conflicts with updates in master * Correct submodule hashes * Fix variable sized stack allocation * Make lint happy * Add docs * Fix typo * Adjust tolerances * Remove unneeded def * Fixed cpp test setup * Updated R API and cleaned up * Fixed test typo	2017-10-12 12:35:51 -07:00
Rory Mitchell	4cb2f7598b	-Add experimental GPU algorithm for lossguided mode (#2755 ) -Improved GPU algorithm unit tests -Removed some thrust code to improve compile times	2017-10-01 00:18:35 +13:00
Rory Mitchell	e6a9063344	Integer gradient summation for GPU histogram algorithm. (#2681 )	2017-09-08 15:07:29 +12:00
Rory Mitchell	15267eedf2	[GPU-Plugin] Major refactor 2 (#2664 ) * Change cmake option * Move source files * Move google tests * Move python tests * Move benchmarks * Move documentation * Remove makefile support * Fix test run * Move GPU tests	2017-09-08 09:57:16 +12:00
Rory Mitchell	ef23e424f1	[GPU-Plugin] Add GPU accelerated prediction (#2593 ) * [GPU-Plugin] Add GPU accelerated prediction * Improve allocation message * Update documentation * Resolve linker error for predictor * Add unit tests	2017-08-16 12:31:59 +12:00
PSEUDOTENSOR / Jonathan McKinney	6b375f6ad8	Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation (#2530 ) * Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation from numpy arrays for python interface.	2017-07-21 14:43:17 +12:00
PSEUDOTENSOR / Jonathan McKinney	ca7fc9fda3	[GPU-Plugin] Fix gpu_hist to allow matrices with more than just 2^{32} elements. Also fixed CPU hist algorithm. (#2518 )	2017-07-18 11:19:27 +12:00
Rory Mitchell	530f01e21c	[GPU-Plugin] Add load balancing search to gpu_hist. Add compressed iterator. (#2504 )	2017-07-11 22:36:39 +12:00
Rory Mitchell	e939192978	Cmake improvements (#2487 ) * Cmake improvements * Add google test to cmake	2017-07-06 18:05:11 +12:00
Thejaswi	85b2fb3eee	[GPU-Plugin] Integration of a faster version of grow_gpu plugin into mainstream (#2360 ) * Integrating a faster version of grow_gpu plugin 1. Removed the older files to reduce duplication 2. Moved all of the grow_gpu files under 'exact' folder 3. All of them are inside 'exact' namespace to avoid any conflicts 4. Fixed a bug in benchmark.py while running only 'grow_gpu' plugin 5. Added cub and googletest submodules to ease integration and unit-testing 6. Updates to CMakeLists.txt to directly build cuda objects into libxgboost * Added support for building gpu plugins through make flow 1. updated makefile and config.mk to add right targets 2. added unit-tests for gpu exact plugin code * 1. Added support for building gpu plugin using 'make' flow as well 2. Updated instructions for building and testing gpu plugin * Fix travis-ci errors for PR#2360 1. lint errors on unit-tests 2. removed googletest, instead depended upon dmlc-core provide gtest cache * Some more fixes to travis-ci lint failures PR#2360 * Added Rory's copyrights to the files containing code from both. * updated copyright statement as per Rory's request * moved the static datasets into a script to generate them at runtime * 1. memory usage print when silent=0 2. tests/ and test/ folder organization 3. removal of the dependency of googletest for just building xgboost 4. coding style updates for .cuh as well * Fixes for compilation warnings * add cuda object files as well when JVM_BINDINGS=ON	2017-06-06 09:39:53 +12:00
AbdealiJK	47ba2de7d4	tests/cpp: Add tests for multiclass_metric.cc	2016-12-04 11:25:57 -08:00
AbdealiJK	a7e20555a3	tests/cpp: Add tests for rank_metrics.cc	2016-12-04 11:25:57 -08:00
AbdealiJK	4a2ef130a7	tests/cpp: Add test for elementwise_metric.cc	2016-12-04 11:25:57 -08:00
AbdealiJK	03abd47f49	tests/cpp: Add tests for Metric RMSE	2016-12-04 11:25:57 -08:00
AbdealiJK	582c373274	tests/cpp: Add tests for metric.cc	2016-12-04 11:25:57 -08:00
AbdealiJK	cc859420ba	tests/cpp: Add tests for TweedieRegression	2016-12-04 11:25:57 -08:00
AbdealiJK	fa865564f6	tests/cpp: Add tests for GammaRegression	2016-12-04 11:25:57 -08:00
AbdealiJK	401e4b5220	tests/cpp: Add tests for PoissonRegression	2016-12-04 11:25:57 -08:00
AbdealiJK	d41aab4f61	tests/cpp: Add tests for regression_obj.cc Test the objective functions in regression_obj.cc tests/cpp: Add tests for objective.cc and RegLossObj	2016-12-04 11:25:57 -08:00
AbdealiJK	fd99d39372	tests/cpp: Add tests for SplitEntry	2016-12-04 11:25:57 -08:00
AbdealiJK	62e3468603	tests/cpp: Add tests for param.h	2016-12-04 11:25:57 -08:00

56 Commits