xgboost

Author	SHA1	Message	Date
Rory Mitchell	a96039141a	Dmatrix refactor stage 1 (#3301 ) * Use sparse page as singular CSR matrix representation * Simplify dmatrix methods * Reduce statefulness of batch iterators * BREAKING CHANGE: Remove prob_buffer_row parameter. Users are instead recommended to sample their dataset as a preprocessing step before using XGBoost.	2018-06-07 10:25:58 +12:00
Andy Adinets	286dccb8e8	GPU binning and compression. (#3319 ) * GPU binning and compression. - binning and index compression are done inside the DeviceShard constructor - in case of a DMatrix with multiple row batches, it is first converted into a single row batch	2018-06-05 17:15:13 +12:00
trivialfis	34aeee2961	Fix test_param.cc header path (#3317 )	2018-05-28 10:26:29 -07:00
pdavalo	480e3fd764	Sklearn: validation set weights (#2354 ) * Add option to use weights when evaluating metrics in validation sets * Add test for validation-set weights functionality * simplify case with no weights for test sets * fix lint issues	2018-05-23 17:06:20 -07:00
Rory Mitchell	3ee725e3bb	Add cuda forwards compatibility (#3316 )	2018-05-17 10:59:22 +12:00
Rory Mitchell	f8b7686719	Add cuda 8/9.1 centos 6 builds, test GPU wheel on CPU only container. (#3309 ) * Add cuda 8/9.1 centos 6 builds, test GPU wheel on CPU only container. * Add Google test	2018-05-17 10:57:01 +12:00
Philip Hyunsu Cho	9a8211f668	Update dmlc-core submodule (#3221 ) * Update dmlc-core submodule * Fix dense_parser to work with the latest dmlc-core * Specify location of Google Test * Add more source files in dmlc-minimum to get latest dmlc-core working * Update dmlc-core submodule	2018-05-09 18:55:29 -07:00
Rory Mitchell	088bb4b27c	Prevent multiclass Hessian approaching 0 (#3304 ) * Prevent Hessian in multiclass objective becoming zero * Set default learning rate to 0.5 for "coord_descent" linear updater	2018-05-09 20:25:51 +12:00
Rory Mitchell	90a5c4db9d	Update Jenkins CI for GPU (#3294 )	2018-05-04 16:50:59 +12:00
Rory Mitchell	a185ddfe03	Implement GPU accelerated coordinate descent algorithm (#3178 ) * Implement GPU accelerated coordinate descent algorithm. * Exclude external memory tests for GPU	2018-04-20 14:56:35 +12:00
Rory Mitchell	ccf80703ef	Clang-tidy static analysis (#3222 ) * Clang-tidy static analysis * Modernise checks * Google coding standard checks * Identifier renaming according to Google style	2018-04-19 18:57:13 +12:00
Rory Mitchell	443ff746e9	Fix logic in GPU predictor cache lookup (#3217 ) * Fix logic in GPU predictor cache lookup * Add sklearn test for GPU prediction	2018-04-04 15:08:22 +12:00
Rory Mitchell	a1ec7b1716	Change reduce operation from thrust to cub. Fix for cuda 9.1 error (#3218 ) * Change reduce operation from thrust to cub. Fix for cuda 9.1 runtime error * Unit test sum reduce	2018-04-04 14:21:48 +12:00
Arjan van der Velde	04221a7469	rank_metric: add AUC-PR (#3172 ) * rank_metric: add AUC-PR Implementation of the AUC-PR calculation for weighted data, proposed by Keilwagen, Grosse and Grau (https://doi.org/10.1371/journal.pone.0092209) * rank_metric: fix lint warnings * Implement tests for AUC-PR and fix implementation * add aucpr to documentation for other languages	2018-03-23 10:43:47 -04:00
Rory Mitchell	9fa45d3a9c	Fix bug with gpu_predictor caching behaviour (#3177 ) * Fixes #3162	2018-03-18 10:35:10 +13:00
Vadim Khotilovich	706be4e5d4	Additional improvements for gblinear (#3134 ) * fix rebase conflict * [core] additional gblinear improvements * [R] callback for gblinear coefficients history * force eta=1 for gblinear python tests * add top_k to GreedyFeatureSelector * set eta=1 in shotgun test * [core] fix SparsePage processing in gblinear; col-wise multithreading in greedy updater * set sorted flag within TryInitColData * gblinear tests: use scale, add external memory test * fix multiclass for greedy updater * fix whitespace * fix typo	2018-03-13 01:27:13 -05:00
redditur	d5f1b74ef5	'hist': Monotonic Constraints (#3085 ) * Extended monotonic constraints support to 'hist' tree method. * Added monotonic constraints tests. * Fix the signature of NoConstraint::CalcSplitGain() * Document monotonic constraint support in 'hist' * Update signature of Update to account for latest refactor	2018-03-05 16:45:49 -08:00
Andrew V. Adinetz	d5992dd881	Replaced std::vector-based interfaces with HostDeviceVector-based interfaces. (#3116 ) * Replaced std::vector-based interfaces with HostDeviceVector-based interfaces. - replacement was performed in the learner, boosters, predictors, updaters, and objective functions - only interfaces used in training were replaced; interfaces like PredictInstance() still use std::vector - refactoring necessary for replacement of interfaces was also performed, such as using HostDeviceVector in prediction cache * HostDeviceVector-based interfaces for custom objective function example plugin.	2018-02-28 13:00:04 +13:00
Yuan (Terry) Tang	11bfa8584d	Remove unnecessary dependencies in distributed test (#3132 )	2018-02-24 20:24:34 -05:00
Rory Mitchell	10eb05a63a	Refactor linear modelling and add new coordinate descent updater (#3103 ) * Refactor linear modelling and add new coordinate descent updater * Allow unsorted column iterator * Add prediction caching to gblinear	2018-02-17 09:17:01 +13:00
Scott Lundberg	d878c36c84	Add SHAP interaction effects, fix minor bug, and add cox loss (#3043 ) * Add interaction effects and cox loss * Minimize whitespace changes * Cox loss now no longer needs a pre-sorted dataset. * Address code review comments * Remove mem check, rename to pred_interactions, include bias * Make lint happy * More lint fixes * Fix cox loss indexing * Fix main effects and tests * Fix lint * Use half interaction values on the off-diagonals * Fix lint again	2018-02-07 20:38:01 -06:00
Thejaswi	84ab74f3a5	Objective function evaluation on GPU with minimal PCIe transfers (#2935 ) * Added GPU objective function and no-copy interface. - xgboost::HostDeviceVector<T> syncs automatically between host and device - no-copy interfaces have been added - default implementations just sync the data to host and call the implementations with std::vector - GPU objective function, predictor, histogram updater process data directly on GPU	2018-01-12 21:33:39 +13:00
Rory Mitchell	7759ab99ee	Fix Google test warnings and error (#2957 )	2017-12-20 00:13:56 +13:00
Rory Mitchell	1b77903eeb	Fix several GPU bugs (#2916 ) * Fix #2905 * Fix gpu_exact test failures * Fix bug in GPU prediction where multiple calls to batch prediction can produce incorrect results * Fix GPU documentation formatting	2017-12-04 08:27:49 +13:00
Rory Mitchell	c51adb49b6	Monotone constraints for gpu_hist (#2904 )	2017-11-30 10:26:19 +13:00
Rory Mitchell	c55f14668e	Update gpu_hist algorithm (#2901 )	2017-11-27 13:44:24 +13:00
Rory Mitchell	24f527a1c0	AVX gradients (#2878 ) * AVX gradients * Add google test for AVX * Create fallback implementation, remove fma instruction * Improved accuracy of AVX exp function	2017-11-27 08:56:01 +13:00
Rory Mitchell	40c6e2f0c8	Improved gpu_hist_experimental algorithm (#2866 ) - Implement colsampling, subsampling for gpu_hist_experimental - Optimised multi-GPU implementation for gpu_hist_experimental - Make nccl optional - Add Volta architecture flag - Optimise RegLossObj - Add timing utilities for debug verbose mode - Bump required cuda version to 8.0	2017-11-11 13:58:40 +13:00
Rory Mitchell	13e7a2cff0	Various bug fixes (#2825 ) * Fatal error if GPU algorithm selected without GPU support compiled * Resolve type conversion warnings * Fix gpu unit test failure * Fix compressed iterator edge case * Fix python unit test failures due to flake8 update on pip	2017-10-25 14:45:01 +13:00
Scott Lundberg	78c4188cec	SHAP values for feature contributions (#2438 ) * SHAP values for feature contributions * Fix commenting error * New polynomial time SHAP value estimation algorithm * Update API to support SHAP values * Fix merge conflicts with updates in master * Correct submodule hashes * Fix variable sized stack allocation * Make lint happy * Add docs * Fix typo * Adjust tolerances * Remove unneeded def * Fixed cpp test setup * Updated R API and cleaned up * Fixed test typo	2017-10-12 12:35:51 -07:00
Rory Mitchell	4cb2f7598b	-Add experimental GPU algorithm for lossguided mode (#2755 ) -Improved GPU algorithm unit tests -Removed some thrust code to improve compile times	2017-10-01 00:18:35 +13:00
Tsukasa OMOTO	8d15024ac7	python: follow the default warning filters of Python (#2666 ) * python: follow the default warning filters of Python https://docs.python.org/3/library/warnings.html#default-warning-filters * update tests * update tests	2017-09-27 03:03:01 -04:00
Icyblade Dai	0e85b30fdd	Fix issue 2670 (#2671 ) * fix issue 2670 * add python<3.6 compatibility * fix Index * fix Index/MultiIndex * fix lint * fix W0622 really nonsense * fix lambda * Trigger Travis * add test for MultiIndex * remove trailing whitespace	2017-09-19 15:49:41 -04:00
Rory Mitchell	9c85903f0b	Add GPU documentation (#2695 ) * Add GPU documentation * Update Python GPU tests	2017-09-10 19:42:46 +12:00
Rory Mitchell	e6a9063344	Integer gradient summation for GPU histogram algorithm. (#2681 )	2017-09-08 15:07:29 +12:00
Rory Mitchell	15267eedf2	[GPU-Plugin] Major refactor 2 (#2664 ) * Change cmake option * Move source files * Move google tests * Move python tests * Move benchmarks * Move documentation * Remove makefile support * Fix test run * Move GPU tests	2017-09-08 09:57:16 +12:00
Yun Ni	f04bde05fd	Add Coverage Report for Java and Python (#2667 ) * Add coverage report for java * Add coverage report for python * Increase memory for JVM unit tests * Increase memory for JVM unit tests	2017-09-05 14:46:51 -07:00
Rory Mitchell	ef23e424f1	[GPU-Plugin] Add GPU accelerated prediction (#2593 ) * [GPU-Plugin] Add GPU accelerated prediction * Improve allocation message * Update documentation * Resolve linker error for predictor * Add unit tests	2017-08-16 12:31:59 +12:00
PSEUDOTENSOR / Jonathan McKinney	6b375f6ad8	Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation (#2530 ) * Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation from numpy arrays for python interface.	2017-07-21 14:43:17 +12:00
PSEUDOTENSOR / Jonathan McKinney	ca7fc9fda3	[GPU-Plugin] Fix gpu_hist to allow matrices with more than just 2^{32} elements. Also fixed CPU hist algorithm. (#2518 )	2017-07-18 11:19:27 +12:00
Michal Malohlava	33ee7d1615	[BUILD] Dockerfile and Jenkinsfile revisited (#2514 ) Includes: - Dockerfile changes - Dockerfile clean up - Fix execution privileges of files used from Dockerfile. - New Dockerfile entrypoint to replace with_user script - Defined a placeholders for CPU testing (script and Dockerfile) - Jenkinsfile - Jenkins file milestone defined - Single source code checkout and propagation via stash/unstash - Bash needs to be explicitly used in launching make build, since we need access to environment - Jenkinsfile build factory for cmake and make style of jobs - Archivation of artifacts (.so, .whl, *.egg) produced by cmake build Missing: - CPU testing - Python3 env build and testing	2017-07-13 17:51:47 +12:00
Rory Mitchell	530f01e21c	[GPU-Plugin] Add load balancing search to gpu_hist. Add compressed iterator. (#2504 )	2017-07-11 22:36:39 +12:00
Rory Mitchell	e939192978	Cmake improvements (#2487 ) * Cmake improvements * Add google test to cmake	2017-07-06 18:05:11 +12:00
Rory Mitchell	1899f9e744	[GPU-Plugin] Add basic continuous integration for GPU plugin. (#2431 )	2017-06-22 10:15:28 -04:00
Sergei Lebedev	2cb51f7097	[jvm-packages] Another pack of build/CI improvements (#2422 ) * [jvm-packages] Fixed compilation on Windows * [jvm-packages] Build the JNI bindings on Appveyor * [jvm-packages] Build & test on OS X * [jvm-packages] Re-applied the CMake build changes reverted by #2395 * Fixed Appveyor JVM build * Muted Maven on Travis * Don't link with libawt * "linux2"->"linux" Python2.x and 3.X use slightly different values for ``sys.platform``.	2017-06-21 12:28:35 -07:00
wxchan	65d2513714	[python-package] fix sklearn n_jobs/nthreads and seed/random_state bug (#2378 ) * add a testcase causing RuntimeError * move seed/random_state/nthread/n_jobs check to get_xgb_params() * fix failed test	2017-06-12 09:33:42 -04:00
Thejaswi	85b2fb3eee	[GPU-Plugin] Integration of a faster version of grow_gpu plugin into mainstream (#2360 ) * Integrating a faster version of grow_gpu plugin 1. Removed the older files to reduce duplication 2. Moved all of the grow_gpu files under 'exact' folder 3. All of them are inside 'exact' namespace to avoid any conflicts 4. Fixed a bug in benchmark.py while running only 'grow_gpu' plugin 5. Added cub and googletest submodules to ease integration and unit-testing 6. Updates to CMakeLists.txt to directly build cuda objects into libxgboost * Added support for building gpu plugins through make flow 1. updated makefile and config.mk to add right targets 2. added unit-tests for gpu exact plugin code * 1. Added support for building gpu plugin using 'make' flow as well 2. Updated instructions for building and testing gpu plugin * Fix travis-ci errors for PR#2360 1. lint errors on unit-tests 2. removed googletest, instead depended upon dmlc-core provide gtest cache * Some more fixes to travis-ci lint failures PR#2360 * Added Rory's copyrights to the files containing code from both. * updated copyright statement as per Rory's request * moved the static datasets into a script to generate them at runtime * 1. memory usage print when silent=0 2. tests/ and test/ folder organization 3. removal of the dependency of googletest for just building xgboost 4. coding style updates for .cuh as well * Fixes for compilation warnings * add cuda object files as well when JVM_BINDINGS=ON	2017-06-06 09:39:53 +12:00
gaw89	0f3a404d91	Sklearn kwargs (#2338 ) * Added kwargs support for Sklearn API * Updated NEWS and CONTRIBUTORS * Fixed CONTRIBUTORS.md * Added clarification of *kwargs and test for proper usage Fixed lint error * Fixed more lint errors and clf assigned but never used * Fixed more lint errors * Fixed more lint errors * Fixed issue with changes from different branch bleeding over * Fixed issue with changes from other branch bleeding over * Added note that kwargs may not be compatible with Sklearn * Fixed linting on kwargs note	2017-05-23 21:47:53 -05:00
gaw89	6cea1e3fb7	Sklearn convention update (#2323 ) * Added n_jobs and random_state to keep up to date with sklearn API. Deprecated nthread and seed. Added tests for new params and deprecations. * Fixed docstring to reflect updates to n_jobs and random_state. * Fixed whitespace issues and removed nose import. * Added deprecation note for nthread and seed in docstring. * Attempted fix of deprecation tests. * Second attempted fix to tests. * Set n_jobs to 1.	2017-05-22 08:22:05 -05:00
jayzed82	29289d2302	Add option to choose booster in scikit interface (gbtree by default) (#2303 ) * Add option to choose booster in scikit interface (gbtree by default) * Add option to choose booster in scikit interface: complete docstring. * Fix XGBClassifier to work with booster option * Added test case for gblinear booster	2017-05-18 23:12:27 -04:00

164 Commits