xgboost

Author	SHA1	Message	Date
Rory Mitchell	f75a21af25	Reduce tree expand boilerplate code (#4008 )	2018-12-20 15:52:28 +13:00
Rory Mitchell	84c99f86f4	Combine TreeModel and RegTree (#3995 )	2018-12-19 12:16:40 +13:00
Jiaming Yuan	c8c7b9649c	Fix and optimize logger (#4002 ) * Fix logging switch statement. * Remove debug_verbose_ in AllReducer. * Don't construct the stream when not needed. * Make default constructor deleted. * Remove redundant IsVerbose.	2018-12-17 19:23:05 +08:00
Andy Adinets	42bf90eb8f	Column sampling at individual nodes (splits). (#3971 ) * Column sampling at individual nodes (splits). * Documented colsample_bynode parameter. - also updated documentation for colsample_by* parameters * Updated documentation. * GetFeatureSet() returns shared pointer to std::vector. * Sync sampled columns across multiple processes.	2018-12-14 22:37:35 +08:00
Jiaming Yuan	e0a279114e	Unify logging facilities. (#3982 ) * Unify logging facilities. * Enhance `ConsoleLogger` to handle different verbosity. * Override macros from `dmlc`. * Don't use specialized gamma when building with GPU. * Remove verbosity cache in monitor. * Test monitor. * Deprecate `silent`. * Fix doc and messages. * Fix python test. * Fix silent tests.	2018-12-14 19:29:58 +08:00
Rory Mitchell	3d81c48d3f	Remove leaf vector, add tree serialisation test, fix Windows tests (#3989 )	2018-12-13 10:28:38 +13:00
Rory Mitchell	93f9ce9ef9	Single precision histograms on GPU (#3965 ) * Allow single precision histogram summation in gpu_hist * Add python test, reduce run-time of gpu_hist tests * Update documentation	2018-12-10 10:55:30 +13:00
Jiaming Yuan	48dddfd635	Porting elementwise metrics to GPU. (#3952 ) * Port elementwise metrics to GPU. * All elementwise metrics are converted to static polymorphic. * Create a reducer for metrics reduction. * Remove const of Metric::Eval to accommodate CubMemory.	2018-12-01 18:46:45 +13:00
Rory Mitchell	a9d684db18	GPU performance logging/improvements (#3945 ) - Improved GPU performance logging - Only use one execute shards function - Revert performance regression on multi-GPU - Use threads to launch NCCL AllReduce	2018-11-29 14:36:51 +13:00
Jiaming Yuan	fe999bf968	Add back python2 tests for Travis light weight tests. (#3901 )	2018-11-15 22:17:35 +13:00
Jiaming Yuan	2ea0f887c1	Refactor Python tests. (#3897 ) * Deprecate nose tests. * Format python tests.	2018-11-15 13:56:33 +13:00
Rory Mitchell	7af0946ac1	Improve update position function for gpu_hist (#3895 )	2018-11-14 19:33:29 +13:00
Dr. Kashif Rasul	143475b27b	use gain for sklearn feature_importances_ (#3876 ) * use gain for sklearn feature_importances_ `gain` is a better feature importance criteria than the currently used `weight` * added importance_type to class * fixed test * white space * fix variable name * fix deprecation warning * fix exp array * white spaces	2018-11-13 03:30:40 -08:00
Rory Mitchell	926eb651fe	Minor refactor of split evaluation in gpu_hist (#3889 ) * Refactor evaluate split into shard * Use span in evaluate split * Update google tests	2018-11-14 00:11:20 +13:00
Jiaming Yuan	97984f4890	Fix gpu coordinate running on multi-gpu. (#3893 )	2018-11-13 19:09:55 +13:00
Philip Hyunsu Cho	ad6e0d55f1	Fix coef_ and intercept_ signature to be compatible with sklearn.RFECV (#3873 ) * Fix coef_ and intercept_ signature to be compatible with sklearn.RFECV * Fix lint * Fix lint	2018-11-08 19:41:35 -08:00
Jiaming Yuan	19ee0a3579	Refactor fast-hist, add tests for some updaters. (#3836 ) Add unittest for prune. Add unittest for refresh. Refactor fast_hist. * Remove fast_hist_param. * Rename to quantile_hist. Add unittests for QuantileHist. * Refactor QuantileHist into .h and .cc file. * Remove sync.h. * Remove MGPU_mock test. Rename fast hist method to quantile hist.	2018-11-07 21:15:07 +13:00
Philip Hyunsu Cho	2b045aa805	Make C++ unit tests run and pass on Windows (#3869 ) * Make C++ unit tests run and pass on Windows * Fix logic for external memory. The letter ':' is part of drive letter, so remove the drive letter before splitting on ':'. * Cosmetic syntax changes to keep MSVC happy. * Fix lint * Add Windows guard	2018-11-06 17:17:24 -08:00
Philip Hyunsu Cho	20d5abf919	Disallow std::regex since it's not supported by GCC 4.8.x (#3870 )	2018-11-05 22:57:04 -08:00
Jiaming Yuan	f1275f52c1	Fix specifying gpu_id, add tests. (#3851 ) * Rewrite gpu_id related code. * Remove normalised/unnormalised operations. * Address difference between `Index' and `Device ID'. * Modify doc for `gpu_id'. * Better LOG for GPUSet. * Check specified n_gpus. * Remove inappropriate `device_idx' term. * Clarify GpuIdType and size_t.	2018-11-06 18:17:53 +13:00
Philip Hyunsu Cho	91537e7353	Fix #3342 and h2oai/h2o4gpu#625 : Save predictor parameters in model file (#3856 ) * Fix #3342 and h2oai/h2o4gpu#625: Save predictor parameters in model file This allows pickled models to retain predictor attributes, such as 'predictor' (whether to use CPU or GPU) and 'n_gpu' (number of GPUs to use). Related: h2oai/h2o4gpu#625 Closes #3342. TODO. Write a test. * Fix lint * Do not load GPU predictor into CPU-only XGBoost * Add a test for pickling GPU predictors * Make sample data big enough to pass multi GPU test * Update test_gpu_predictor.cu	2018-11-03 21:45:38 -07:00
Philip Hyunsu Cho	ad68865d6b	[Blocking] Fix #3840 : Clean up logic for parsing tree_method parameter (#3849 ) * Clean up logic for converting tree_method to updater sequence * Use C++11 enum class for extra safety Compiler will give warnings if switch statements don't handle all possible values of C++11 enum class. Also allow enum class to be used as DMLC parameter. * Fix compiler error + lint * Address reviewer comment * Better docstring for DECLARE_FIELD_ENUM_CLASS * Fix lint * Add C++ test to see if tree_method is recognized * Fix clang-tidy error * Add test_learner.h to R package * Update comments * Fix lint error	2018-11-01 19:33:35 -07:00
Philip Hyunsu Cho	411df9f878	Test wheels on CUDA 10.0 container for compatibility (#3838 )	2018-11-01 08:34:47 -07:00
Andy Adinets	2a59ff2f9b	Multi-GPU support in GPUPredictor. (#3738 ) * Multi-GPU support in GPUPredictor. - GPUPredictor is multi-GPU - removed DeviceMatrix, as it has been made obsolete by using HostDeviceVector in DMatrix * Replaced pointers with spans in GPUPredictor. * Added a multi-GPU predictor test. * Fix multi-gpu test. * Fix n_rows < n_gpus. * Reinitialize shards when GPUSet is changed. * Tests range of data. * Remove commented code. * Remove commented code.	2018-10-23 22:59:11 -07:00
Philip Hyunsu Cho	abf2f661be	Fix #3708 : Use dmlc::TemporaryDirectory to handle temporaries in cross-platform way (#3783 ) * Fix #3708: Use dmlc::TemporaryDirectory to handle temporaries in cross-platform way Also install git inside NVIDIA GPU container * Update dmlc-core	2018-10-18 10:16:04 -07:00
Philip Hyunsu Cho	55ee9a92a1	Fix Python environment for distributed unit tests (#3806 )	2018-10-18 00:12:02 -07:00
Rory Mitchell	f00fd87b36	Address #2754 , accuracy issues with gpu_hist (#3793 ) * Address windows compilation error * Do not allow divide by zero in weight calculation * Update tests	2018-10-15 17:50:31 +13:00
trivialfis	516457fadc	Add basic unittests for gpu-hist method. (#3785 ) * Split building histogram into separated class. * Extract `InitCompressedRow` definition. * Basic tests for gpu-hist. * Document the code more verbosely. * Removed `HistCutUnit`. * Removed some duplicated copies in `GPUHistMaker`. * Implement LCG and use it in tests.	2018-10-15 15:47:00 +13:00
Rory Mitchell	5d6baed998	Allow sklearn grid search over parameters specified as kwargs (#3791 )	2018-10-14 12:44:53 +13:00
Philip Hyunsu Cho	10cd7c8447	Fix #3714 : preserve feature names when slicing DMatrix (#3766 ) * Fix #3714: preserve feature names when slicing DMatrix * Add test	2018-10-08 01:04:33 -07:00
Rory Mitchell	34522d56f0	Allow plug-ins to be built by cmake (#3752 ) * Remove references to AVX code. * Allow plugins to be built by cmake	2018-10-04 22:03:52 +13:00
trivialfis	d594b11f35	Implement transform to reduce CPU/GPU code duplication. (#3643 ) * Implement Transform class. * Add tests for softmax. * Use Transform in regression, softmax and hinge objectives, except for Cox. * Mark old gpu objective functions deprecated. * static_assert for softmax. * Split up multi-gpu tests.	2018-10-02 15:06:21 +13:00
Rory Mitchell	70d208d68c	Dmatrix refactor stage 2 (#3395 ) * DMatrix refactor 2 * Remove buffered rowset usage where possible * Transition to c++11 style iterators for row access * Transition column iterators to C++ 11	2018-10-01 01:29:03 +13:00
Philip Hyunsu Cho	b50bc2c1d4	Add multi-GPU unit test environment (#3741 ) * Add multi-GPU unit test environment * Better assertion message * Temporarily disable failing test * Distinguish between multi-GPU and single-GPU CPP tests * Consolidate Python tests. Use attributes to distinguish multi-GPU Python tests from single-CPU counterparts	2018-09-29 11:20:58 -07:00
Philip Hyunsu Cho	baef5741df	Separate out restricted and unrestricted tasks (#3736 )	2018-09-27 23:06:14 -07:00
trivialfis	5a7f7e7d49	Implement devices to devices reshard. (#3721 ) * Force clearing device memory before Reshard. * Remove calculating row_segments for gpu_hist and gpu_sketch. * Guard against changing device.	2018-09-28 17:40:23 +12:00
Philip Hyunsu Cho	51478a39c9	Fix #3730 : scikit-learn 0.20 compatibility fix (#3731 ) * Fix #3730: scikit-learn 0.20 compatibility fix sklearn.cross_validation has been removed from scikit-learn 0.20, so replace it with sklearn.model_selection * Display test names for Python tests for clarity	2018-09-27 15:03:05 -07:00
trivialfis	9119f9e369	Fix gpu devices. (#3693 ) * Fix gpu_set normalized and unnormalized. * Fix DeviceSpan.	2018-09-19 17:39:42 +12:00
Andrew Thia	9254c58e4d	[TREE] add interaction constraints (#3466 ) * add interaction constraints * enable both interaction and monotonic constraints at the same time * fix lint * add R test, fix lint, update demo * Use dmlc::JSONReader to express interaction constraints as nested lists; Use sparse arrays for bookkeeping * Add Python test for interaction constraints * make R interaction constraints parameter based on feature index instead of column names, fix R coding style * Fix lint * Add BlueTea88 to CONTRIBUTORS.md * Short circuit when no constraint is specified; address review comments * Add tutorial for feature interaction constraints * allow interaction constraints to be passed as string, remove redundant column_names argument * Fix typo * Address review comments * Add comments to Python test	2018-09-04 09:35:39 -07:00
Andy Adinets	dee0b69674	Fixed copy constructor for HostDeviceVectorImpl. (#3657 ) - previously, vec_ in DeviceShard wasn't updated on copy; as a result, the shards continued to refer to the old HostDeviceVectorImpl object, which resulted in a dangling pointer once that object was deallocated	2018-09-01 11:38:09 +12:00
Philip Hyunsu Cho	86d88c0758	Fix #3648 : XGBClassifier.predict() should return margin scores when output_margin=True (#3651 ) * Fix #3648: XGBClassifier.predict() should return margin scores when output_margin=True * Fix tests to reflect correct implementation of XGBClassifier.predict(output_margin=True) * Fix flaky test test_with_sklearn.test_sklearn_api_gblinear	2018-08-30 21:05:05 -07:00
Andy Adinets	72cd1517d6	Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage. (#3446 ) * Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage. - added distributions to HostDeviceVector - using HostDeviceVector for labels, weights and base margins in MetaInfo - using HostDeviceVector for offset and data in SparsePage - other necessary refactoring * Added const version of HostDeviceVector API calls. - const versions added to calls that can trigger data transfers, e.g. DevicePointer() - updated the code that uses HostDeviceVector - objective functions now accept const HostDeviceVector<bst_float>& for predictions * Updated src/linear/updater_gpu_coordinate.cu. * Added read-only state for HostDeviceVector sync. - this means no copies are performed if both host and devices access the HostDeviceVector read-only * Fixed linter and test errors. - updated the lz4 plugin - added ConstDeviceSpan to HostDeviceVector - using device % dh::NVisibleDevices() for the physical device number, e.g. in calls to cudaSetDevice() * Fixed explicit template instantiation errors for HostDeviceVector. - replaced HostDeviceVector<unsigned int> with HostDeviceVector<int> * Fixed HostDeviceVector tests that require multiple GPUs. - added a mock set device handler; when set, it is called instead of cudaSetDevice()	2018-08-30 14:28:47 +12:00
Andy Adinets	58d783df16	Fixed issue 3605. (#3628 ) * Fixed issue 3605. - https://github.com/dmlc/xgboost/issues/3605 * Fixed the bug in a better way. * Added a test to catch the bug. * Fixed linter errors.	2018-08-28 10:50:52 -07:00
Rory Mitchell	78bea0d204	Add google test for a column sampling, restore metainfo tests (#3637 ) * Add google test for a column sampling, restore metainfo tests * Update metainfo test for visual studio * Fix multi-GPU bug introduced in #3635	2018-08-28 16:10:26 +12:00
trivialfis	60787ecebc	Merge generic device helper functions into gpu set. (#3626 ) * Remove the use of old NDevices* functions. * Use GPUSet in timer.h.	2018-08-26 18:14:23 +12:00
Shiki-H	24a268a2e3	sklearn api for ranking (#3560 ) * added xgbranker * fixed predict method and ranking test * reformatted code in accordance with pep8 * fixed lint error * fixed docstring and added checks on objective * added ranking demo for python * fixed suffix in rank.py	2018-08-21 08:26:48 -07:00
trivialfis	cf2d86a4f6	Add travis sanitizers tests. (#3557 ) * Add travis sanitizers tests. * Add gcc-7 in Travis. * Add SANITIZER_PATH for CMake. * Enable sanitizer tests in Travis. * Fix memory leaks in tests. * Fix all memory leaks reported by Address Sanitizer. * tests/cpp/helpers.h/CreateDMatrix now returns raw pointer.	2018-08-19 16:40:30 +12:00
trivialfis	2c502784ff	Span class. (#3548 ) * Add basic Span class based on ISO++20. * Use Span<Entry const> instead of Inst in SparsePage. * Add DeviceSpan in HostDeviceVector, use it in regression obj.	2018-08-14 17:58:11 +12:00
Rory Mitchell	bbb771f32e	Refactor parts of fast histogram utilities (#3564 ) * Refactor parts of fast histogram utilities * Removed byte packing from column matrix	2018-08-09 17:59:57 +12:00
Philip Hyunsu Cho	3c72654e3b	Revert "Fix #3485 , #3540 : Don't use dropout for predicting test sets" (#3563 ) * Revert "Fix #3485, #3540: Don't use dropout for predicting test sets (#3556)" This reverts commit `44811f2330`. * Document behavior of predict() for DART booster * Add notice to parameter.rst	2018-08-08 09:48:55 -07:00

233 Commits