xgboost

Author	SHA1	Message	Date
Jiaming Yuan	f0064c07ab	Refactor configuration [Part II]. (#4577 ) * Refactor configuration [Part II]. * General changes: Remove `Init` methods to avoid ambiguity. Remove `Configure(std::map<>)` to avoid redundant copying and prepare for parameter validation. (`std::vector` is returned from `InitAllowUnknown`). ** Add name to tree updaters for easier debugging. * Learner changes: Make `LearnerImpl` the only source of configuration. All configurations are stored and carried out by `LearnerImpl::Configure()`. Remove booster in C API. Originally kept for "compatibility reason", but did not state why. So here we just remove it. Add a `metric_names_` field in `LearnerImpl`. Remove `LazyInit`. Configuration will always be lazy. ** Run `Configure` before every iteration. * Predictor changes: Allocate both cpu and gpu predictor. Remove cpu_predictor from gpu_predictor. `GBTree` is now used to dispatch the predictor. ** Remove some GPU Predictor tests. * IO No IO changes. The binary model format stability is tested by comparing hashing value of save models between two commits	2019-07-20 08:34:56 -04:00
Matvey Turkov	61f764946f	fixed year to 2019 in conf.py, helpers.h and LICENSE (#4661 )	2019-07-15 12:29:12 -04:00
sriramch	7a388cbf8b	Modify caching allocator/vector and fix issues relating to inability to train large datasets (#4615 )	2019-07-09 18:33:27 +12:00
Jiaming Yuan	d9a47794a5	Fix CPU hist init for sparse dataset. (#4625 ) * Fix CPU hist init for sparse dataset. * Implement sparse histogram cut. * Allow empty features. * Fix windows build, don't use sparse in distributed environment. * Comments. * Smaller threshold. * Fix windows omp. * Fix msvc lambda capture. * Fix MSVC macro. * Fix MSVC initialization list. * Fix MSVC initialization list x2. * Preserve categorical feature behavior. * Rename matrix to sparse cuts. * Reuse UseGroup. * Check for categorical data when adding cut. Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu> * Sanity check. * Fix comments. * Fix comment.	2019-07-04 16:27:03 -07:00
Philip Hyunsu Cho	96bf91725b	Support ndcg- and map- (#4635 )	2019-07-03 22:51:48 -07:00
Jiaming Yuan	45876bf41b	Fix external memory for get column batches. (#4622 ) * Fix external memory for get column batches. This fixes two bugs: * Use PushCSC for get column batches. * Don't remove the created temporary directory before finishing test. * Check all pages.	2019-06-30 09:56:49 +08:00
Rong Ou	63ec95623d	fix gpu predictor when dmatrix is mismatched with model (#4613 )	2019-06-28 11:03:02 +12:00
Egor Smirnov	4d6590be3c	Optimize ‘hist’ for multi-core CPU (#4529 ) * Initial performance optimizations for xgboost * remove includes * revert float->double * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * Check existence of _mm_prefetch and __builtin_prefetch * Fix lint * optimizations for CPU * appling comments in review * add some comments, code refactoring * fixing issues in CI * adding runtime checks * remove 1 extra check * remove extra checks in BuildHist * remove checks * add debug info * added debug info * revert changes * added comments * Apply suggestions from code review Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu> * apply review comments * Remove unused function CreateNewNodes() * Add descriptive comment on node_idx variable in QuantileHistMaker::Builder::BuildHistsBatch()	2019-06-27 11:33:49 -07:00
Jiaming Yuan	8bdf15120a	Implement tree model dump with code generator. (#4602 ) * Implement tree model dump with a code generator. * Split up generators. * Implement graphviz generator. * Use pattern matching. * [Breaking] Return a Source in `to_graphviz` instead of Digraph in Python package. Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2019-06-26 15:20:44 +08:00
Rong Ou	6125521caf	fix compiler warning (#4588 )	2019-06-21 04:06:26 +08:00
Rory Mitchell	221e163185	Refactor out row partitioning logic from gpu_hist, introduce caching device vectors (#4554 )	2019-06-20 18:24:09 +12:00
Jiaming Yuan	ae05948e32	Feature interaction for GPU Hist. (#4534 ) * GPU hist Interaction Constraints. * Duplicate related parameters. * Add tests for CPU interaction constraint. * Add better error reporting. * Thorough tests.	2019-06-19 18:11:02 +08:00
sriramch	6757654337	Optimizations for quantisation on device (#4572 ) * - do not create device vectors for the entire sparse page while computing histograms... - while creating the compressed histogram indices, the row vector is created for the entire sparse page batch. this is needless as we only process chunks at a time based on a slice of the total gpu memory - this pr will allocate only as much as required to store the appropriate row indices and the entries * - do not dereference row_ptrs once the device_vector has been created to elide host copies of those counts - instead, grab the entry counts directly from the sparsepage	2019-06-19 10:50:25 +12:00
sriramch	90f683b25b	Set the appropriate device before freeing device memory... (#4566 ) * - set the appropriate device before freeing device memory... - pr #4532 added a global memory tracker/logger to keep track of number of (de)allocations and peak memory usage on a per device basis. - this pr adds the appropriate check to make sure that the (de)allocation counts and memory usages makes sense for the device. since verbosity is typically increased on debug/non-retail builds. * - pre-create cub allocators and reuse them - create them once and not resize them dynamically. we need to ensure that these allocators are created and destroyed exactly once so that the appropriate device id's are set	2019-06-18 14:58:05 +12:00
Jiaming Yuan	c5719cc457	Offload some configurations into GBM. (#4553 ) This is part 1 of refactoring configuration. * Move tree heuristic configurations. * Split up declarations and definitions for GBTree. * Implement UseGPU in gbm.	2019-06-14 09:18:51 +08:00
sriramch	a2042b685a	- training with external memory - part 2 of 2 (#4526 ) * - training with external memory - part 2 of 2 - when external memory support is enabled, building of histogram indices are done incrementally for every sparse page - the entire set of input data is divided across multiple gpu's and the relative row positions within each device is tracked when building the compressed histogram buffer - this was tested using a mortgage dataset containing ~ 670m rows before 4xt4's could be saturated	2019-06-12 09:52:56 +12:00
Jiaming Yuan	2f1319f273	Add `rmsle` metric and `reg:squaredlogerror` objective (#4541 )	2019-06-11 05:48:27 +08:00
Rory Mitchell	9683fd433e	Overload device memory allocation (#4532 ) * Group source files, include headers in source files * Overload device memory allocation	2019-06-10 11:35:13 +12:00
Jiaming Yuan	da21ac0cc2	Fix tweedie metric string. (#4543 )	2019-06-09 09:52:29 +08:00
Philip Hyunsu Cho	3f2fe25a32	Fix C++11 config parser (#4521 ) * Fix C++11 config parser * Use raw strings to improve readability of regex * Fix compilation for GCC 5.x Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2019-06-03 22:18:16 +08:00
Rory Mitchell	23a10c8339	Refactor histogram building code for gpu_hist (#4528 )	2019-06-03 09:50:10 +12:00
Rory Mitchell	fbbae3386a	Smarter choice of histogram construction for distributed gpu_hist (#4519 ) * Smarter choice of histogram construction for distributed gpu_hist * Limit omp team size in ExecuteShards	2019-05-31 14:11:34 +12:00
sriramch	fed665ae8a	- training with external memory part 1 of 2 (#4486 ) * - training with external memory part 1 of 2 - this pr focuses on computing the quantiles using multiple gpus on a dataset that uses the external cache capabilities - there will be a follow-up pr soon after this that will support creation of histogram indices on large dataset as well - both of these changes are required to support training with external memory - the sparse pages in dmatrix are taken in batches and the cut matrices are incrementally built - also snuck in some (perf) changes related to sketches aggregation amongst multiple features across multiple sparse page batches. instead of aggregating the summary inside each device and merged later, it is aggregated in-place when the device is working on different rows but the same feature	2019-05-30 08:18:34 +12:00
sriramch	6e16900711	Fix crash with approx tree method on cpu (#4510 )	2019-05-30 01:11:29 +08:00
Jiaming Yuan	c589eff941	De-duplicate GPU parameters. (#4454 ) * Only define `gpu_id` and `n_gpus` in `LearnerTrainParam` * Pass LearnerTrainParam through XGBoost via factory method. * Disable all GPU usage when GPU related parameters are not specified (fixes XGBoost choosing GPU over aggressively). * Test learner train param io. * Fix gpu pickling.	2019-05-29 11:55:57 +08:00
sriramch	a3fedbeaa8	- fix issues with training with external memory on cpu (#4487 ) * - fix issues with training with external memory on cpu - use the batch size to determine the correct number of rows in a batch - use the right number of threads in omp parallalization if the batch size is less than the default omp max threads (applicable for the last batch) * - handle scenarios where last batch size is < available number of threads - augment tests such that we can test all scenarios (batch size <, >, = number of threads)	2019-05-29 12:31:30 +12:00
Philip Hyunsu Cho	cf2400036e	[CI] Add Python and C++ tests for Windows GPU target (#4469 ) * Add CMake option to use bundled gtest from dmlc-core, so that it is easy to build XGBoost with gtest on Windows * Consistently apply OpenMP flag to all targets. Force enable OpenMP when USE_CUDA is turned on. * Insert vcomp140.dll into Windows wheels * Add C++ and Python tests for CPU and GPU targets (CUDA 9.0, 10.0, 10.1) * Prevent spurious msbuild failure * Add GPU tests * Upgrade dmlc-core	2019-05-16 01:06:46 +00:00
Rong Ou	be0f346ec9	mgpu predictor using explicit offsets (#4438 ) * mgpu prediction using explicit sharding	2019-05-11 09:35:06 +12:00
Rong Ou	feb6ae3e18	Initial support for external memory in gpu_predictor (#4284 )	2019-05-03 13:01:27 +12:00
Philip Hyunsu Cho	bfddc2c42c	Make CMakeLists.txt compatible with CMake 3.3 (#4420 ) * Make CMakeLists.txt compatible with CMake 3.3; require CMake 3.11 for MSVC * Use CMake 3.12 when sanitizer is enabled * Disable funroll-loops for MSVC * Use cmake version in container name * Add missing arg * Fix egrep use in ci_build.sh * Display CMake version * Do not set OpenMP_CXX_LIBRARIES for MSVC * Use cmake_minimum_required()	2019-05-02 11:49:32 +08:00
Rong Ou	eaab364a63	More explict sharding methods for device memory (#4396 ) * Rename the Reshard method to Shard * Add a new Reshard method for sharding a vector that's already sharded	2019-05-01 11:47:22 +12:00
Rory Mitchell	5e582b0fa7	Combine thread launches into single launch per tree for gpu_hist (#4343 ) * Combine thread launches into single launch per tree for gpu_hist algorithm. * Address deprecation warning * Add manual column sampler constructor * Turn off omp dynamic to get a guaranteed number of threads * Enable openmp in cuda code	2019-04-29 09:58:34 +12:00
Jiaming Yuan	77c03538b0	Fix node reuse. (#4404 ) * Reinitialize `_sindex` when reallocating a deleted node.	2019-04-27 13:03:23 +08:00
Jiaming Yuan	207f058711	Refactor CMake scripts. (#4323 ) * Refactor CMake scripts. * Remove CMake CUDA wrapper. * Bump CMake version for CUDA. * Use CMake to handle Doxygen. * Split up CMakeList. * Export install target. * Use modern CMake. * Remove build.sh * Workaround for gpu_hist test. * Use cmake 3.12. * Revert machine.conf. * Move CLI test to gpu. * Small cleanup. * Support using XGBoost as submodule. * Fix windows * Fix cpp tests on Windows * Remove duplicated find_package.	2019-04-15 10:08:12 -07:00
Jiaming Yuan	84d992babc	GPU multiclass metrics (#4368 ) * Port multi classes metrics to CUDA.	2019-04-15 17:47:47 +08:00
Rong Ou	f4521bf6aa	refactor tests to get rid of duplication (#4358 ) * refactor tests to get rid of duplication * address review comments	2019-04-12 00:21:48 -07:00
Jiaming Yuan	5c2575535f	Fix Histogram allocation. (#4347 ) * Fix Histogram allocation. nidx_map is cleared after `Reset`, but histogram data size isn't changed hence histogram recycling is used in later iterations. After a reset(building new tree), newly allocated node will start from 0, while recycling always choose the node with smallest index, which happens to be our newly allocated node 0.	2019-04-10 19:21:26 +08:00
Rong Ou	81c1cd40ca	add a test for cpu predictor using external memory (#4308 ) * add a test for cpu predictor using external memory * allow different page size for testing	2019-04-10 13:25:10 +12:00
Rory Mitchell	3f312e30db	Retire DVec class in favour of c++20 style span for device memory. (#4293 )	2019-03-28 13:59:58 +13:00
Rory Mitchell	6d5b34d824	Further optimisations for gpu_hist. (#4283 ) - Fuse final update position functions into a single more efficient kernel - Refactor gpu_hist with a more explicit ellpack matrix representation	2019-03-24 17:17:22 +13:00
Rory Mitchell	00465d243d	Optimisations for gpu_hist. (#4248 ) * Optimisations for gpu_hist. * Use streams to overlap operations. * ColumnSampler now uses HostDeviceVector to prevent repeatedly copying feature vectors to the device.	2019-03-20 13:30:06 +13:00
Jiaming Yuan	29a1356669	Deprecate `reg:linear' in favor of `reg:squarederror'. (#4267 ) * Deprecate `reg:linear' in favor of `reg:squarederror'. * Replace the use of `reg:linear'. * Replace the use of `silent`.	2019-03-17 17:55:04 +08:00
Jiaming Yuan	7b9043cf71	Fix clang-tidy warnings. (#4149 ) * Upgrade gtest for clang-tidy. * Use CMake to install GTest instead of mv. * Don't enforce clang-tidy to return 0 due to errors in thrust. * Add a small test for tidy itself. * Reformat.	2019-03-13 02:25:51 +08:00
Jiaming Yuan	7ea5675679	Add PushCSC for SparsePage. (#4193 ) * Add PushCSC for SparsePage. * Move Push* definitions into cc file. * Add std:: prefix to `size_t` make clang++ happy. * Address monitor count == 0.	2019-03-02 01:58:08 +08:00
Philip Hyunsu Cho	549c8d6ae9	Prevent empty quantiles in fast hist (#4155 ) * Prevent empty quantiles * Revise and improve unit tests for quantile hist * Remove unnecessary comment * Add #2943 as a test case * Skip test if no sklearn * Revise misleading comments	2019-02-17 16:01:07 -08:00
Jiaming Yuan	e1240413c9	Fix gpu_hist apply_split test. (#4158 )	2019-02-18 02:48:28 +08:00
Jiaming Yuan	1fe874e58a	Fix empty subspan. (#4151 ) * Silent the death tests.	2019-02-17 04:48:03 +08:00
Jiaming Yuan	754fe8142b	Make `HistCutMatrix::Init' be aware of groups. (#4115 ) * Add checks for group size. * Simple docs. * Search group index during hist cut matrix initialization. Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com> Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2019-02-16 04:39:41 +08:00
Nan Zhu	c18a3660fa	Separate Depthwidth and Lossguide growing policy in fast histogram (#4102 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * init * more changes * temp * update * udpate rabit * change the histogram * update kfactor * sync per node stats * temp * update * final * code clean * update rabit * more cleanup * fix errors * fix failed tests * enforce c++11 * broadcast subsampled feature correctly * init col * temp * col sampling * fix histmastrix init * fix col sampling * remove cout * fix out of bound access * fix core dump remove core dump file * disbale test temporarily * update * add fid * print perf data * update * revert some changes * temp * temp * pass all tests * bring back some tests * recover some changes * fix lint issue * enable monotone and interaction constraints * don't specify default for monotone and interactions * recover column init part * more recovery * fix core dumps * code clean * revert some changes * fix test compilation issue * fix lint issue * resolve compilation issue * fix issues of lint caused by rebase * fix stylistic changes and change variable names * use regtree internal function * modularize depth width * address the comments * fix failed tests * wrap perf timers with class * fix lint * fix num_leaves count * fix indention * Update src/tree/updater_quantile_hist.cc Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com> * Update src/tree/updater_quantile_hist.h Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com> * Update src/tree/updater_quantile_hist.cc Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com> * Update src/tree/updater_quantile_hist.cc Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com> * Update src/tree/updater_quantile_hist.cc Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com> * Update src/tree/updater_quantile_hist.h Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com> * merge * fix compilation	2019-02-13 12:56:19 -08:00
Jiaming Yuan	f8ca2960fc	Use nccl group calls to prevent from dead lock. (#4113 ) * launch all reduce sequentially. * Fix gpu_exact test memory leak.	2019-02-08 06:12:39 +08:00

140 Commits