xgboost

Author	SHA1	Message	Date
Jiaming Yuan	7663de956c	Run training with empty DMatrix. (#4990 ) This makes GPU Hist robust in distributed environment as some workers might not be associated with any data in either training or evaluation. * Disable rabit mock test for now: See #5012 . * Disable dask-cudf test at prediction for now: See #5003 * Launch dask job for all workers despite they might not have any data. * Check 0 rows in elementwise evaluation metrics. Using AUC and AUC-PR still throws an error. See #4663 for a robust fix. * Add tests for edge cases. * Add `LaunchKernel` wrapper handling zero sized grid. * Move some parts of allreducer into a cu file. * Don't validate feature names when the booster is empty. * Sync number of columns in DMatrix. As num_feature is required to be the same across all workers in data split mode. * Filtering in dask interface now by default syncs all booster that's not empty, instead of using rank 0. * Fix Jenkins' GPU tests. * Install dask-cuda from source in Jenkins' test. Now all tests are actually running. * Restore GPU Hist tree synchronization test. * Check UUID of running devices. The check is only performed on CUDA version >= 10.x, as 9.x doesn't have UUID field. * Fix CMake policy and project variables. Use xgboost_SOURCE_DIR uniformly, add policy for CMake >= 3.13. * Fix copying data to CPU * Fix race condition in cpu predictor. * Fix duplicated DMatrix construction. * Don't download extra nccl in CI script.	2019-11-06 16:13:13 +08:00
Christopher Cowden	807a244517	Fix repeated split and 0 cover nodes (#5010 )	2019-11-06 14:57:22 +08:00
Jiaming Yuan	6ec7e300bd	Fix external memory race in colmaker. (#4980 ) * Move `GetColDensity` out of omp parallel block.	2019-10-25 04:11:13 -04:00
Jiaming Yuan	ac457c56a2	Use `UpdateAllowUnknown' for non-model related parameter. (#4961 ) * Use `UpdateAllowUnknown' for non-model related parameter. Model parameter can not pack an additional boolean value due to binary IO format. This commit deals only with non-model related parameter configuration. * Add tidy command line arg for use-dmlc-gtest.	2019-10-23 05:50:12 -04:00
Rong Ou	5b1715d97c	Write ELLPACK pages to disk (#4879 ) * add ellpack source * add batch param * extract function to parse cache info * construct ellpack info separately * push batch to ellpack page * write ellpack page. * make sparse page source reusable	2019-10-22 23:44:32 -04:00
sriramch	310fe60b35	Pairwise ranking objective implementation on gpu (#4873 ) * - pairwise ranking objective implementation on gpu - there are couple of more algorithms (ndcg and map) for which support will be added as follow-up pr's - with no label groups defined, get gradient is 90x faster on gpu (120m instance mortgage dataset) - it can perform by an order of magnitude faster with ~ 10 groups (and adequate cores for the cpu implementation) * Add JSON config to rank obj.	2019-10-22 23:40:07 -04:00
Jiaming Yuan	31030a8d3a	Set correct file permission. (#4964 )	2019-10-18 12:54:29 -04:00
Jiaming Yuan	ae536756ae	Add Model and Configurable interface. (#4945 ) * Apply Configurable to objective functions. * Apply Model to Learner and Regtree, gbm. * Add Load/SaveConfig to objs. * Refactor obj tests to use smart pointer. * Dummy methods for Save/Load Model.	2019-10-18 01:56:02 -04:00
Rory Mitchell	60748b2071	Use heuristic to select histogram node, avoid rabit call (#4951 )	2019-10-18 11:33:54 +13:00
Jiaming Yuan	095de3bf5f	Export c++ headers in CMake installation. (#4897 ) * Move get transpose into cc. * Clean up headers in host device vector, remove thrust dependency. * Move span and host device vector into public. * Install c++ headers. * Short notes for c and c++. Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2019-10-06 23:53:09 -04:00
Rong Ou	562bb0ae31	remove device shards (#4867 )	2019-09-25 13:15:46 +08:00
Jiaming Yuan	0b89cd1dfa	Support gamma in GPU_Hist. (#4874 ) * Just prevent building the tree instead of using an explicit pruner.	2019-09-24 10:16:08 +08:00
Rong Ou	125bcec62e	Move ellpack page construction into DMatrix (#4833 )	2019-09-16 23:50:55 -04:00
Xu Xiao	c89bcc4de5	[blocking] fix parallel eval_split of hist updater (#4851 ) * Don't call rabit functions inside parallel loop.	2019-09-13 09:35:03 -04:00
Jiaming Yuan	f90e7f9aa8	Some comments for row partitioner. (#4832 )	2019-09-06 03:01:42 -04:00
Jiaming Yuan	c0fbeff0ab	Restrict access to `cfg_` in gbm. (#4801 ) * Restrict access to `cfg_` in gbm. * Verify having correct updaters. * Remove `grow_global_histmaker` This updater is the same as `grow_histmaker`. The former is not in our document so we just remove it.	2019-09-02 00:43:19 -04:00
Rong Ou	733ed24dd9	further cleanup of single process multi-GPU code (#4810 ) * use subspan in gpu predictor instead of copying * Revise `HostDeviceVector`	2019-08-30 05:27:23 -04:00
Rong Ou	38ab79f889	Make HostDeviceVector single gpu only (#4773 ) * Make HostDeviceVector single gpu only	2019-08-26 09:51:13 +12:00
Jiaming Yuan	9700776597	Cudf support. (#4745 ) * Initial support for cudf integration. * Add two C APIs for consuming data and metainfo. * Add CopyFrom for SimpleCSRSource as a generic function to consume the data. * Add FromDeviceColumnar for consuming device data. * Add new MetaInfo::SetInfo for consuming label, weight etc.	2019-08-19 16:51:40 +12:00
Jiaming Yuan	c358d95c44	Remove initializing stringstream reference. (#4788 )	2019-08-18 09:59:47 -04:00
Jiaming Yuan	c81238b5c4	Clean up after removing `gpu_exact`. (#4777 ) * Removed unused functions. * Removed unused parameters. * Move ValueConstraints into constraints.cuh since it's now only used in GPU_Hist.	2019-08-17 01:05:57 -04:00
Xu Xiao	ef9af33a00	[HOTFIX] distributed training with hist method (#4716 ) * add parallel test for hist.EvalualiteSplit * update test_openmp.py * update test_openmp.py * update test_openmp.py * update test_openmp.py * update test_openmp.py * fix OMP schedule policy * fix clang-tidy * add logging: total_num_bins * fix * fix * test * replace guided OPENMP policy with static in updater_quantile_hist.cc	2019-08-13 11:27:29 -07:00
Rong Ou	602484e19f	Remove some unused functions as reported by cppcheck (#4743 )	2019-08-07 02:42:33 -04:00
Rong Ou	851b5b3808	Remove gpu_exact tree method (#4742 )	2019-08-07 11:43:20 +12:00
Jiaming Yuan	9c469b3844	Move bitfield into common. (#4737 ) * Prepare for columnar format support.	2019-08-06 02:49:32 -04:00
Rong Ou	6edddd7966	Refactor DMatrix to return batches of different page types (#4686 ) * Use explicit template parameter for specifying page type.	2019-08-03 15:10:34 -04:00
Jiaming Yuan	f0064c07ab	Refactor configuration [Part II]. (#4577 ) * Refactor configuration [Part II]. * General changes: Remove `Init` methods to avoid ambiguity. Remove `Configure(std::map<>)` to avoid redundant copying and prepare for parameter validation. (`std::vector` is returned from `InitAllowUnknown`). ** Add name to tree updaters for easier debugging. * Learner changes: Make `LearnerImpl` the only source of configuration. All configurations are stored and carried out by `LearnerImpl::Configure()`. Remove booster in C API. Originally kept for "compatibility reason", but did not state why. So here we just remove it. Add a `metric_names_` field in `LearnerImpl`. Remove `LazyInit`. Configuration will always be lazy. ** Run `Configure` before every iteration. * Predictor changes: Allocate both cpu and gpu predictor. Remove cpu_predictor from gpu_predictor. `GBTree` is now used to dispatch the predictor. ** Remove some GPU Predictor tests. * IO No IO changes. The binary model format stability is tested by comparing hashing value of save models between two commits	2019-07-20 08:34:56 -04:00
sriramch	7a388cbf8b	Modify caching allocator/vector and fix issues relating to inability to train large datasets (#4615 )	2019-07-09 18:33:27 +12:00
Jiaming Yuan	d9a47794a5	Fix CPU hist init for sparse dataset. (#4625 ) * Fix CPU hist init for sparse dataset. * Implement sparse histogram cut. * Allow empty features. * Fix windows build, don't use sparse in distributed environment. * Comments. * Smaller threshold. * Fix windows omp. * Fix msvc lambda capture. * Fix MSVC macro. * Fix MSVC initialization list. * Fix MSVC initialization list x2. * Preserve categorical feature behavior. * Rename matrix to sparse cuts. * Reuse UseGroup. * Check for categorical data when adding cut. Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu> * Sanity check. * Fix comments. * Fix comment.	2019-07-04 16:27:03 -07:00
Egor Smirnov	4d6590be3c	Optimize ‘hist’ for multi-core CPU (#4529 ) * Initial performance optimizations for xgboost * remove includes * revert float->double * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * Check existence of _mm_prefetch and __builtin_prefetch * Fix lint * optimizations for CPU * appling comments in review * add some comments, code refactoring * fixing issues in CI * adding runtime checks * remove 1 extra check * remove extra checks in BuildHist * remove checks * add debug info * added debug info * revert changes * added comments * Apply suggestions from code review Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu> * apply review comments * Remove unused function CreateNewNodes() * Add descriptive comment on node_idx variable in QuantileHistMaker::Builder::BuildHistsBatch()	2019-06-27 11:33:49 -07:00
Jiaming Yuan	8bdf15120a	Implement tree model dump with code generator. (#4602 ) * Implement tree model dump with a code generator. * Split up generators. * Implement graphviz generator. * Use pattern matching. * [Breaking] Return a Source in `to_graphviz` instead of Digraph in Python package. Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2019-06-26 15:20:44 +08:00
Jiaming Yuan	fdf27a5b82	Fix race condition in interaction constraint. (#4587 ) * Split up the kernel to sync write. * QueryNode is no-longer used in Query, but kept for testing.	2019-06-21 02:47:48 +08:00
Rory Mitchell	221e163185	Refactor out row partitioning logic from gpu_hist, introduce caching device vectors (#4554 )	2019-06-20 18:24:09 +12:00
Jiaming Yuan	ae05948e32	Feature interaction for GPU Hist. (#4534 ) * GPU hist Interaction Constraints. * Duplicate related parameters. * Add tests for CPU interaction constraint. * Add better error reporting. * Thorough tests.	2019-06-19 18:11:02 +08:00
sriramch	6757654337	Optimizations for quantisation on device (#4572 ) * - do not create device vectors for the entire sparse page while computing histograms... - while creating the compressed histogram indices, the row vector is created for the entire sparse page batch. this is needless as we only process chunks at a time based on a slice of the total gpu memory - this pr will allocate only as much as required to store the ppropriate row indices and the entries * - do not dereference row_ptrs once the device_vector has been created to elide host copies of those counts - instead, grab the entry counts directly from the sparsepage	2019-06-19 10:50:25 +12:00
Jiaming Yuan	c5719cc457	Offload some configurations into GBM. (#4553 ) This is part 1 of refactoring configuration. * Move tree heuristic configurations. * Split up declarations and definitions for GBTree. * Implement UseGPU in gbm.	2019-06-14 09:18:51 +08:00
sriramch	a2042b685a	- training with external memory - part 2 of 2 (#4526 ) * - training with external memory - part 2 of 2 - when external memory support is enabled, building of histogram indices are done incrementally for every sparse page - the entire set of input data is divided across multiple gpu's and the relative row positions within each device is tracked when building the compressed histogram buffer - this was tested using a mortgage dataset containing ~ 670m rows before 4xt4's could be saturated	2019-06-12 09:52:56 +12:00
Rory Mitchell	9683fd433e	Overload device memory allocation (#4532 ) * Group source files, include headers in source files * Overload device memory allocation	2019-06-10 11:35:13 +12:00
Rory Mitchell	23a10c8339	Refactor histogram building code for gpu_hist (#4528 )	2019-06-03 09:50:10 +12:00
Rory Mitchell	399fabed49	Deprecate gpu_exact, bump required cuda version in docs (#4527 )	2019-06-03 09:49:05 +12:00
Rory Mitchell	fbbae3386a	Smarter choice of histogram construction for distributed gpu_hist (#4519 ) * Smarter choice of histogram construction for distributed gpu_hist * Limit omp team size in ExecuteShards	2019-05-31 14:11:34 +12:00
sriramch	fed665ae8a	- training with external memory part 1 of 2 (#4486 ) * - training with external memory part 1 of 2 - this pr focuses on computing the quantiles using multiple gpus on a dataset that uses the external cache capabilities - there will a follow-up pr soon after this that will support creation of histogram indices on large dataset as well - both of these changes are required to support training with external memory - the sparse pages in dmatrix are taken in batches and the the cut matrices are incrementally built - also snuck in some (perf) changes related to sketches aggregation amongst multiple features across multiple sparse page batches. instead of aggregating the summary inside each device and merged later, it is aggregated in-place when the device is working on different rows but the same feature	2019-05-30 08:18:34 +12:00
Jiaming Yuan	c589eff941	De-duplicate GPU parameters. (#4454 ) * Only define `gpu_id` and `n_gpus` in `LearnerTrainParam` * Pass LearnerTrainParam through XGBoost vid factory method. * Disable all GPU usage when GPU related parameters are not specified (fixes XGBoost choosing GPU over aggressively). * Test learner train param io. * Fix gpu pickling.	2019-05-29 11:55:57 +08:00
Philip Hyunsu Cho	17df5fd296	Simplify bound checking in feature interaction constraints (#4428 )	2019-05-01 16:59:53 -07:00
Xu Xiao	4c74336384	Use feature interaction constraints to narrow search space for split candidates (#4341 ) * Use feature interaction constraints to narrow search space for split candidates. * fix clang-tidy broken at updater_quantile_hist.cc:535:3 * make const * fix * try to fix exception thrown in java_test * fix suspected mistake which cause EvaluateSplit error * try fix * Fix bug: feature ID and node ID swapped in argument * Rename CheckValidation() to CheckFeatureConstraint() for clarity * Do not create temporary vector validFeatures, to enable parallelism	2019-04-30 20:59:58 -07:00
Philip Hyunsu Cho	ba98e0cdf2	Add additional Python tests to test training under constraints (#4426 )	2019-04-30 18:23:39 -07:00
Rong Ou	eaab364a63	More explict sharding methods for device memory (#4396 ) * Rename the Reshard method to Shard * Add a new Reshard method for sharding a vector that's already sharded	2019-05-01 11:47:22 +12:00
Rory Mitchell	5e582b0fa7	Combine thread launches into single launch per tree for gpu_hist (#4343 ) * Combine thread launches into single launch per tree for gpu_hist algorithm. * Address deprecation warning * Add manual column sampler constructor * Turn off omp dynamic to get a guaranteed number of threads * Enable openmp in cuda code	2019-04-29 09:58:34 +12:00
Egor Smirnov	711397d645	Optimizations of pre-processing for 'hist' tree method (#4310 ) * oprimizations for pre-processing * code cleaning * code cleaning * code cleaning after review * Apply suggestions from code review Co-Authored-By: SmirnovEgorRu <egor.smirnov@intel.com>	2019-04-16 17:36:19 -07:00
Jiaming Yuan	207f058711	Refactor CMake scripts. (#4323 ) * Refactor CMake scripts. * Remove CMake CUDA wrapper. * Bump CMake version for CUDA. * Use CMake to handle Doxygen. * Split up CMakeList. * Export install target. * Use modern CMake. * Remove build.sh * Workaround for gpu_hist test. * Use cmake 3.12. * Revert machine.conf. * Move CLI test to gpu. * Small cleanup. * Support using XGBoost as submodule. * Fix windows * Fix cpp tests on Windows * Remove duplicated find_package.	2019-04-15 10:08:12 -07:00

1 2 3 4 5 ...

353 Commits