xgboost

Author	SHA1	Message	Date
fuhaoda	dd60fc23e6	Simplify INI-style config reader using C++11 STL (#4478 ) * simplify the config.h file * revise config.h * revised config.h * revise format * revise format issues * revise whitespace issues * revise whitespace namespace format issues * revise namespace format issues * format issues * format issues * format issues * format issues * Revert submodule changes * minor change * Update src/common/config.h Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu> * address format issue from trivialfis * Use correct cub submodule	2019-05-30 11:57:56 -07:00
Jiaming Yuan	b48f895027	Fix prediction from loaded pickle. (#4516 )	2019-05-30 15:05:09 +08:00
sriramch	fed665ae8a	- training with external memory part 1 of 2 (#4486 ) * - training with external memory part 1 of 2 - this pr focuses on computing the quantiles using multiple gpus on a dataset that uses the external cache capabilities - there will be a follow-up pr soon after this that will support creation of histogram indices on large dataset as well - both of these changes are required to support training with external memory - the sparse pages in dmatrix are taken in batches and the cut matrices are incrementally built - also snuck in some (perf) changes related to sketches aggregation amongst multiple features across multiple sparse page batches. instead of aggregating the summary inside each device and merged later, it is aggregated in-place when the device is working on different rows but the same feature	2019-05-30 08:18:34 +12:00
sriramch	6e16900711	Fix crash with approx tree method on cpu (#4510 )	2019-05-30 01:11:29 +08:00
Jiaming Yuan	c589eff941	De-duplicate GPU parameters. (#4454 ) * Only define `gpu_id` and `n_gpus` in `LearnerTrainParam` * Pass LearnerTrainParam through XGBoost via factory method. * Disable all GPU usage when GPU related parameters are not specified (fixes XGBoost choosing GPU over aggressively). * Test learner train param io. * Fix gpu pickling.	2019-05-29 11:55:57 +08:00
sriramch	a3fedbeaa8	- fix issues with training with external memory on cpu (#4487 ) * - fix issues with training with external memory on cpu - use the batch size to determine the correct number of rows in a batch - use the right number of threads in omp parallelization if the batch size is less than the default omp max threads (applicable for the last batch) * - handle scenarios where last batch size is < available number of threads - augment tests such that we can test all scenarios (batch size <, >, = number of threads)	2019-05-29 12:31:30 +12:00
Jiaming Yuan	55e645c5f5	Revert hist init optimization. (#4502 )	2019-05-26 08:57:41 +08:00
Bryan Woods	278562db13	Add support for cross-validation using query ID (#4474 ) * adding support for matrix slicing with query ID for cross-validation * hail mary test of unrar installation for windows tests * trying to modify tests to run in Github CI * Remove dependency on wget and unrar * Save error log from R test * Relax assertion in test_training * Use int instead of bool in C function interface * Revise R interface * Add XGDMatrixSliceDMatrixEx and keep old XGDMatrixSliceDMatrix for API compatibility	2019-05-23 10:45:02 -07:00
Rong Ou	a9ec2dd295	only copy the model once when predicting multiple batches (#4457 )	2019-05-15 11:04:22 +12:00
Rong Ou	df2cdaca50	add cuda 10.1 support (#4468 )	2019-05-14 18:30:58 +00:00
Philip Hyunsu Cho	c6f2a7e186	[CI] Add Windows GPU to Jenkins CI pipeline (#4463 ) * Fix #4462: Use /MT flag consistently for MSVC target * First attempt at Windows CI * Distinguish stages in Linux and Windows pipelines * Try running CMake in Windows pipeline * Add build step	2019-05-14 04:45:06 +00:00
Rong Ou	be0f346ec9	mgpu predictor using explicit offsets (#4438 ) * mgpu prediction using explicit sharding	2019-05-11 09:35:06 +12:00
Jiaming Yuan	5de7e12704	Change obj name to `reg:squarederror` in learner. (#4427 ) * Change memory dump size in R test.	2019-05-06 21:35:35 +08:00
Xin Yin	8d1098a983	In AUC and AUCPR metrics, detect whether weights are per-instance or per-group (#4216 ) * In AUC and AUCPR metrics, detect whether weights are per-instance or per-group * Fix C++ style check * Add a test for weighted AUC	2019-05-04 00:53:04 -07:00
Philip Hyunsu Cho	9252b686ae	Make AUCPR work with multiple query groups (#4436 ) * Make AUCPR work with multiple query groups * Check AUCPR <= 1.0 in distributed setting	2019-05-03 10:34:44 -07:00
ras44	2be85fc62a	max_digits10 guarantees float decimal roundtrip (#4435 ) 2 additional digits are not needed to guarantee that casting the decimal representation will result in the same float, see https://github.com/dmlc/xgboost/issues/3980#issuecomment-458702440	2019-05-02 20:01:26 -07:00
Rong Ou	feb6ae3e18	Initial support for external memory in gpu_predictor (#4284 )	2019-05-03 13:01:27 +12:00
Philip Hyunsu Cho	bfddc2c42c	Make CMakeLists.txt compatible with CMake 3.3 (#4420 ) * Make CMakeLists.txt compatible with CMake 3.3; require CMake 3.11 for MSVC * Use CMake 3.12 when sanitizer is enabled * Disable funroll-loops for MSVC * Use cmake version in container name * Add missing arg * Fix egrep use in ci_build.sh * Display CMake version * Do not set OpenMP_CXX_LIBRARIES for MSVC * Use cmake_minimum_required()	2019-05-02 11:49:32 +08:00
Philip Hyunsu Cho	17df5fd296	Simplify bound checking in feature interaction constraints (#4428 )	2019-05-01 16:59:53 -07:00
Xu Xiao	4c74336384	Use feature interaction constraints to narrow search space for split candidates (#4341 ) * Use feature interaction constraints to narrow search space for split candidates. * fix clang-tidy broken at updater_quantile_hist.cc:535:3 * make const * fix * try to fix exception thrown in java_test * fix suspected mistake which cause EvaluateSplit error * try fix * Fix bug: feature ID and node ID swapped in argument * Rename CheckValidation() to CheckFeatureConstraint() for clarity * Do not create temporary vector validFeatures, to enable parallelism	2019-04-30 20:59:58 -07:00
Philip Hyunsu Cho	ba98e0cdf2	Add additional Python tests to test training under constraints (#4426 )	2019-04-30 18:23:39 -07:00
Rong Ou	eaab364a63	More explicit sharding methods for device memory (#4396 ) * Rename the Reshard method to Shard * Add a new Reshard method for sharding a vector that's already sharded	2019-05-01 11:47:22 +12:00
Rory Mitchell	5e582b0fa7	Combine thread launches into single launch per tree for gpu_hist (#4343 ) * Combine thread launches into single launch per tree for gpu_hist algorithm. * Address deprecation warning * Add manual column sampler constructor * Turn off omp dynamic to get a guaranteed number of threads * Enable openmp in cuda code	2019-04-29 09:58:34 +12:00
Egor Smirnov	711397d645	Optimizations of pre-processing for 'hist' tree method (#4310 ) * optimizations for pre-processing * code cleaning * code cleaning * code cleaning after review * Apply suggestions from code review Co-Authored-By: SmirnovEgorRu <egor.smirnov@intel.com>	2019-04-16 17:36:19 -07:00
Jiaming Yuan	207f058711	Refactor CMake scripts. (#4323 ) * Refactor CMake scripts. * Remove CMake CUDA wrapper. * Bump CMake version for CUDA. * Use CMake to handle Doxygen. * Split up CMakeList. * Export install target. * Use modern CMake. * Remove build.sh * Workaround for gpu_hist test. * Use cmake 3.12. * Revert machine.conf. * Move CLI test to gpu. * Small cleanup. * Support using XGBoost as submodule. * Fix windows * Fix cpp tests on Windows * Remove duplicated find_package.	2019-04-15 10:08:12 -07:00
Jiaming Yuan	84d992babc	GPU multiclass metrics (#4368 ) * Port multi classes metrics to CUDA.	2019-04-15 17:47:47 +08:00
Jiaming Yuan	5c2575535f	Fix Histogram allocation. (#4347 ) * Fix Histogram allocation. nidx_map is cleared after `Reset`, but histogram data size isn't changed, hence histogram recycling is used in later iterations. After a reset (building a new tree), a newly allocated node will start from 0, while recycling always chooses the node with the smallest index, which happens to be our newly allocated node 0.	2019-04-10 19:21:26 +08:00
Rong Ou	81c1cd40ca	add a test for cpu predictor using external memory (#4308 ) * add a test for cpu predictor using external memory * allow different page size for testing	2019-04-10 13:25:10 +12:00
sriramch	2f7087eba1	Improve HostDeviceVector exception safety (#4301 ) * make the assignments of HostDeviceVector exception safe. * storing a dummy GPUDistribution instance in HDV for CPU based code. * change testxgboost binary location to build directory.	2019-03-31 22:48:58 +08:00
Hajime Morrita	680a1b36f3	Get rid of a few trivial compiler warnings. (#4312 )	2019-03-31 00:02:29 +08:00
Rory Mitchell	3f312e30db	Retire DVec class in favour of c++20 style span for device memory. (#4293 )	2019-03-28 13:59:58 +13:00
Rory Mitchell	6d5b34d824	Further optimisations for gpu_hist. (#4283 ) - Fuse final update position functions into a single more efficient kernel - Refactor gpu_hist with a more explicit ellpack matrix representation	2019-03-24 17:17:22 +13:00
Rong Ou	5aa42b5f11	jenkins build for cuda 10.0 (#4281 ) * jenkins build for cuda 10.0 * yum install nccl2 for cuda 10.0	2019-03-22 22:35:18 -07:00
Rory Mitchell	8eab966998	Allow unique prediction vector for each input matrix (#4275 )	2019-03-21 11:38:16 +13:00
Jiaming Yuan	09bd9e68cf	Use Monitor in quantile hist. (#4273 )	2019-03-20 09:26:22 +08:00
Rory Mitchell	00465d243d	Optimisations for gpu_hist. (#4248 ) * Optimisations for gpu_hist. * Use streams to overlap operations. * ColumnSampler now uses HostDeviceVector to prevent repeatedly copying feature vectors to the device.	2019-03-20 13:30:06 +13:00
Jiaming Yuan	29a1356669	Deprecate `reg:linear' in favor of `reg:squarederror'. (#4267 ) * Deprecate `reg:linear' in favor of `reg:squarederror'. * Replace the use of `reg:linear'. * Replace the use of `silent`.	2019-03-17 17:55:04 +08:00
Jiaming Yuan	cf8d5b9b76	Mark CUDA 10.1 as unsupported. (#4265 )	2019-03-17 16:59:15 +08:00
Jiaming Yuan	fdcae024e7	Remove deprecated C APIs. (#4266 )	2019-03-17 16:42:44 +08:00
Rory Mitchell	5465b73e7c	Fix multi-GPU test failures (#4259 )	2019-03-15 14:40:43 +13:00
Andy Adinets	b833b642ec	Improved multi-node multi-GPU random forests. (#4238 ) * Improved multi-node multi-GPU random forests. - removed rabit::Broadcast() from each invocation of column sampling - instead, syncing the PRNG seed when a ColumnSampler() object is constructed - this makes non-trivial column sampling significantly faster in the distributed case - refactored distributed GPU tests - added distributed random forests tests	2019-03-13 12:36:28 +13:00
Jiaming Yuan	7b9043cf71	Fix clang-tidy warnings. (#4149 ) * Upgrade gtest for clang-tidy. * Use CMake to install GTest instead of mv. * Don't enforce clang-tidy to return 0 due to errors in thrust. * Add a small test for tidy itself. * Reformat.	2019-03-13 02:25:51 +08:00
Tong He	259fb809e9	fix R-devel errors (#4251 )	2019-03-12 10:06:54 -07:00
Rory Mitchell	4eeeded7d1	Remove various synchronisations from cuda API calls, instrument monitor (#4205 ) * Remove various synchronisations from cuda API calls, instrument monitor with nvtx profiler ranges.	2019-03-10 15:01:23 +13:00
Philip Hyunsu Cho	f83e62dca5	Address #4042 : Prevent out-of-range access in column matrix (#4231 )	2019-03-08 17:11:42 -08:00
Rong Ou	9837b09b20	support cuda 10.1 (#4223 ) * support cuda 10.1 * add cuda 10.1 to jenkins build matrix	2019-03-08 12:22:12 +13:00
Rong Ou	0944360416	minor fix: log InitDataOnce() only when it is actually called (#4206 )	2019-03-08 10:53:09 +13:00
Matthew Jones	92b7577c62	[REVIEW] Enable Multi-Node Multi-GPU functionality (#4095 ) * Initial commit to support multi-node multi-gpu xgboost using dask * Fixed NCCL initialization by not ignoring the opg parameter. - it now crashes on NCCL initialization, but at least we're attempting it properly * At the root node, perform a rabit::Allreduce to get initial sum_gradient across workers * Synchronizing in a couple of more places. - now the workers don't go down, but just hang - no more "wild" values of gradients - probably needs syncing in more places * Added another missing max-allreduce operation inside BuildHistLeftRight * Removed unnecessary collective operations. * Simplified rabit::Allreduce() sync of gradient sums. * Removed unnecessary rabit syncs around ncclAllReduce. - this improves performance _significantly_ (7x faster for overall training, 20x faster for xgboost proper) * pulling in latest xgboost * removing changes to updater_quantile_hist.cc * changing use_nccl_opg initialization, removing unnecessary if statements * added definition for opaque ncclUniqueId struct to properly encapsulate GetUniqueId * placing struct definition in guard to avoid duplicate code errors * addressing linting errors * removing * removing additional arguments to AllReducer initialization * removing distributed flag * making comm init symmetric * removing distributed flag * changing ncclCommInit to support multiple modalities * fix indenting * updating ncclCommInitRank block with necessary group calls * fix indenting * adding print statement, and updating accessor in vector * improving print statement to end-line * generalizing nccl_rank construction using rabit * assume device_ordinals is the same for every node * test, assume device_ordinals is identical for all nodes * test, assume device_ordinals is unique for all nodes * changing names of offset variable to be more descriptive, editing indenting * wrapping ncclUniqueId GetUniqueId() and aesthetic changes * adding synchronization, and tests for distributed * adding to tests * fixing broken #endif * fixing initialization of gpu histograms, correcting errors in tests * adding to contributors list * adding distributed tests to jenkins * fixing bad path in distributed test * debugging * adding kubernetes for distributed tests * adding proper import for OrderedDict * adding urllib3==1.22 to address ordered_dict import error * added sleep to allow workers to save their models for comparison * adding name to GPU contributors under docs	2019-03-02 10:03:22 +13:00
Jiaming Yuan	7ea5675679	Add PushCSC for SparsePage. (#4193 ) * Add PushCSC for SparsePage. * Move Push* definitions into cc file. * Add std:: prefix to `size_t` make clang++ happy. * Address monitor count == 0.	2019-03-02 01:58:08 +08:00
Philip Hyunsu Cho	2aaae2e7bb	Fix #4163 : always copy sliced data (#4165 ) * Revert "Accept numpy array view. (#4147)" This reverts commit `a985a99cf0`. * Fix #4163: always copy sliced data * Remove print() from the test; check shape equality * Check if 'base' attribute exists * Fix lint * Address reviewer comment * Fix lint	2019-02-20 14:46:34 -08:00


972 Commits