79 Commits

Author SHA1 Message Date
Jiaming Yuan
ee287808fb
Lazy initialization of device vector. (#5173)
* Lazy initialization of device vector.

* Fix #5162.

* Disable copy constructor of HostDeviceVector.  Prevents implicit copying.

* Fix CPU build.

* Bring back move assignment operator.
2020-01-07 11:23:05 +08:00
Jiaming Yuan
ebc86a3afa
Disable parameter validation for Scikit-Learn interface. (#5167)
* Disable parameter validation for now.

Scikit-Learn passes all parameters down to XGBoost, whether they are used or
not.

* Add option `validate_parameters`.
2020-01-07 11:17:31 +08:00
Egor Smirnov
7b17e76c5b Optimized EvaluateSplut function (#5138)
* Add block based threading utilities.
2019-12-31 18:18:42 +08:00
sriramch
ee81ba8e1f implementation of map ranking algorithm on gpu (#5129)
* - implementation of map ranking algorithm
  - also effected necessary suggestions mentioned in the earlier ranking pr's
  - made some performance improvements to the ndcg algo as well
2019-12-27 12:05:37 +13:00
Jiaming Yuan
e089e16e3d
Pass pointer to model parameters. (#5101)
* Pass pointer to model parameters.

This PR de-duplicates most of the model parameters except the one in
`tree_model.h`.  One difficulty is `base_score` is a model property but can be
changed at runtime by objective function.  Hence when performing model IO, we
need to save the one provided by users, instead of the one transformed by
objective.  Here we created an immutable version of `LearnerModelParam` that
represents the value of model parameter after configuration.
2019-12-10 12:11:22 +08:00
Rory Mitchell
979f74d51a
Group builder modified for incremental building (#5098) 2019-12-10 14:33:56 +13:00
Jiaming Yuan
2dcb62ddfb
Add IO utilities. (#5091)
* Add fixed size stream for reading model stream.
* Add file extension.
2019-12-05 22:15:34 +08:00
Jiaming Yuan
f0ca53d9ec
Convenient methods for JSON integer. (#5089)
* Fix parsing empty object.
2019-12-05 11:01:12 +08:00
Rory Mitchell
e3c34c79be
External data adapters (#5044)
* Use external data adapters as lightweight intermediate layer between external data and DMatrix
2019-12-04 10:56:17 +13:00
Jiaming Yuan
97abcc7ee2
Extract interaction constraint from split evaluator. (#5034)
*  Extract interaction constraints from split evaluator.

The reason for doing so is mostly for model IO, where num_feature and interaction_constraints are copied in split evaluator. Also interaction constraint by itself is a feature selector, acting like column sampler and it's inefficient to bury it deep in the evaluator chain. Lastly removing one another copied parameter is a win.

*  Enable inc for approx tree method.

As now the implementation is spited up from evaluator class, it's also enabled for approx method.

*  Removing obsoleted code in colmaker.

They are never documented nor actually used in real world. Also there isn't a single test for those code blocks.

*  Unifying the types used for row and column.

As the size of input dataset is marching to billion, incorrect use of int is subject to overflow, also singed integer overflow is undefined behaviour. This PR starts the procedure for unifying used index type to unsigned integers. There's optimization that can utilize this undefined behaviour, but after some testings I don't see the optimization is beneficial to XGBoost.
2019-11-14 20:11:41 +08:00
Jiaming Yuan
ac457c56a2
Use `UpdateAllowUnknown' for non-model related parameter. (#4961)
* Use `UpdateAllowUnknown' for non-model related parameter.

Model parameter can not pack an additional boolean value due to binary IO
format.  This commit deals only with non-model related parameter configuration.

* Add tidy command line arg for use-dmlc-gtest.
2019-10-23 05:50:12 -04:00
Jiaming Yuan
f24be2efb4 Use configure_file() to configure version only (#4974)
* Avoid writing build_config.h

* Remove build_config.h all together.

* Lint.
2019-10-22 23:47:00 -07:00
Jiaming Yuan
5620322a48
[Breaking] Add global versioning. (#4936)
* Use CMake config file for representing version.

* Generate c and Python version file with CMake.

The generated file is written into source tree.  But unless XGBoost upgrades
its version, there will be no actual modification.  This retains compatibility
with Makefiles for R.

* Add XGBoost version the DMatrix binaries.
* Simplify prefetch detection in CMakeLists.txt
2019-10-22 23:27:26 -04:00
Jiaming Yuan
4771bb0d41
Catch exception in transform function omp context. (#4960) 2019-10-21 17:03:38 +08:00
Jiaming Yuan
31030a8d3a
Set correct file permission. (#4964) 2019-10-18 12:54:29 -04:00
Jiaming Yuan
ae536756ae
Add Model and Configurable interface. (#4945)
* Apply Configurable to objective functions.
* Apply Model to Learner and Regtree, gbm.
* Add Load/SaveConfig to objs.
* Refactor obj tests to use smart pointer.
* Dummy methods for Save/Load Model.
2019-10-18 01:56:02 -04:00
Rory Mitchell
60748b2071
Use heuristic to select histogram node, avoid rabit call (#4951) 2019-10-18 11:33:54 +13:00
Jiaming Yuan
b61d534472
Span: use size_t' for index_type, add front' and `back'. (#4935)
* Use `size_t' for index_type.  Add `front' and `back'.

* Remove a batch of `static_cast'.
2019-10-14 09:13:33 -04:00
Jiaming Yuan
3d46bd0fa5
Ignore columnar alignment requirement. (#4928)
* Better error message for wrong type.
* Fix stride size.
2019-10-13 06:41:43 -04:00
Oleksandr Pryimak
80977182c5 Use bundled gtest (#4900)
* Suggest to use gtest bundled with dmlc

* Use dmlc bundled gtest in all CI scripts

* Make clang-tidy to use dmlc embedded gtest
2019-10-09 16:26:19 -07:00
Jiaming Yuan
095de3bf5f
Export c++ headers in CMake installation. (#4897)
* Move get transpose into cc.

* Clean up headers in host device vector, remove thrust dependency.

* Move span and host device vector into public.

* Install c++ headers.

* Short notes for c and c++.

Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2019-10-06 23:53:09 -04:00
Jiaming Yuan
57106a3459
Fix parsing empty json object. (#4868)
* Fix parsing empty json object.

* Better error message.
2019-09-18 03:31:46 -04:00
Rong Ou
125bcec62e Move ellpack page construction into DMatrix (#4833) 2019-09-16 23:50:55 -04:00
Rong Ou
733ed24dd9 further cleanup of single process multi-GPU code (#4810)
* use subspan in gpu predictor instead of copying
* Revise `HostDeviceVector`
2019-08-30 05:27:23 -04:00
Rong Ou
38ab79f889 Make HostDeviceVector single gpu only (#4773)
* Make HostDeviceVector single gpu only
2019-08-26 09:51:13 +12:00
Jiaming Yuan
9700776597 Cudf support. (#4745)
* Initial support for cudf integration.

* Add two C APIs for consuming data and metainfo.

* Add CopyFrom for SimpleCSRSource as a generic function to consume the data.

* Add FromDeviceColumnar for consuming device data.

* Add new MetaInfo::SetInfo for consuming label, weight etc.
2019-08-19 16:51:40 +12:00
Rong Ou
c5b229632d [BREAKING] prevent multi-gpu usage (#4749)
* prevent multi-gpu usage

* fix distributed test

* combine gpu predictor tests

* set upper bound on n_gpus
2019-08-13 09:11:35 +12:00
Jiaming Yuan
2a4df8e29f
Add Json integer, remove specialization. (#4739) 2019-08-06 03:10:49 -04:00
Jiaming Yuan
9c469b3844
Move bitfield into common. (#4737)
* Prepare for columnar format support.
2019-08-06 02:49:32 -04:00
Rong Ou
6edddd7966 Refactor DMatrix to return batches of different page types (#4686)
* Use explicit template parameter for specifying page type.
2019-08-03 15:10:34 -04:00
Jiaming Yuan
d2e1e4d5b4
A simple Json implementation for future use. (#4708)
* A simple Json implementation for future use.
2019-07-29 21:17:27 -04:00
Jiaming Yuan
f0064c07ab
Refactor configuration [Part II]. (#4577)
* Refactor configuration [Part II].

* General changes:
** Remove `Init` methods to avoid ambiguity.
** Remove `Configure(std::map<>)` to avoid redundant copying and prepare for
   parameter validation. (`std::vector` is returned from `InitAllowUnknown`).
** Add name to tree updaters for easier debugging.

* Learner changes:
** Make `LearnerImpl` the only source of configuration.

    All configurations are stored and carried out by `LearnerImpl::Configure()`.

** Remove booster in C API.

    Originally kept for "compatibility reason", but did not state why.  So here
    we just remove it.

** Add a `metric_names_` field in `LearnerImpl`.
** Remove `LazyInit`.  Configuration will always be lazy.
** Run `Configure` before every iteration.

* Predictor changes:
** Allocate both cpu and gpu predictor.
** Remove cpu_predictor from gpu_predictor.

    `GBTree` is now used to dispatch the predictor.

** Remove some GPU Predictor tests.

* IO

No IO changes.  The binary model format stability is tested by comparing
hashing value of save models between two commits
2019-07-20 08:34:56 -04:00
Jiaming Yuan
d9a47794a5 Fix CPU hist init for sparse dataset. (#4625)
* Fix CPU hist init for sparse dataset.

* Implement sparse histogram cut.
* Allow empty features.

* Fix windows build, don't use sparse in distributed environment.

* Comments.

* Smaller threshold.

* Fix windows omp.

* Fix msvc lambda capture.

* Fix MSVC macro.

* Fix MSVC initialization list.

* Fix MSVC initialization list x2.

* Preserve categorical feature behavior.

* Rename matrix to sparse cuts.
* Reuse UseGroup.
* Check for categorical data when adding cut.

Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>

* Sanity check.

* Fix comments.

* Fix comment.
2019-07-04 16:27:03 -07:00
Jiaming Yuan
45876bf41b
Fix external memory for get column batches. (#4622)
* Fix external memory for get column batches.

This fixes two bugs:

* Use PushCSC for get column batches.
* Don't remove the created temporary directory before finishing test.

* Check all pages.
2019-06-30 09:56:49 +08:00
Rory Mitchell
221e163185
Refactor out row partitioning logic from gpu_hist, introduce caching device vectors (#4554) 2019-06-20 18:24:09 +12:00
sriramch
90f683b25b Set the appropriate device before freeing device memory... (#4566)
* - set the appropriate device before freeing device memory...
   - pr #4532 added a global memory tracker/logger to keep track of number of (de)allocations
     and peak memory usage on a per device basis.
   - this pr adds the appropriate check to make sure that the (de)allocation counts and memory usages
     makes sense for the device. since verbosity is typically increased on debug/non-retail builds.  
* - pre-create cub allocators and reuse them
   - create them once and not resize them dynamically. we need to ensure that these allocators
     are created and destroyed exactly once so that the appropriate device id's are set
2019-06-18 14:58:05 +12:00
Philip Hyunsu Cho
3f2fe25a32 Fix C++11 config parser (#4521)
* Fix C++11 config parser
* Use raw strings to improve readability of regex
* Fix compilation for GCC 5.x

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2019-06-03 22:18:16 +08:00
Rory Mitchell
fbbae3386a
Smarter choice of histogram construction for distributed gpu_hist (#4519)
* Smarter choice of histogram construction for distributed gpu_hist

* Limit omp team size in ExecuteShards
2019-05-31 14:11:34 +12:00
sriramch
fed665ae8a - training with external memory part 1 of 2 (#4486)
* - training with external memory part 1 of 2
   - this pr focuses on computing the quantiles using multiple gpus on a
     dataset that uses the external cache capabilities
   - there will a follow-up pr soon after this that will support creation
     of histogram indices on large dataset as well
   - both of these changes are required to support training with external memory
   - the sparse pages in dmatrix are taken in batches and the the cut matrices
     are incrementally built
   - also snuck in some (perf) changes related to sketches aggregation amongst multiple
     features across multiple sparse page batches. instead of aggregating the summary
     inside each device and merged later, it is aggregated in-place when the device
     is working on different rows but the same feature
2019-05-30 08:18:34 +12:00
Jiaming Yuan
c589eff941
De-duplicate GPU parameters. (#4454)
* Only define `gpu_id` and `n_gpus` in `LearnerTrainParam`
* Pass LearnerTrainParam through XGBoost vid factory method.
* Disable all GPU usage when GPU related parameters are not specified (fixes XGBoost choosing GPU over aggressively).
* Test learner train param io.
* Fix gpu pickling.
2019-05-29 11:55:57 +08:00
sriramch
a3fedbeaa8 - fix issues with training with external memory on cpu (#4487)
* - fix issues with training with external memory on cpu
   - use the batch size to determine the correct number of rows in a batch
   - use the right number of threads in omp parallalization if the batch size
     is less than the default omp max threads (applicable for the last batch)

* - handle scenarios where last batch size is < available number of threads
- augment tests such that we can test all scenarios (batch size <, >, = number of threads)
2019-05-29 12:31:30 +12:00
Rong Ou
eaab364a63 More explict sharding methods for device memory (#4396)
* Rename the Reshard method to Shard

* Add a new Reshard method for sharding a vector that's already sharded
2019-05-01 11:47:22 +12:00
Rory Mitchell
5e582b0fa7
Combine thread launches into single launch per tree for gpu_hist (#4343)
* Combine thread launches into single launch per tree for gpu_hist
algorithm.

* Address deprecation warning

* Add manual column sampler constructor

* Turn off omp dynamic to get a guaranteed number of threads

* Enable openmp in cuda code
2019-04-29 09:58:34 +12:00
Jiaming Yuan
207f058711 Refactor CMake scripts. (#4323)
* Refactor CMake scripts.

* Remove CMake CUDA wrapper.
* Bump CMake version for CUDA.
* Use CMake to handle Doxygen.
* Split up CMakeList.
* Export install target.
* Use modern CMake.
* Remove build.sh
* Workaround for gpu_hist test.
* Use cmake 3.12.

* Revert machine.conf.

* Move CLI test to gpu.

* Small cleanup.

* Support using XGBoost as submodule.

* Fix windows

* Fix cpp tests on Windows

* Remove duplicated find_package.
2019-04-15 10:08:12 -07:00
Rory Mitchell
3f312e30db
Retire DVec class in favour of c++20 style span for device memory. (#4293) 2019-03-28 13:59:58 +13:00
Rory Mitchell
00465d243d
Optimisations for gpu_hist. (#4248)
* Optimisations for gpu_hist.

* Use streams to overlap operations.

* ColumnSampler now uses HostDeviceVector to prevent repeatedly copying feature vectors to the device.
2019-03-20 13:30:06 +13:00
Jiaming Yuan
7b9043cf71
Fix clang-tidy warnings. (#4149)
* Upgrade gtest for clang-tidy.
* Use CMake to install GTest instead of mv.
* Don't enforce clang-tidy to return 0 due to errors in thrust.
* Add a small test for tidy itself.

* Reformat.
2019-03-13 02:25:51 +08:00
Jiaming Yuan
1fe874e58a
Fix empty subspan. (#4151)
* Silent the death tests.
2019-02-17 04:48:03 +08:00
Jiaming Yuan
754fe8142b
Make `HistCutMatrix::Init' be aware of groups. (#4115)
* Add checks for group size.
* Simple docs.
* Search group index during hist cut matrix initialization.

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2019-02-16 04:39:41 +08:00
Andy Adinets
0f8af85f64 Fixed single-GPU tests. (#4053)
- ./testxgboost (without filters) failed if run on a multi-GPU machine because
  the memory was allocated on the current device, but device 0
  was always passed into LaunchN
2019-01-11 09:33:15 +02:00