27 Commits

Author SHA1 Message Date
Jiaming Yuan
095de3bf5f
Export c++ headers in CMake installation. (#4897)
* Move get transpose into cc.

* Clean up headers in host device vector, remove thrust dependency.

* Move span and host device vector into public.

* Install c++ headers.

* Short notes for c and c++.

Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2019-10-06 23:53:09 -04:00
Rong Ou
6edddd7966 Refactor DMatrix to return batches of different page types (#4686)
* Use explicit template parameter for specifying page type.
2019-08-03 15:10:34 -04:00
Jiaming Yuan
45876bf41b
Fix external memory for get column batches. (#4622)
* Fix external memory for get column batches.

This fixes two bugs:

* Use PushCSC for get column batches.
* Don't remove the created temporary directory before finishing test.

* Check all pages.
2019-06-30 09:56:49 +08:00
sriramch
a2042b685a - training with external memory - part 2 of 2 (#4526)
* - training with external memory - part 2 of 2
   - when external memory support is enabled, building of histogram indices are
     done incrementally for every sparse page
   - the entire set of input data is divided across multiple gpu's and the relative
     row positions within each device is tracked when building the compressed histogram buffer
   - this was tested using a mortgage dataset containing ~ 670m rows before 4xt4's could be
     saturated
2019-06-12 09:52:56 +12:00
Rong Ou
feb6ae3e18 Initial support for external memory in gpu_predictor (#4284) 2019-05-03 13:01:27 +12:00
Rong Ou
f4521bf6aa refactor tests to get rid of duplication (#4358)
* refactor tests to get rid of duplication

* address review comments
2019-04-12 00:21:48 -07:00
Jiaming Yuan
7b9043cf71
Fix clang-tidy warnings. (#4149)
* Upgrade gtest for clang-tidy.
* Use CMake to install GTest instead of mv.
* Don't enforce clang-tidy to return 0 due to errors in thrust.
* Add a small test for tidy itself.

* Reformat.
2019-03-13 02:25:51 +08:00
Jiaming Yuan
48dddfd635
Porting elementwise metrics to GPU. (#3952)
* Port elementwise metrics to GPU.

* All elementwise metrics are converted to static polymorphic.
* Create a reducer for metrics reduction.
* Remove const of Metric::Eval to accommodate CubMemory.
2018-12-01 18:46:45 +13:00
Philip Hyunsu Cho
2b045aa805
Make C++ unit tests run and pass on Windows (#3869)
* Make C++ unit tests run and pass on Windows

* Fix logic for external memory. The letter ':' is part of drive letter,
so remove the drive letter before splitting on ':'.
* Cosmetic syntax changes to keep MSVC happy.

* Fix lint

* Add Windows guard
2018-11-06 17:17:24 -08:00
Philip Hyunsu Cho
abf2f661be
Fix #3708: Use dmlc::TemporaryDirectory to handle temporaries in cross-platform way (#3783)
* Fix #3708: Use dmlc::TemporaryDirectory to handle temporaries in cross-platform way

Also install git inside NVIDIA GPU container

* Update dmlc-core
2018-10-18 10:16:04 -07:00
trivialfis
516457fadc Add basic unittests for gpu-hist method. (#3785)
* Split building histogram into separated class.
* Extract `InitCompressedRow` definition.
* Basic tests for gpu-hist.
* Document the code more verbosely.
* Removed `HistCutUnit`.
* Removed some duplicated copies in `GPUHistMaker`.
* Implement LCG and use it in tests.
2018-10-15 15:47:00 +13:00
trivialfis
d594b11f35 Implement transform to reduce CPU/GPU code duplication. (#3643)
* Implement Transform class.
* Add tests for softmax.
* Use Transform in regression, softmax and hinge objectives, except for Cox.
* Mark old gpu objective functions deprecated.
* static_assert for softmax.
* Split up multi-gpu tests.
2018-10-02 15:06:21 +13:00
Andy Adinets
72cd1517d6 Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage. (#3446)
* Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage.

- added distributions to HostDeviceVector
- using HostDeviceVector for labels, weights and base margings in MetaInfo
- using HostDeviceVector for offset and data in SparsePage
- other necessary refactoring

* Added const version of HostDeviceVector API calls.

- const versions added to calls that can trigger data transfers, e.g. DevicePointer()
- updated the code that uses HostDeviceVector
- objective functions now accept const HostDeviceVector<bst_float>& for predictions

* Updated src/linear/updater_gpu_coordinate.cu.

* Added read-only state for HostDeviceVector sync.

- this means no copies are performed if both host and devices access
  the HostDeviceVector read-only

* Fixed linter and test errors.

- updated the lz4 plugin
- added ConstDeviceSpan to HostDeviceVector
- using device % dh::NVisibleDevices() for the physical device number,
  e.g. in calls to cudaSetDevice()

* Fixed explicit template instantiation errors for HostDeviceVector.

- replaced HostDeviceVector<unsigned int> with HostDeviceVector<int>

* Fixed HostDeviceVector tests that require multiple GPUs.

- added a mock set device handler; when set, it is called instead of cudaSetDevice()
2018-08-30 14:28:47 +12:00
trivialfis
cf2d86a4f6 Add travis sanitizers tests. (#3557)
* Add travis sanitizers tests.

* Add gcc-7 in Travis.
* Add SANITIZER_PATH for CMake.
* Enable sanitizer tests in Travis.

* Fix memory leaks in tests.

* Fix all memory leaks reported by Address Sanitizer.
* tests/cpp/helpers.h/CreateDMatrix now returns raw pointer.
2018-08-19 16:40:30 +12:00
ngoyal2707
5cd851ccef added code for instance based weighing for rank objectives (#3379)
* added code for instance based weighing for rank objectives

* Fix lint
2018-06-22 15:10:59 -07:00
Rory Mitchell
a96039141a
Dmatrix refactor stage 1 (#3301)
* Use sparse page as singular CSR matrix representation

* Simplify dmatrix methods

* Reduce statefullness of batch iterators

* BREAKING CHANGE: Remove prob_buffer_row parameter. Users are instead recommended to sample their dataset as a preprocessing step before using XGBoost.
2018-06-07 10:25:58 +12:00
Andy Adinets
286dccb8e8 GPU binning and compression. (#3319)
* GPU binning and compression.

- binning and index compression are done inside the DeviceShard constructor
- in case of a DMatrix with multiple row batches, it is first converted into a single row batch
2018-06-05 17:15:13 +12:00
Rory Mitchell
ccf80703ef
Clang-tidy static analysis (#3222)
* Clang-tidy static analysis

* Modernise checks

* Google coding standard checks

* Identifier renaming according to Google style
2018-04-19 18:57:13 +12:00
Andrew V. Adinetz
d5992dd881 Replaced std::vector-based interfaces with HostDeviceVector-based interfaces. (#3116)
* Replaced std::vector-based interfaces with HostDeviceVector-based interfaces.

- replacement was performed in the learner, boosters, predictors,
  updaters, and objective functions
- only interfaces used in training were replaced;
  interfaces like PredictInstance() still use std::vector
- refactoring necessary for replacement of interfaces was also performed,
  such as using HostDeviceVector in prediction cache

* HostDeviceVector-based interfaces for custom objective function example plugin.
2018-02-28 13:00:04 +13:00
Rory Mitchell
e6a9063344 Integer gradient summation for GPU histogram algorithm. (#2681) 2017-09-08 15:07:29 +12:00
Rory Mitchell
ef23e424f1 [GPU-Plugin] Add GPU accelerated prediction (#2593)
* [GPU-Plugin] Add GPU accelerated prediction

* Improve allocation message

* Update documentation

* Resolve linker error for predictor

* Add unit tests
2017-08-16 12:31:59 +12:00
Thejaswi
85b2fb3eee [GPU-Plugin] Integration of a faster version of grow_gpu plugin into mainstream (#2360)
* Integrating a faster version of grow_gpu plugin
1. Removed the older files to reduce duplication
2. Moved all of the grow_gpu files under 'exact' folder
3. All of them are inside 'exact' namespace to avoid any conflicts
4. Fixed a bug in benchmark.py while running only 'grow_gpu' plugin
5. Added cub and googletest submodules to ease integration and unit-testing
6. Updates to CMakeLists.txt to directly build cuda objects into libxgboost

* Added support for building gpu plugins through make flow
1. updated makefile and config.mk to add right targets
2. added unit-tests for gpu exact plugin code

* 1. Added support for building gpu plugin using 'make' flow as well
2. Updated instructions for building and testing gpu plugin

* Fix travis-ci errors for PR#2360
1. lint errors on unit-tests
2. removed googletest, instead depended upon dmlc-core provide gtest cache

* Some more fixes to travis-ci lint failures PR#2360

* Added Rory's copyrights to the files containing code from both.

* updated copyright statement as per Rory's request

* moved the static datasets into a script to generate them at runtime

* 1. memory usage print when silent=0
2. tests/ and test/ folder organization
3. removal of the dependency of googletest for just building xgboost
4. coding style updates for .cuh as well

* Fixes for compilation warnings

* add cuda object files as well when JVM_BINDINGS=ON
2017-06-06 09:39:53 +12:00
AbdealiJK
03abd47f49 tests/cpp: Add tests for Metric RMSE 2016-12-04 11:25:57 -08:00
AbdealiJK
d41aab4f61 tests/cpp: Add tests for regression_obj.cc
Test the objective functions in regression_obj.cc

tests/cpp: Add tests for objective.cc and RegLossObj
2016-12-04 11:25:57 -08:00
AbdealiJK
d6407c3746 tests/cpp: Add tests for SparsePageDMatrix
The SparsePageDMatrix or external memory DMatrix reads data from the
file IO rather than load it into RAM.
2016-12-04 11:25:57 -08:00
AbdealiJK
be0f55d563 tests/cpp: Add tests for SimpleDMatrix 2016-12-04 11:25:57 -08:00
AbdealiJK
ef7fe06cf8 tests/cpp/test_metainfo: Add tests to save and load 2016-12-04 11:25:57 -08:00