This is part 1 of the configuration refactoring.
* Move tree heuristic configurations.
* Split up declarations and definitions for GBTree.
* Implement UseGPU in gbm.
* - training with external memory - part 2 of 2
- when external memory support is enabled, building of histogram indices is
done incrementally for every sparse page (sketched below)
- the entire set of input data is divided across multiple GPUs, and the relative
row positions within each device are tracked when building the compressed histogram buffer
- this was tested using a mortgage dataset containing ~670M rows before 4 T4s could be
saturated
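A minimal sketch of the incremental pattern described above, with hypothetical `DeviceShard`/`BuildHistIndexForPage` names standing in for the real implementation:

```cpp
#include <cstddef>
#include <vector>

// Illustrative stand-ins only -- not the real XGBoost types.
struct SparsePage {
  std::size_t n_rows = 0;                       // rows in this batch
};

struct DeviceShard {
  std::size_t row_begin = 0;                    // first global row owned by this device
  std::size_t rows_indexed = 0;                 // rows already written to the buffer
  void BuildHistIndexForPage(const SparsePage& page, std::size_t global_row_offset) {
    // Compress this page's slice into the histogram index buffer, using the
    // position relative to row_begin as the write offset, then remember how
    // far this device has gotten.
    rows_indexed += page.n_rows;
    (void)global_row_offset;
  }
};

// Histogram indices are built incrementally, one sparse page at a time, so the
// full dataset never has to be resident on the GPUs at once.
void BuildHistIndex(const std::vector<SparsePage>& pages,
                    std::vector<DeviceShard>* shards) {
  std::size_t rows_seen = 0;
  for (const SparsePage& page : pages) {        // one external-memory batch at a time
    for (DeviceShard& shard : *shards) {        // each device owns a row range
      shard.BuildHistIndexForPage(page, rows_seen);
    }
    rows_seen += page.n_rows;
  }
}
```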
* Fix C++11 config parser
* Use raw strings to improve readability of regex
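As a generic illustration of the raw-string change (the actual pattern used by the config parser may differ), a C++11 raw string literal avoids double-escaping backslashes in the regex:

```cpp
#include <regex>
#include <string>

// With a raw string literal, backslashes in the pattern need no extra escaping.
static const std::regex kKeyValue(R"(^\s*([^\s=]+)\s*=\s*([^\s]+)\s*$)");
// The escaped equivalent would be: "^\\s*([^\\s=]+)\\s*=\\s*([^\\s]+)\\s*$"

bool ParseLine(const std::string& line, std::string* key, std::string* value) {
  std::smatch m;
  if (!std::regex_match(line, m, kKeyValue)) return false;
  *key = m[1];
  *value = m[2];
  return true;
}
```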
* Fix compilation for GCC 5.x
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
* simplify the config.h file
* revise config.h
* revised config.h
* revise format
* revise format issues
* revise whitespace issues
* revise whitespace and namespace format issues
* revise namespace format issues
* format issues
* Revert submodule changes
* minor change
* Update src/common/config.h
Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
* address format issue from trivialfis
* Use correct cub submodule
* - training with external memory part 1 of 2
- this PR focuses on computing the quantiles using multiple GPUs on a
dataset that uses the external cache capabilities
- there will be a follow-up PR soon after this that will support creation
of histogram indices on large datasets as well
- both of these changes are required to support training with external memory
- the sparse pages in DMatrix are taken in batches and the cut matrices
are incrementally built
- this also sneaks in some performance changes related to sketch aggregation amongst multiple
features across multiple sparse page batches: instead of aggregating the summary
inside each device and merging it later, it is aggregated in place while the device
is working on different rows of the same feature (see the sketch below)
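A rough sketch of the in-place, per-feature aggregation mentioned in the last item, using hypothetical `QuantileSketch`/`Push` names rather than the real weighted-quantile machinery:

```cpp
#include <cstddef>
#include <vector>

// Illustrative placeholder for a streaming quantile sketch; the real code uses
// weighted quantile summaries, but only Push() matters for this sketch.
struct QuantileSketch {
  void Push(float /*value*/, float /*weight*/) { /* record one sample */ }
};

struct SparsePageBatch {
  // entries[fidx] holds the values of feature fidx present in this batch.
  std::vector<std::vector<float>> entries;
};

// Cut matrices are built incrementally: every external-memory batch updates one
// sketch per feature, in place, rather than building a per-device summary that
// has to be merged at the end.
void BuildSketches(const std::vector<SparsePageBatch>& batches,
                   std::vector<QuantileSketch>* sketches) {
  for (const SparsePageBatch& batch : batches) {
    for (std::size_t fidx = 0; fidx < batch.entries.size(); ++fidx) {
      for (float value : batch.entries[fidx]) {
        (*sketches)[fidx].Push(value, 1.0f);   // aggregate directly into the feature's sketch
      }
    }
  }
}
```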
* Only define `gpu_id` and `n_gpus` in `LearnerTrainParam`
* Pass LearnerTrainParam through XGBoost via factory method.
* Disable all GPU usage when GPU related parameters are not specified (fixes XGBoost choosing the GPU too aggressively).
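A hedged sketch of what keeping the GPU knobs in one parameter struct can look like, using the dmlc-core parameter macros; the defaults and descriptions here are illustrative, not the exact XGBoost definitions:

```cpp
#include <dmlc/parameter.h>

// Sketch only: the real LearnerTrainParam carries more fields.  The point is
// that `gpu_id`/`n_gpus` are declared in exactly one place and default to
// "no GPU", so nothing runs on a GPU unless the user asks for it.
struct LearnerTrainParam : public dmlc::Parameter<LearnerTrainParam> {
  int gpu_id;
  int n_gpus;
  DMLC_DECLARE_PARAMETER(LearnerTrainParam) {
    DMLC_DECLARE_FIELD(gpu_id)
        .set_default(0)
        .describe("Ordinal of the primary device.");
    DMLC_DECLARE_FIELD(n_gpus)
        .set_default(0)                 // 0 => stay on the CPU unless requested
        .describe("Number of GPUs to use.");
  }
};
```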
* Test learner train param io.
* Fix gpu pickling.
* - fix issues with training with external memory on CPU
- use the batch size to determine the correct number of rows in a batch
- use the right number of threads in OpenMP parallelization if the batch size
is less than the default OpenMP max threads (applicable to the last batch)
* - handle scenarios where the last batch size is smaller than the available number of threads (see the sketch below)
- augment tests such that we can test all scenarios (batch size less than, greater than, and equal to the number of threads)
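A small illustration of the thread-clamping idea for the short last batch (plain OpenMP, not the exact XGBoost loop):

```cpp
#include <omp.h>

#include <algorithm>
#include <cstddef>

// If the last external-memory batch has fewer rows than the default OpenMP
// team size, spawning the full team only creates idle threads, so the parallel
// region is capped at the number of rows in the batch.
void ProcessBatch(std::size_t batch_rows) {
  if (batch_rows == 0) return;
  const int nthread = static_cast<int>(std::min<std::size_t>(
      batch_rows, static_cast<std::size_t>(omp_get_max_threads())));
#pragma omp parallel for num_threads(nthread) schedule(static)
  for (long long i = 0; i < static_cast<long long>(batch_rows); ++i) {
    // ... per-row work for row i ...
  }
}
```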
* adding support for matrix slicing with query ID for cross-validation
* Hail-mary test of unrar installation for Windows tests
* Trying to modify tests to run in GitHub CI
* Remove dependency on wget and unrar
* Save error log from R test
* Relax assertion in test_training
* Use int instead of bool in C function interface
* Revise R interface
* Add XGDMatrixSliceDMatrixEx and keep old XGDMatrixSliceDMatrix for API compatibility
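For context, a hedged usage sketch of the new entry point; `allow_groups` is the extra argument that permits slicing a DMatrix carrying query/group information (error checking omitted):

```cpp
#include <xgboost/c_api.h>

#include <vector>

// Sketch: take the first 100 rows of an existing DMatrix for a CV fold while
// allowing the slice of a matrix that has query/group metadata.  Real code
// should check every return value.
void SliceFold(DMatrixHandle dmat) {
  std::vector<int> rows(100);
  for (int i = 0; i < 100; ++i) rows[i] = i;

  DMatrixHandle fold = nullptr;
  XGDMatrixSliceDMatrixEx(dmat, rows.data(),
                          static_cast<bst_ulong>(rows.size()), &fold,
                          /*allow_groups=*/1);

  // The original XGDMatrixSliceDMatrix(handle, idxset, len, out) is kept with
  // its old behaviour for API compatibility.
  XGDMatrixFree(fold);
}
```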
* Add CMake option to use bundled gtest from dmlc-core, so that it is easy to build XGBoost with gtest on Windows
* Consistently apply OpenMP flag to all targets. Force enable OpenMP when USE_CUDA is turned on.
* Insert vcomp140.dll into Windows wheels
* Add C++ and Python tests for CPU and GPU targets (CUDA 9.0, 10.0, 10.1)
* Prevent spurious msbuild failure
* Add GPU tests
* Upgrade dmlc-core
* Fix #4462: Use /MT flag consistently for MSVC target
* First attempt at Windows CI
* Distinguish stages in Linux and Windows pipelines
* Try running CMake in Windows pipeline
* Add build step
* Automatically set maximize_evaluation_metrics if not explicitly given.
* When custom_eval is set, require maximize_evaluation_metrics.
* Update documents on early stop in XGBoost4J-Spark.
* Fix code error.