This is part 1 of refactoring configuration.
* Move tree heuristic configurations.
* Split up declarations and definitions for GBTree.
* Implement UseGPU in gbm.
* Only define `gpu_id` and `n_gpus` in `LearnerTrainParam`
* Pass LearnerTrainParam through XGBoost vid factory method.
* Disable all GPU usage when GPU related parameters are not specified (fixes XGBoost choosing GPU over aggressively).
* Test learner train param io.
* Fix gpu pickling.
* adding support for matrix slicing with query ID for cross-validation
* hail mary test of unrar installation for windows tests
* trying to modify tests to run in Github CI
* Remove dependency on wget and unrar
* Save error log from R test
* Relax assertion in test_training
* Use int instead of bool in C function interface
* Revise R interface
* Add XGDMatrixSliceDMatrixEx and keep old XGDMatrixSliceDMatrix for API compatibility
* Combine thread launches into single launch per tree for gpu_hist
algorithm.
* Address deprecation warning
* Add manual column sampler constructor
* Turn off omp dynamic to get a guaranteed number of threads
* Enable openmp in cuda code
* Refactor CMake scripts.
* Remove CMake CUDA wrapper.
* Bump CMake version for CUDA.
* Use CMake to handle Doxygen.
* Split up CMakeList.
* Export install target.
* Use modern CMake.
* Remove build.sh
* Workaround for gpu_hist test.
* Use cmake 3.12.
* Revert machine.conf.
* Move CLI test to gpu.
* Small cleanup.
* Support using XGBoost as submodule.
* Fix windows
* Fix cpp tests on Windows
* Remove duplicated find_package.
* Initial performance optimizations for xgboost
* remove includes
* revert float->double
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* Check existence of _mm_prefetch and __builtin_prefetch
* Fix lint
* Unify logging facilities.
* Enhance `ConsoleLogger` to handle different verbosity.
* Override macros from `dmlc`.
* Don't use specialized gamma when building with GPU.
* Remove verbosity cache in monitor.
* Test monitor.
* Deprecate `silent`.
* Fix doc and messages.
* Fix python test.
* Fix silent tests.
* Port elementwise metrics to GPU.
* All elementwise metrics are converted to static polymorphic.
* Create a reducer for metrics reduction.
* Remove const of Metric::Eval to accommodate CubMemory.
* Fix#3342 and h2oai/h2o4gpu#625: Save predictor parameters in model file
This allows pickled models to retain predictor attributes, such as
'predictor' (whether to use CPU or GPU) and 'n_gpu' (number of GPUs
to use). Related: h2oai/h2o4gpu#625Closes#3342.
TODO. Write a test.
* Fix lint
* Do not load GPU predictor into CPU-only XGBoost
* Add a test for pickling GPU predictors
* Make sample data big enough to pass multi GPU test
* Update test_gpu_predictor.cu
* Split building histogram into separated class.
* Extract `InitCompressedRow` definition.
* Basic tests for gpu-hist.
* Document the code more verbosely.
* Removed `HistCutUnit`.
* Removed some duplicated copies in `GPUHistMaker`.
* Implement LCG and use it in tests.
* DMatrix refactor 2
* Remove buffered rowset usage where possible
* Transition to c++11 style iterators for row access
* Transition column iterators to C++ 11
* Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage.
- added distributions to HostDeviceVector
- using HostDeviceVector for labels, weights and base margings in MetaInfo
- using HostDeviceVector for offset and data in SparsePage
- other necessary refactoring
* Added const version of HostDeviceVector API calls.
- const versions added to calls that can trigger data transfers, e.g. DevicePointer()
- updated the code that uses HostDeviceVector
- objective functions now accept const HostDeviceVector<bst_float>& for predictions
* Updated src/linear/updater_gpu_coordinate.cu.
* Added read-only state for HostDeviceVector sync.
- this means no copies are performed if both host and devices access
the HostDeviceVector read-only
* Fixed linter and test errors.
- updated the lz4 plugin
- added ConstDeviceSpan to HostDeviceVector
- using device % dh::NVisibleDevices() for the physical device number,
e.g. in calls to cudaSetDevice()
* Fixed explicit template instantiation errors for HostDeviceVector.
- replaced HostDeviceVector<unsigned int> with HostDeviceVector<int>
* Fixed HostDeviceVector tests that require multiple GPUs.
- added a mock set device handler; when set, it is called instead of cudaSetDevice()
* Add basic Span class based on ISO++20.
* Use Span<Entry const> instead of Inst in SparsePage.
* Add DeviceSpan in HostDeviceVector, use it in regression obj.
* Revert "Fix #3485, #3540: Don't use dropout for predicting test sets (#3556)"
This reverts commit 44811f233071c5805d70c287abd22b155b732727.
* Document behavior of predict() for DART booster
* Add notice to parameter.rst
* add qid for https://github.com/dmlc/xgboost/issues/2748
* change names
* change spaces
* change qid to bst_uint type
* change qid type to size_t
* change qid first to SIZE_MAX
* change qid type from size_t to uint64_t
* update dmlc-core
* fix qids name error
* fix group_ptr_ error
* Style fix
* Add qid handling logic to SparsePage
* New MetaInfo format + backward compatibility fix
Old MetaInfo format (1.0) doesn't contain qid field. We still want to be able
to read from MetaInfo files saved in old format. Also, define a new format
(2.0) that contains the qid field. This way, we can distinguish files that
contain qid and those that do not.
* Update MetaInfo test
* Simply group assignment logic
* Explicitly set qid=nullptr in NativeDataIter
NativeDataIter's callback does not support qid field. Users of NativeDataIter
will need to call setGroup() function separately to set group information.
* Save qids_ in SaveBinary()
* Upgrade dmlc-core submodule
* Add a test for reading qid
* Add contributor
* Check the size of qids_
* Document qid format
* Use sparse page as singular CSR matrix representation
* Simplify dmatrix methods
* Reduce statefullness of batch iterators
* BREAKING CHANGE: Remove prob_buffer_row parameter. Users are instead recommended to sample their dataset as a preprocessing step before using XGBoost.
* GPU binning and compression.
- binning and index compression are done inside the DeviceShard constructor
- in case of a DMatrix with multiple row batches, it is first converted into a single row batch
* Replaced std::vector-based interfaces with HostDeviceVector-based interfaces.
- replacement was performed in the learner, boosters, predictors,
updaters, and objective functions
- only interfaces used in training were replaced;
interfaces like PredictInstance() still use std::vector
- refactoring necessary for replacement of interfaces was also performed,
such as using HostDeviceVector in prediction cache
* HostDeviceVector-based interfaces for custom objective function example plugin.
* Add interaction effects and cox loss
* Minimize whitespace changes
* Cox loss now no longer needs a pre-sorted dataset.
* Address code review comments
* Remove mem check, rename to pred_interactions, include bias
* Make lint happy
* More lint fixes
* Fix cox loss indexing
* Fix main effects and tests
* Fix lint
* Use half interaction values on the off-diagonals
* Fix lint again
* Added GPU objective function and no-copy interface.
- xgboost::HostDeviceVector<T> syncs automatically between host and device
- no-copy interfaces have been added
- default implementations just sync the data to host
and call the implementations with std::vector
- GPU objective function, predictor, histogram updater process data
directly on GPU
* [R] fix predict contributions for data with no colnames
* [R] add a render parameter for xgb.plot.multi.trees; fixes#2628
* [R] update Rd's
* [R] remove unnecessary dep-package from R cmake install
* silence type warnings; readability
* [R] silence complaint about incomplete line at the end
* [R] initial version of xgb.plot.shap()
* [R] more work on xgb.plot.shap
* [R] enforce black font in xgb.plot.tree; fixes#2640
* [R] if feature names are available, check in predict that they are the same; fixes#2857
* [R] cran check and lint fixes
* remove tabs
* [R] add references; a test for plot.shap