* Unify logging facilities.
* Enhance `ConsoleLogger` to handle different verbosity.
* Override macros from `dmlc`.
* Don't use specialized gamma when building with GPU.
* Remove verbosity cache in monitor.
* Test monitor.
* Deprecate `silent`.
* Fix doc and messages.
* Fix python test.
* Fix silent tests.
* Make C++ unit tests run and pass on Windows
* Fix logic for external memory. The letter ':' is part of drive letter,
so remove the drive letter before splitting on ':'.
* Cosmetic syntax changes to keep MSVC happy.
* Fix lint
* Add Windows guard
* DMatrix refactor 2
* Remove buffered rowset usage where possible
* Transition to c++11 style iterators for row access
* Transition column iterators to C++ 11
* Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage.
- added distributions to HostDeviceVector
- using HostDeviceVector for labels, weights and base margings in MetaInfo
- using HostDeviceVector for offset and data in SparsePage
- other necessary refactoring
* Added const version of HostDeviceVector API calls.
- const versions added to calls that can trigger data transfers, e.g. DevicePointer()
- updated the code that uses HostDeviceVector
- objective functions now accept const HostDeviceVector<bst_float>& for predictions
* Updated src/linear/updater_gpu_coordinate.cu.
* Added read-only state for HostDeviceVector sync.
- this means no copies are performed if both host and devices access
the HostDeviceVector read-only
* Fixed linter and test errors.
- updated the lz4 plugin
- added ConstDeviceSpan to HostDeviceVector
- using device % dh::NVisibleDevices() for the physical device number,
e.g. in calls to cudaSetDevice()
* Fixed explicit template instantiation errors for HostDeviceVector.
- replaced HostDeviceVector<unsigned int> with HostDeviceVector<int>
* Fixed HostDeviceVector tests that require multiple GPUs.
- added a mock set device handler; when set, it is called instead of cudaSetDevice()
* Add basic Span class based on ISO++20.
* Use Span<Entry const> instead of Inst in SparsePage.
* Add DeviceSpan in HostDeviceVector, use it in regression obj.
* add qid for https://github.com/dmlc/xgboost/issues/2748
* change names
* change spaces
* change qid to bst_uint type
* change qid type to size_t
* change qid first to SIZE_MAX
* change qid type from size_t to uint64_t
* update dmlc-core
* fix qids name error
* fix group_ptr_ error
* Style fix
* Add qid handling logic to SparsePage
* New MetaInfo format + backward compatibility fix
Old MetaInfo format (1.0) doesn't contain qid field. We still want to be able
to read from MetaInfo files saved in old format. Also, define a new format
(2.0) that contains the qid field. This way, we can distinguish files that
contain qid and those that do not.
* Update MetaInfo test
* Simply group assignment logic
* Explicitly set qid=nullptr in NativeDataIter
NativeDataIter's callback does not support qid field. Users of NativeDataIter
will need to call setGroup() function separately to set group information.
* Save qids_ in SaveBinary()
* Upgrade dmlc-core submodule
* Add a test for reading qid
* Add contributor
* Check the size of qids_
* Document qid format
* Use sparse page as singular CSR matrix representation
* Simplify dmatrix methods
* Reduce statefullness of batch iterators
* BREAKING CHANGE: Remove prob_buffer_row parameter. Users are instead recommended to sample their dataset as a preprocessing step before using XGBoost.
* fix rebase conflict
* [core] additional gblinear improvements
* [R] callback for gblinear coefficients history
* force eta=1 for gblinear python tests
* add top_k to GreedyFeatureSelector
* set eta=1 in shotgun test
* [core] fix SparsePage processing in gblinear; col-wise multithreading in greedy updater
* set sorted flag within TryInitColData
* gblinear tests: use scale, add external memory test
* fix multiclass for greedy updater
* fix whitespace
* fix typo
* Fatal error if GPU algorithm selected without GPU support compiled
* Resolve type conversion warnings
* Fix gpu unit test failure
* Fix compressed iterator edge case
* Fix python unit test failures due to flake8 update on pip
Use int32_t explicitly when serializing version field of dmatrix in binary
format. On ILP64 architectures, although very little, size of int is 64 bits.
* Fix compilation on OS X with GCC 7
Compilation failed with
In file included from src/tree/tree_updater.cc:6:0:
include/xgboost/tree_updater.h:75:46: error: 'function' is not a member of 'std'
std::function<TreeUpdater* ()> > {
caused by a missing <functional> include.
* Fixed another occurence of that issue spotted by @ClimberPG
* Fix various typos
* Add override to functions that are overridden
gcc gives warnings about functions that are being overridden by not
being marked as oveirridden. This fixes it.
* Use bst_float consistently
Use bst_float for all the variables that involve weight,
leaf value, gradient, hessian, gain, loss_chg, predictions,
base_margin, feature values.
In some cases, when due to additions and so on the value can
take a larger value, double is used.
This ensures that type conversions are minimal and reduces loss of
precision.
* Add format to the params accepted by DumpModel
Currently, only the test format is supported when trying to dump
a model. The plan is to add more such formats like JSON which are
easy to read and/or parse by machines. And to make the interface
for this even more generic to allow other formats to be added.
Hence, we make some modifications to make these function generic
and accept a new parameter "format" which signifies the format of
the dump to be created.
* Fix typos and errors in docs
* plugin: Mention all the register macros available
Document the register macros currently available to the plugin
writers so they know what exactly can be extended using hooks.
* sparce_page_source: Use same arg name in .h and .cc
* gbm: Add JSON dump
The dump_format argument can be used to specify what type
of dump file should be created. Add functionality to dump
gblinear and gbtree into a JSON file.
The JSON file has an array, each item is a JSON object for the tree.
For gblinear:
- The item is the bias and weights vectors
For gbtree:
- The item is the root node. The root node has a attribute "children"
which holds the children nodes. This happens recursively.
* core.py: Add arg dump_format for get_dump()