12 Commits

Author SHA1 Message Date
Rory Mitchell
a96039141a
Dmatrix refactor stage 1 (#3301)
* Use sparse page as singular CSR matrix representation

* Simplify dmatrix methods

* Reduce statefullness of batch iterators

* BREAKING CHANGE: Remove prob_buffer_row parameter. Users are instead recommended to sample their dataset as a preprocessing step before using XGBoost.
2018-06-07 10:25:58 +12:00
Rory Mitchell
ccf80703ef
Clang-tidy static analysis (#3222)
* Clang-tidy static analysis

* Modernise checks

* Google coding standard checks

* Identifier renaming according to Google style
2018-04-19 18:57:13 +12:00
Vadim Khotilovich
706be4e5d4
Additional improvements for gblinear (#3134)
* fix rebase conflict

* [core] additional gblinear improvements

* [R] callback for gblinear coefficients history

* force eta=1 for gblinear python tests

* add top_k to GreedyFeatureSelector

* set eta=1 in shotgun test

* [core] fix SparsePage processing in gblinear; col-wise multithreading in greedy updater

* set sorted flag within TryInitColData

* gblinear tests: use scale, add external memory test

* fix multiclass for greedy updater

* fix whitespace

* fix typo
2018-03-13 01:27:13 -05:00
Rory Mitchell
10eb05a63a
Refactor linear modelling and add new coordinate descent updater (#3103)
* Refactor linear modelling and add new coordinate descent updater

* Allow unsorted column iterator

* Add prediction cacheing to gblinear
2018-02-17 09:17:01 +13:00
Qin Xiaoming
12cf0ae122 Update sparse_page_dmatrix.h (#2139) 2017-03-23 11:01:40 -07:00
AbdealiJK
6f16f0ef58 Use bst_float consistently throughout (#1824)
* Fix various typos

* Add override to functions that are overridden

gcc gives warnings about functions that are being overridden by not
being marked as oveirridden. This fixes it.

* Use bst_float consistently

Use bst_float for all the variables that involve weight,
leaf value, gradient, hessian, gain, loss_chg, predictions,
base_margin, feature values.

In some cases, when due to additions and so on the value can
take a larger value, double is used.

This ensures that type conversions are minimal and reduces loss of
precision.
2016-11-30 10:02:10 -08:00
tqchen
88447ca32e [MEM] Add rowset struct to save memory with billion level rows 2016-02-10 11:17:17 -08:00
tqchen
2230f1273f [DISK] Add shard option to disk 2016-02-10 11:17:17 -08:00
tqchen
96f4542a67 [PLUGIN] Add plugin system 2016-01-16 10:25:11 -08:00
tqchen
36c389ac46 [DATA] Isolate the format of page file 2016-01-16 10:25:11 -08:00
tqchen
2dc6c2dc52 [R] enable R compile
[R] Enable R build for windows and linux
2016-01-16 10:24:02 -08:00
tqchen
ef1021e759 [IO] Enable external memory 2016-01-16 10:24:01 -08:00