Rong Ou
6edddd7966
Refactor DMatrix to return batches of different page types (#4686)
* Use explicit template parameter for specifying page type.
2019-08-03 15:10:34 -04:00
Jiaming Yuan
d9a47794a5
Fix CPU hist init for sparse dataset. (#4625)
* Fix CPU hist init for sparse dataset.
* Implement sparse histogram cut.
* Allow empty features.
* Fix Windows build; don't use sparse in distributed environment.
* Comments.
* Smaller threshold.
* Fix Windows OMP.
* Fix MSVC lambda capture.
* Fix MSVC macro.
* Fix MSVC initialization list.
* Fix MSVC initialization list x2.
* Preserve categorical feature behavior.
* Rename matrix to sparse cuts.
* Reuse UseGroup.
* Check for categorical data when adding cut.
Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
* Sanity check.
* Fix comments.
* Fix comment.
2019-07-04 16:27:03 -07:00
Rory Mitchell
70d208d68c
DMatrix refactor stage 2 (#3395)
* DMatrix refactor 2
* Remove buffered rowset usage where possible
* Transition to C++11-style iterators for row access
* Transition column iterators to C++11
2018-10-01 01:29:03 +13:00
trivialfis
cf2d86a4f6
Add Travis sanitizer tests. (#3557)
* Add Travis sanitizer tests.
* Add gcc-7 in Travis.
* Add SANITIZER_PATH for CMake.
* Enable sanitizer tests in Travis.
* Fix memory leaks in tests.
* Fix all memory leaks reported by Address Sanitizer.
* CreateDMatrix in tests/cpp/helpers.h now returns a raw pointer.
2018-08-19 16:40:30 +12:00
trivialfis
2c502784ff
Span class. (#3548)
* Add basic Span class based on ISO C++20.
* Use Span<Entry const> instead of Inst in SparsePage.
* Add DeviceSpan in HostDeviceVector, use it in regression obj.
2018-08-14 17:58:11 +12:00
PSEUDOTENSOR / Jonathan McKinney
9ac163d0bb
Allow import via python datatable. (#3272)
* Allow import via python datatable.
* Write unit tests
* Refactor dt API functions
* Refactor python code
* Lint fixes
* Address review comments
2018-06-20 13:16:18 -07:00
Rory Mitchell
a96039141a
DMatrix refactor stage 1 (#3301)
* Use sparse page as singular CSR matrix representation
* Simplify dmatrix methods
* Reduce statefulness of batch iterators
* BREAKING CHANGE: Remove prob_buffer_row parameter. Users are instead recommended to sample their dataset as a preprocessing step before using XGBoost.
2018-06-07 10:25:58 +12:00
Rory Mitchell
ccf80703ef
Clang-tidy static analysis (#3222)
* Clang-tidy static analysis
* Modernise checks
* Google coding standard checks
* Identifier renaming according to Google style
2018-04-19 18:57:13 +12:00
PSEUDOTENSOR / Jonathan McKinney
6b375f6ad8
Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation (#2530)
* Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation from numpy arrays for python interface.
2017-07-21 14:43:17 +12:00