xgboost

Author	SHA1	Message	Date
Jiaming Yuan	d33854af1b	[Breaking] Accept multi-dim meta info. (#7405) This PR changes base_margin into a 3-dim array, with one of them being reserved for multi-target classification. Also, a breaking change is made for binary serialization due to extra dimension along with a fix for saving the feature weights. Lastly, it unifies the prediction initialization between CPU and GPU. After this PR, the meta info setter in Python will be based on array interface.	2021-11-18 23:02:54 +08:00
Jiaming Yuan	f20074e826	Check for invalid data. (#6742)	2021-03-04 14:37:20 +08:00
Jiaming Yuan	f2f7dd87b8	Use view for `SparsePage` exclusively. (#6590)	2021-01-11 18:04:55 +08:00
Philip Hyunsu Cho	487ab0ce73	[BLOCKING] Handle empty rows in data iterators correctly (#5929) * [jvm-packages] Handle empty rows in data iterators correctly * Fix clang-tidy error * last empty row * Add comments [skip ci] Co-authored-by: Nan Zhu <nanzhu@uber.com>	2020-07-25 13:46:19 -07:00
Jiaming Yuan	e1f22baf8c	Fix slice and get info. (#5552)	2020-04-18 18:00:13 +08:00
Jiaming Yuan	6671b42dd4	Use ellpack for prediction only when sparsepage doesn't exist. (#5504)	2020-04-10 12:15:46 +08:00
Jiaming Yuan	29c6ad943a	Prevent copying SimpleDMatrix. (#5453) * Set default dtor for SimpleDMatrix to initialize default copy ctor, which is deleted due to unique ptr. * Remove commented code. * Remove warning for calling host function (std::max). * Remove warning for initialization order. * Remove warning for unused variables.	2020-04-02 07:01:49 +08:00
Jiaming Yuan	4942da64ae	Refactor tests with data generator. (#5439)	2020-03-27 06:44:44 +08:00
Jiaming Yuan	f2b8cd2922	Add number of columns to native data iterator. (#5202) * Change native data iter into an adapter.	2020-02-25 23:42:01 +08:00
Rory Mitchell	b0ed3f0a66	Remove unnecessary DMatrix methods (#5324)	2020-02-25 12:40:39 +13:00
Rory Mitchell	b2b2c4e231	Remove SimpleCSRSource (#5315)	2020-02-18 16:49:17 +13:00
Jiaming Yuan	fe8d72b50b	Cleanup warnings. (#5247) From clang-tidy-9 and gcc-7: Invalid case style, narrowing definition, wrong initialization order, unused variables.	2020-01-31 14:52:15 +08:00
Rory Mitchell	a73e25e15f	Implement slice via adapters (#5198)	2020-01-14 12:55:41 +13:00
Rory Mitchell	c7cc657a4d	Use adapters for SparsePageDMatrix (#5092)	2019-12-11 15:59:23 +13:00
Rory Mitchell	979f74d51a	Group builder modified for incremental building (#5098)	2019-12-10 14:33:56 +13:00
Rory Mitchell	e3c34c79be	External data adapters (#5044) * Use external data adapters as lightweight intermediate layer between external data and DMatrix	2019-12-04 10:56:17 +13:00
Rong Ou	6edddd7966	Refactor DMatrix to return batches of different page types (#4686) * Use explicit template parameter for specifying page type.	2019-08-03 15:10:34 -04:00
Jiaming Yuan	7b9043cf71	Fix clang-tidy warnings. (#4149) * Upgrade gtest for clang-tidy. * Use CMake to install GTest instead of mv. * Don't enforce clang-tidy to return 0 due to errors in thrust. * Add a small test for tidy itself. * Reformat.	2019-03-13 02:25:51 +08:00
Philip Hyunsu Cho	abf2f661be	Fix #3708: Use dmlc::TemporaryDirectory to handle temporaries in cross-platform way (#3783) * Fix #3708: Use dmlc::TemporaryDirectory to handle temporaries in cross-platform way Also install git inside NVIDIA GPU container * Update dmlc-core	2018-10-18 10:16:04 -07:00
Rory Mitchell	70d208d68c	Dmatrix refactor stage 2 (#3395) * DMatrix refactor 2 * Remove buffered rowset usage where possible * Transition to c++11 style iterators for row access * Transition column iterators to C++ 11	2018-10-01 01:29:03 +13:00
Andy Adinets	72cd1517d6	Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage. (#3446) * Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage. - added distributions to HostDeviceVector - using HostDeviceVector for labels, weights and base margins in MetaInfo - using HostDeviceVector for offset and data in SparsePage - other necessary refactoring * Added const version of HostDeviceVector API calls. - const versions added to calls that can trigger data transfers, e.g. DevicePointer() - updated the code that uses HostDeviceVector - objective functions now accept const HostDeviceVector<bst_float>& for predictions * Updated src/linear/updater_gpu_coordinate.cu. * Added read-only state for HostDeviceVector sync. - this means no copies are performed if both host and devices access the HostDeviceVector read-only * Fixed linter and test errors. - updated the lz4 plugin - added ConstDeviceSpan to HostDeviceVector - using device % dh::NVisibleDevices() for the physical device number, e.g. in calls to cudaSetDevice() * Fixed explicit template instantiation errors for HostDeviceVector. - replaced HostDeviceVector<unsigned int> with HostDeviceVector<int> * Fixed HostDeviceVector tests that require multiple GPUs. - added a mock set device handler; when set, it is called instead of cudaSetDevice()	2018-08-30 14:28:47 +12:00
trivialfis	cf2d86a4f6	Add travis sanitizers tests. (#3557) * Add travis sanitizers tests. * Add gcc-7 in Travis. * Add SANITIZER_PATH for CMake. * Enable sanitizer tests in Travis. * Fix memory leaks in tests. * Fix all memory leaks reported by Address Sanitizer. * tests/cpp/helpers.h/CreateDMatrix now returns raw pointer.	2018-08-19 16:40:30 +12:00
trivialfis	2c502784ff	Span class. (#3548) * Add basic Span class based on ISO C++20. * Use Span<Entry const> instead of Inst in SparsePage. * Add DeviceSpan in HostDeviceVector, use it in regression obj.	2018-08-14 17:58:11 +12:00
Rory Mitchell	a96039141a	Dmatrix refactor stage 1 (#3301) * Use sparse page as singular CSR matrix representation * Simplify dmatrix methods * Reduce statefulness of batch iterators * BREAKING CHANGE: Remove prob_buffer_row parameter. Users are instead recommended to sample their dataset as a preprocessing step before using XGBoost.	2018-06-07 10:25:58 +12:00
Rory Mitchell	ccf80703ef	Clang-tidy static analysis (#3222) * Clang-tidy static analysis * Modernise checks * Google coding standard checks * Identifier renaming according to Google style	2018-04-19 18:57:13 +12:00
Rory Mitchell	10eb05a63a	Refactor linear modelling and add new coordinate descent updater (#3103) * Refactor linear modelling and add new coordinate descent updater * Allow unsorted column iterator * Add prediction caching to gblinear	2018-02-17 09:17:01 +13:00
Thejaswi	85b2fb3eee	[GPU-Plugin] Integration of a faster version of grow_gpu plugin into mainstream (#2360) * Integrating a faster version of grow_gpu plugin 1. Removed the older files to reduce duplication 2. Moved all of the grow_gpu files under 'exact' folder 3. All of them are inside 'exact' namespace to avoid any conflicts 4. Fixed a bug in benchmark.py while running only 'grow_gpu' plugin 5. Added cub and googletest submodules to ease integration and unit-testing 6. Updates to CMakeLists.txt to directly build cuda objects into libxgboost * Added support for building gpu plugins through make flow 1. updated makefile and config.mk to add right targets 2. added unit-tests for gpu exact plugin code * 1. Added support for building gpu plugin using 'make' flow as well 2. Updated instructions for building and testing gpu plugin * Fix travis-ci errors for PR#2360 1. lint errors on unit-tests 2. removed googletest, instead depended upon dmlc-core provide gtest cache * Some more fixes to travis-ci lint failures PR#2360 * Added Rory's copyrights to the files containing code from both. * updated copyright statement as per Rory's request * moved the static datasets into a script to generate them at runtime * 1. memory usage print when silent=0 2. tests/ and test/ folder organization 3. removal of the dependency of googletest for just building xgboost 4. coding style updates for .cuh as well * Fixes for compilation warnings * add cuda object files as well when JVM_BINDINGS=ON	2017-06-06 09:39:53 +12:00
AbdealiJK	be0f55d563	tests/cpp: Add tests for SimpleDMatrix	2016-12-04 11:25:57 -08:00

28 Commits