xgboost

Author	SHA1	Message	Date
Jiaming Yuan	e228c1a121	[EM] Make page concatenation optional. (#10826 ) This PR introduces a new parameter `extmem_concat_pages` to make the page concatenation optional for GPU hist. In addition, the document is updated for the new GPU-based external memory.	2024-09-24 06:19:28 +08:00
Jiaming Yuan	2a37a8880c	Check correct dump format for gblinear. (#10831 )	2024-09-21 00:32:52 +08:00
Jiaming Yuan	24241ed6e3	[EM] Compress dense ellpack. (#10821 ) This helps reduce the memory copying needed for dense data. In addition, it helps reduce memory usage even if external memory is not used. - Decouple the number of symbols needed in the compressor with the number of features when the data is dense. - Remove the fetch call in the `at_end_` iteration. - Reduce synchronization and kernel launches by using the `uvector` and ctx.	2024-09-20 18:20:56 +08:00
Jiaming Yuan	96bbf80457	[EM] Support quantile objectives for GPU-based external memory. (#10820 ) - Improved error message for memory usage. - Support quantile-based objectives for GPU external memory.	2024-09-17 13:27:02 +08:00
Jiaming Yuan	d94f6679fc	[EM] Avoid synchronous calls and unnecessary ATS access. (#10811 ) - Pass context into various functions. - Factor out some CUDA algorithms. - Use ATS only for update position.	2024-09-10 14:33:14 +08:00
Jiaming Yuan	ed5f33df16	[EM] Multi-level quantile sketching for GPU. (#10813 )	2024-09-10 13:08:34 +08:00
Jiaming Yuan	5f7f31d464	[EM] Refactor ellpack construction. (#10810 ) - Remove the calculation of n_symbols in the accessor. - Pack initialization steps into the parameter list. - Pass the context into various ctors. - Specialization for dense data to prepare for further compression.	2024-09-09 14:10:10 +08:00
Jiaming Yuan	e1a2c1bbb3	[EM] Merge GPU partitioning with histogram building. (#10766 ) - Stop concatenating pages if there's no subsampling. - Use a single iteration for histogram build and partitioning.	2024-08-31 03:25:37 +08:00
Jiaming Yuan	98ac153265	Avoid warning from NVCC. (#10757 )	2024-08-30 16:11:31 +08:00
Jiaming Yuan	34d4ab455e	[EM] Avoid stream sync in quantile sketching. (#10765 )	2024-08-30 12:33:24 +08:00
Jiaming Yuan	61dd854a52	[EM] Refactor GPU histogram builder. (#10764 ) - Expose the maximum number of cached nodes to be consistent with the CPU implementation. Also easier for testing. - Extract the subtraction trick for easier testing. - Split up the `GradientQuantiser` to avoid circular dependency.	2024-08-30 02:39:14 +08:00
Jiaming Yuan	4fe67f10b4	[EM] Have one partitioner for each batch. (#10760 ) - Initialize one partitioner for each batch. - Collect partition size during initialization. - Support base ridx in the finalization.	2024-08-29 01:35:17 +08:00
Jiaming Yuan	64afe9873b	Increase timeout in C++ tests from 1 to 5 seconds. (#10756 ) To avoid CI failures on FreeBSD.	2024-08-28 02:27:14 +08:00
Jiaming Yuan	bde1265caf	[EM] Return a full DMatrix instead of an Ellpack from the GPU sampler. (#10753 )	2024-08-28 01:05:11 +08:00
Jiaming Yuan	d6ebcfb032	[EM] Support CPU quantile objective for external memory. (#10751 )	2024-08-27 04:16:57 +08:00
Jiaming Yuan	25966e4ba8	[EM] Pass batch parameter into extmem format. (#10736 ) - Allow customization for format reading. - Customize the number of pre-fetch batches.	2024-08-27 02:37:50 +08:00
Jiaming Yuan	fd0138c91c	[coll] Improve column split tests with named threads. (#10735 )	2024-08-24 12:43:47 +08:00
Jiaming Yuan	55aef8f546	[EM] Avoid resizing host cache. (#10734 ) - Add SAM allocator and resource. - Use page-based cache instead of stream-based cache.	2024-08-23 06:34:01 +08:00
Jiaming Yuan	142bdc73ec	[EM] Support SHAP contribution with QDM. (#10724 ) - Add GPU support. - Add external memory support. - Update the GPU tree shap.	2024-08-22 05:25:10 +08:00
Jiaming Yuan	cb54374550	Update clang-tidy. (#10730 ) - Install cmake using pip. - Fix compile command generation. - Clean up the tidy script and remove the need to load the yaml file. - Fix modernized type traits. - Fix span class. Polymorphism support is dropped	2024-08-22 04:12:18 +08:00
Dmitry Razdoburdin	24d225c1ab	[SYCL] Implement UpdatePredictionCache and connect updater with learner. (#10701 ) --------- Co-authored-by: Dmitry Razdoburdin <>	2024-08-22 02:07:44 +08:00
Jiaming Yuan	402e7837fb	Fix potential race in feature constraint. (#10719 )	2024-08-21 16:50:31 +08:00
Jiaming Yuan	508ac13243	Check cub errors. (#10721 ) - Make sure cuda error returned by cub scan is caught. - Avoid temporary buffer allocation in thrust device vector.	2024-08-21 02:50:26 +08:00
Jiaming Yuan	ec3f327c20	Add managed memory allocator. (#10711 )	2024-08-17 03:02:34 +08:00
Jiaming Yuan	8d7fe262d9	[EM] Enable access to the number of batches. (#10691 ) - Expose `NumBatches` in `DMatrix`. - Small cleanup for removing legacy CUDA stream and ~force CUDA context initialization~. - Purge old external memory data generation code.	2024-08-17 02:59:45 +08:00
Jiaming Yuan	abe65e3769	Reduce thread contention in column split histogram test. (#10708 )	2024-08-17 01:00:32 +08:00
Jiaming Yuan	582ea104b5	[EM] Enable prediction cache for GPU. (#10707 ) - Use `UpdatePosition` for all nodes and skip `FinalizePosition` when external memory is used. - Create `encode/decode` for node position, this is just as a refactor. - Reuse code between update position and finalization.	2024-08-15 21:41:59 +08:00
Dmitry Razdoburdin	773ded684b	[sycl] Add depth-wise policy (#10690 ) Co-authored-by: Dmitry Razdoburdin <>	2024-08-13 18:12:35 +08:00
Jiaming Yuan	2ecc85ffad	[EM] Support ExtMemQdm in the GPU predictor. (#10694 )	2024-08-13 12:21:11 +08:00
Jiaming Yuan	43704549a2	[coll] Reduce the amount of open files (socket). (#10693 ) Reduce the chance of hitting `Failed to call `socket`: Too many open files`.	2024-08-13 05:23:49 +08:00
Jiaming Yuan	d414fdf2e7	[EM] Add GPU version of the external memory QDM. (#10689 )	2024-08-10 10:49:43 +08:00
Jiaming Yuan	7bccc1ea2c	[EM] CPU implementation for external memory QDM. (#10682 ) - A new DMatrix type. - Extract common code into a new QDM base class. Not yet working: - Not exposed to the interface yet, will wait for the GPU implementation. - ~No meta info yet, still working on the source.~ - Exporting data to CSR is not supported yet.	2024-08-09 09:38:02 +08:00
Dmitry Razdoburdin	e555a238bc	[SYCL]. Add implementation for loss-guided policy (#10681 ) --------- Co-authored-by: Dmitry Razdoburdin <>	2024-08-09 09:04:46 +08:00
Jiaming Yuan	cc3b56fc37	Cleanup GPU Hist tests. (#10677 ) - Remove GPU Hist gradient sampling test. The same properties are tested in the gradient sampler test suite. - Move basic histogram tests into the histogram test suite. - Remove the header inclusion of the `updater_gpu_hist.cu` in tests.	2024-08-06 11:50:44 +08:00
Jiaming Yuan	77c844cef7	Reduce thread contention in column split tests. (#10658 ) --------- Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2024-08-01 18:36:46 +08:00
Dmitry Razdoburdin	7720272870	[sycl] add split applications and tests (#10636 ) Co-authored-by: Dmitry Razdoburdin <>	2024-07-26 15:25:49 +08:00
Jiaming Yuan	a19bbc9be5	Avoid caching allocator for large allocations. (#10582 )	2024-07-23 03:48:03 +08:00
Dmitry Razdoburdin	f6cae4da85	[SYCL] Add splits evaluation (#10605 ) --------- Co-authored-by: Dmitry Razdoburdin <>	2024-07-22 18:14:06 +08:00
Jiaming Yuan	6d9fcb771e	Move device histogram storage into `histogram.cuh`. (#10608 )	2024-07-21 14:10:13 +08:00
Jiaming Yuan	292bb677e5	[EM] Support mmap backed ellpack. (#10602 ) - Support resource view in ellpack. - Define the CUDA version of MMAP resource. - Define the CUDA version of malloc resource. - Refactor cuda runtime API wrappers, and add memory access related wrappers. - gather windows macros into a single header.	2024-07-18 08:20:21 +08:00
Jiaming Yuan	e9fbce9791	Refactor `DeviceUVector`. (#10595 ) Create a wrapper instead of using inheritance to avoid inconsistent interface of the class.	2024-07-18 03:33:01 +08:00
Jiaming Yuan	a6a8a55ffa	Merge approx tests. (#10583 )	2024-07-16 19:03:48 +08:00
Jiaming Yuan	6c403187ec	Fix column split race condition. (#10572 )	2024-07-12 01:07:12 +08:00
Jiaming Yuan	1ca4bfd20e	Avoid thrust vector initialization. (#10544 ) - Add a wrapper for rmm device uvector. - Split up the `Resize` method for HDV.	2024-07-11 17:29:27 +08:00
Jiaming Yuan	89da9f9741	[fed] Split up federated test CMake file. (#10566 ) - Collect all federated test files into the same directory. - Independently list the files.	2024-07-11 13:09:18 +08:00
Jiaming Yuan	5f910cd4ff	[EM] Handle base idx in GPU histogram. (#10549 )	2024-07-11 03:26:30 +08:00
Jiaming Yuan	34b154c284	Avoid the use of size_t in the partitioner. (#10541 ) - Use `Span` instead of `Elem` where `node_id` is not needed. - Remove the `const_cast`. - Make sure the constness is not removed in the `Elem` by making it reference only. size_t is implementation-defined, which causes issues when we want to pass a pointer or span.	2024-07-11 00:43:08 +08:00
Jiaming Yuan	620b2b155a	Cache GPU histogram kernel configuration. (#10538 )	2024-07-04 15:38:59 +08:00
Jiaming Yuan	628411a654	Enhance the threadpool implementation. (#10531 ) - Accept an initialization function. - Support void return tasks.	2024-07-03 12:13:27 +08:00
Jiaming Yuan	9cb4c938da	[EM] Move prefetch in reset into the end of the iteration. (#10529 )	2024-07-03 03:48:18 +08:00


793 Commits