xgboost

Author	SHA1	Message	Date
Jiaming Yuan	7bccc1ea2c	[EM] CPU implementation for external memory QDM. (#10682 ) - A new DMatrix type. - Extract common code into a new QDM base class. Not yet working: - Not exposed to the interface yet, will wait for the GPU implementation. - ~No meta info yet, still working on the source.~ - Exporting data to CSR is not supported yet.	2024-08-09 09:38:02 +08:00
Jiaming Yuan	b2cae34a8e	Fix integer overflow. (#10615 )	2024-07-23 02:13:15 +08:00
Jiaming Yuan	cb62f9e73b	[EM] Prevent init with CUDA malloc resource. (#10606 )	2024-07-21 05:08:29 +08:00
Jiaming Yuan	292bb677e5	[EM] Support mmap backed ellpack. (#10602 ) - Support resource view in ellpack. - Define the CUDA version of MMAP resource. - Define the CUDA version of malloc resource. - Refactor cuda runtime API wrappers, and add memory access related wrappers. - gather windows macros into a single header.	2024-07-18 08:20:21 +08:00
Dmitry Razdoburdin	513d7a7d84	[sycl] Reorder if-else statements to allow using of cpu branches for sycl-devices (#10543 ) * reorder if-else statements for sycl compatibility * trigger check --------- Co-authored-by: Dmitry Razdoburdin <>	2024-07-05 16:31:48 +08:00
Jiaming Yuan	628411a654	Enhance the threadpool implementation. (#10531 ) - Accept an initialization function. - Support void return tasks.	2024-07-03 12:13:27 +08:00
Jiaming Yuan	9cb4c938da	[EM] Move prefetch in reset into the end of the iteration. (#10529 )	2024-07-03 03:48:18 +08:00
Jiaming Yuan	e8a962575a	[EM] Allow staging ellpack on host for GPU external memory. (#10488 ) - New parameter `on_host`. - Abstract format creation and stream creation into policy classes.	2024-06-28 04:42:18 +08:00
Jiaming Yuan	e5f1720656	[EM] Avoid writing cut matrix to cache. (#10444 )	2024-06-19 18:03:38 +08:00
Jiaming Yuan	b4cc350ec5	Fix categorical data with external memory. (#10433 )	2024-06-18 04:34:54 +08:00
Jiaming Yuan	49e25cfb36	Allow unaligned pointer if the array is empty. (#10418 ) Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2024-06-15 19:10:21 +08:00
Jiaming Yuan	0808e50ae8	Sync stream in ellpack format. (#10374 )	2024-06-04 12:58:26 +08:00
Jiaming Yuan	d2d01d977a	Remove unnecessary fetch operations in external memory. (#10342 )	2024-05-31 13:16:40 +08:00
Jiaming Yuan	a5a58102e5	Revamp the rabit implementation. (#10112 ) This PR replaces the original RABIT implementation with a new one, which has already been partially merged into XGBoost. The new one features: - Federated learning for both CPU and GPU. - NCCL. - More data types. - A unified interface for all the underlying implementations. - Improved timeout handling for both tracker and workers. - Exhausted tests with metrics (fixed a couple of bugs along the way). - A reusable tracker for Python and JVM packages.	2024-05-20 11:56:23 +08:00
Jiaming Yuan	835e59e538	Use a thread pool for external memory. (#10288 )	2024-05-16 19:32:12 +08:00
Jiaming Yuan	1022909bbe	Fix global config for external memory. (#10173 ) Pass the thread-local configuration between threads.	2024-04-11 01:29:28 +08:00
Jiaming Yuan	230010d9a0	Cleanup set info. (#10139 ) - Use the array interface internally. - Deprecate `XGDMatrixSetDenseInfo`. - Deprecate `XGDMatrixSetUIntInfo`. - Move the handling of `DataType` into the deprecated C function. --------- Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2024-03-26 23:26:24 +08:00
Jiaming Yuan	53fc17578f	Use `std::uint64_t` for row index. (#10120 ) - Use std::uint64_t instead of size_t to avoid implementation-defined type. - Rename to bst_idx_t, to account for other types of indexing. - Small cleanup to the base header.	2024-03-15 18:43:49 +08:00
Jiaming Yuan	56b1868278	Fix compilation with the latest ctk. (#10123 )	2024-03-15 08:04:41 +08:00
Philip Hyunsu Cho	4dfbe2a893	[CI] Test building for 32-bit arch (#10021 ) * [CI] Test building for 32-bit arch * Update CMakeLists.txt * Fix yaml * Use Debian container * Remove -Werror for 32-bit * Revert "Remove -Werror for 32-bit" This reverts commit c652bc6a037361bcceaf56fb01863210b462793d. * Don't error for overloaded-virtual warning * Ignore some warnings from dmlc-core * Fix compiler warnings * Fix formatting * Apply suggestions from code review Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com> * Add more cast --------- Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2024-01-31 13:20:51 -08:00
Jiaming Yuan	a76d6c6131	Fix cpp deprecation. (#10010 )	2024-01-26 02:13:40 +08:00
Philip Hyunsu Cho	c8f5d190c6	[CI] Stop Windows pipeline upon a failing pytest (#10003 )	2024-01-24 22:54:21 -08:00
Jiaming Yuan	c03a4d5088	Check support status for categorical features. (#9946 )	2024-01-04 16:51:33 +08:00
david-cortes	3c004a4145	[R] Add missing DMatrix functions (#9929 ) * `XGDMatrixGetQuantileCut` * `XGDMatrixNumNonMissing` * `XGDMatrixGetDataAsCSR` --------- Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2024-01-03 17:29:21 +08:00
Jiaming Yuan	faf0f2df10	Support dataframe data format in native XGBoost. (#9828 ) - Implement a columnar adapter. - Refactor Python pandas handling code to avoid converting into a single numpy array. - Add support in R for transforming columns. - Support R data.frame and factor type.	2023-12-12 09:56:31 +08:00
Jiaming Yuan	06bdc15e9b	[coll] Pass context to various functions. (#9772 ) * [coll] Pass context to various functions. In the future, the `Context` object would be required for collective operations, this PR passes the context object to some required functions to prepare for swapping out the implementation.	2023-11-08 09:54:05 +08:00
Jiaming Yuan	48ac9b6cbe	[coll] Allreduce. (#9679 )	2023-10-17 13:57:14 +08:00
Rong Ou	da6803b75b	Support column-wise data split with in-memory inputs (#9628 ) --------- Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2023-10-17 12:16:39 +08:00
Rong Ou	e164d51c43	Improve allgather functions (#9649 )	2023-10-12 23:31:43 +08:00
Rong Ou	0ecb4de963	[breaking] Change DMatrix construction to be distributed (#9623 ) * Change column-split DMatrix construction to be distributed * remove splitting code for row split	2023-10-10 23:35:57 +08:00
Jiaming Yuan	d95be1c38d	Small cleanup to jvm iter adapter. (#9616 ) - Remove header dependency on c_api - Remove remaining code for arrow.	2023-09-29 00:39:07 +08:00
Jiaming Yuan	60526100e3	Support arrow through pandas ext types. (#9612 ) - Use pandas extension type for pyarrow support. - Additional support for QDM. - Additional support for inplace_predict.	2023-09-28 17:00:16 +08:00
Jiaming Yuan	1167e6c554	Limit the number of threads for external memory. (#9605 )	2023-09-24 00:30:28 +08:00
Jiaming Yuan	cac2cd2e94	[R] Set number of threads in demos and tests. (#9591 ) - Restrict the number of threads in IO. - Specify the number of threads in demos and tests. - Add helper scripts for checks.	2023-09-23 21:44:03 +08:00
Jiaming Yuan	8c676c889d	Remove internal use of gpu_id. (#9568 )	2023-09-20 23:29:51 +08:00
Jiaming Yuan	adea842c83	Fix inplace predict with fallback when base margin is used. (#9536 ) - Copy meta info from proxy DMatrix. - Use `std::call_once` to emit less warnings.	2023-09-05 01:04:24 +08:00
Jiaming Yuan	ddf2e68821	Use the new `DeviceOrd` in the linalg module. (#9527 )	2023-08-29 13:37:29 +08:00
Jiaming Yuan	972730cde0	Use matrix for gradient. (#9508 ) - Use the `linalg::Matrix` for storing gradients. - New API for the custom objective. - Custom objective for multi-class/multi-target is now required to return the correct shape. - Custom objective for Python can accept arrays with any strides. (row-major, column-major)	2023-08-24 05:29:52 +08:00
Jiaming Yuan	bb56183396	Normalize file system path. (#9463 )	2023-08-11 21:26:46 +08:00
Jiaming Yuan	f05294a6f2	Fix clang warnings. (#9447 ) - static function in header. (which is marked as unused due to translation unit visibility). - Implicit copy operator is deprecated. - Unused lambda capture. - Moving a temporary variable prevents copy elision.	2023-08-09 15:34:45 +08:00
Jiaming Yuan	54029a59af	Bound the size of the histogram cache. (#9440 ) - A new histogram collection with a limit in size. - Unify histogram building logic between hist, multi-hist, and approx.	2023-08-08 03:21:26 +08:00
Jiaming Yuan	912e341d57	Initial GPU support for the approx tree method. (#9414 )	2023-07-31 15:50:28 +08:00
Jiaming Yuan	a196443a07	Implement sketching with Hessian on GPU. (#9399 ) - Prepare for implementing approx on GPU. - Unify the code path between weighted and uniform sketching on DMatrix.	2023-07-24 15:43:03 +08:00
Jiaming Yuan	275da176ba	Document for device ordinal. (#9398 ) - Rewrite GPU demos. notebook is converted to script to avoid committing additional png plots. - Add GPU demos into the sphinx gallery. - Add RMM demos into the sphinx gallery. - Test for firing threads with different device ordinals.	2023-07-22 15:26:29 +08:00
Jiaming Yuan	04aff3af8e	Define the new `device` parameter. (#9362 )	2023-07-13 19:30:25 +08:00
Rong Ou	3632242e0b	Support column split with GPU quantile (#9370 )	2023-07-11 12:15:56 +08:00
Jiaming Yuan	20c52f07d2	Support exporting cut values (#9356 )	2023-07-08 15:32:41 +08:00
Jiaming Yuan	59787b23af	Allow empty page in external memory. (#9361 )	2023-07-08 09:24:35 +08:00
Jiaming Yuan	41c6813496	Preserve order of saved updaters config. (#9355 ) - Save the updater sequence as an array instead of object. - Warn only once. The compatibility is kept, but we should be able to break it as the config is not loaded in pickle model and it's declared to be not stable.	2023-07-05 20:20:07 +08:00
Jiaming Yuan	645037e376	Improve test coverage with predictor configuration. (#9354 ) * Improve test coverage with predictor configuration. - Test with ext memory. - Test with QDM. - Test with dart.	2023-07-05 15:17:22 +08:00

359 Commits