xgboost

Author	SHA1	Message	Date
Jiaming Yuan	24241ed6e3	[EM] Compress dense ellpack. (#10821) This helps reduce the memory copying needed for dense data. In addition, it helps reduce memory usage even if external memory is not used. - Decouple the number of symbols needed in the compressor with the number of features when the data is dense. - Remove the fetch call in the `at_end_` iteration. - Reduce synchronization and kernel launches by using the `uvector` and ctx.	2024-09-20 18:20:56 +08:00
Jiaming Yuan	5f7f31d464	[EM] Refactor ellpack construction. (#10810) - Remove the calculation of n_symbols in the accessor. - Pack initialization steps into the parameter list. - Pass the context into various ctors. - Specialization for dense data to prepare for further compression.	2024-09-09 14:10:10 +08:00
Jiaming Yuan	bde1265caf	[EM] Return a full DMatrix instead of an Ellpack from the GPU sampler. (#10753)	2024-08-28 01:05:11 +08:00
Jiaming Yuan	8d7fe262d9	[EM] Enable access to the number of batches. (#10691) - Expose `NumBatches` in `DMatrix`. - Small cleanup for removing legacy CUDA stream and ~force CUDA context initialization~. - Purge old external memory data generation code.	2024-08-17 02:59:45 +08:00
Jiaming Yuan	292bb677e5	[EM] Support mmap backed ellpack. (#10602) - Support resource view in ellpack. - Define the CUDA version of MMAP resource. - Define the CUDA version of malloc resource. - Refactor cuda runtime API wrappers, and add memory access related wrappers. - Gather Windows macros into a single header.	2024-07-18 08:20:21 +08:00
Jiaming Yuan	e8a962575a	[EM] Allow staging ellpack on host for GPU external memory. (#10488) - New parameter `on_host`. - Abstract format creation and stream creation into policy classes.	2024-06-28 04:42:18 +08:00
Jiaming Yuan	e5f1720656	[EM] Avoid writing cut matrix to cache. (#10444)	2024-06-19 18:03:38 +08:00
Jiaming Yuan	8c676c889d	Remove internal use of gpu_id. (#9568)	2023-09-20 23:29:51 +08:00
Jiaming Yuan	20c52f07d2	Support exporting cut values (#9356)	2023-07-08 15:32:41 +08:00
Jiaming Yuan	08ce495b5d	Use Booster context in DMatrix. (#8896) - Pass context from booster to DMatrix. - Use context instead of integer for `n_threads`. - Check the consistency configuration for `max_bin`. - Test for all combinations of initialization options.	2023-04-28 21:47:14 +08:00
Jiaming Yuan	16bca5d4a1	Support CPU input for device `QuantileDMatrix`. (#8136) - Copy `GHistIndexMatrix` to `Ellpack` when needed.	2022-08-11 21:21:26 +08:00
Jiaming Yuan	2775c2a1ab	Prepare external memory support for hist. (#7638) This PR prepares the GHistIndexMatrix to host the column matrix which is used by the hist tree method by accepting sparse_threshold parameter. Some cleanups are made to ensure the correct batch param is being passed into DMatrix along with some additional tests for correctness of SimpleDMatrix.	2022-02-10 16:58:02 +08:00
Jiaming Yuan	bd1f3a38f0	Rewrite sparse dmatrix using callbacks. (#7092) - Reduce dependency on dmlc parsers and provide an interface for users to load data by themselves. - Remove use of threaded iterator and IO queue. - Remove `page_size`. - Make sure the number of pages in memory is bounded. - Make sure the cache can not be violated. - Provide an interface for internal algorithms to process data asynchronously.	2021-07-16 12:33:31 +08:00
Jiaming Yuan	1c8fdf2218	Remove use of `device_idx` in `dh::LaunchN`. (#7063) It's an unused parameter, removing it can make the CI log more readable.	2021-06-29 11:37:26 +08:00
Jiaming Yuan	bed7ae4083	Loop over `thrust::reduce`. (#6229) * Check input chunk size of dqdm. * Add doc for current limitation.	2020-10-14 10:40:56 +13:00
Jiaming Yuan	14afdb4d92	Support categorical data in ellpack. (#6140)	2020-09-24 19:28:57 +08:00
Jiaming Yuan	6671b42dd4	Use ellpack for prediction only when sparsepage doesn't exist. (#5504)	2020-04-10 12:15:46 +08:00
Jiaming Yuan	0012f2ef93	Upgrade clang-tidy on CI. (#5469) * Correct all clang-tidy errors. * Upgrade clang-tidy to 10 on CI. Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-04-05 04:42:29 +08:00
Jiaming Yuan	459b175dc6	Split up test helpers header. (#5455)	2020-04-03 10:36:53 +08:00
Jiaming Yuan	4942da64ae	Refactor tests with data generator. (#5439)	2020-03-27 06:44:44 +08:00
Rory Mitchell	b745b7acce	Fix memory usage of device sketching (#5407)	2020-03-14 13:43:24 +13:00
Rory Mitchell	3ad4333b0e	Partial rewrite EllpackPage (#5352)	2020-03-11 10:15:53 +13:00
Jiaming Yuan	655cf17b60	Predict on Ellpack. (#5327) * Unify GPU prediction node. * Add `PageExists`. * Dispatch prediction on input data for GPU Predictor.	2020-02-23 06:27:03 +08:00
Rong Ou	e4b74c4d22	Gradient based sampling for GPU Hist (#5093) * Implement gradient based sampling for GPU Hist tree method. * Add samplers and handle compacted page in GPU Hist.	2020-02-04 10:31:27 +08:00
Rong Ou	5b1715d97c	Write ELLPACK pages to disk (#4879) * add ellpack source * add batch param * extract function to parse cache info * construct ellpack info separately * push batch to ellpack page * write ellpack page. * make sparse page source reusable	2019-10-22 23:44:32 -04:00
Rong Ou	125bcec62e	Move ellpack page construction into DMatrix (#4833)	2019-09-16 23:50:55 -04:00

26 Commits