28 Commits

Author SHA1 Message Date
Jiaming Yuan
5f7f31d464
[EM] Refactor ellpack construction. (#10810)
- Remove the calculation of n_symbols in the accessor.
- Pack initialization steps into the parameter list.
- Pass the context into various ctors.
- Specialization for dense data to prepare for further compression.
2024-09-09 14:10:10 +08:00
Jiaming Yuan
61dd854a52
[EM] Refactor GPU histogram builder. (#10764)
- Expose the maximum number of cached nodes to be consistent with the CPU implementation. This also makes testing easier.
- Extract the subtraction trick for easier testing.
- Split up the `GradientQuantiser` to avoid circular dependency.
2024-08-30 02:39:14 +08:00
Jiaming Yuan
cc3b56fc37
Cleanup GPU Hist tests. (#10677)
- Remove GPU Hist gradient sampling test. The same properties are tested in the gradient
  sampler test suite.
- Move basic histogram tests into the histogram test suite.
- Remove the header inclusion of the `updater_gpu_hist.cu` in tests.
2024-08-06 11:50:44 +08:00
Jiaming Yuan
a19bbc9be5
Avoid caching allocator for large allocations. (#10582) 2024-07-23 03:48:03 +08:00
Jiaming Yuan
6d9fcb771e
Move device histogram storage into histogram.cuh. (#10608) 2024-07-21 14:10:13 +08:00
Jiaming Yuan
292bb677e5
[EM] Support mmap backed ellpack. (#10602)
- Support resource view in ellpack.
- Define the CUDA version of MMAP resource.
- Define the CUDA version of malloc resource.
- Refactor cuda runtime API wrappers, and add memory access related wrappers.
- Gather Windows macros into a single header.
2024-07-18 08:20:21 +08:00
Jiaming Yuan
5f910cd4ff
[EM] Handle base idx in GPU histogram. (#10549) 2024-07-11 03:26:30 +08:00
Jiaming Yuan
620b2b155a
Cache GPU histogram kernel configuration. (#10538) 2024-07-04 15:38:59 +08:00
Jiaming Yuan
230010d9a0
Cleanup set info. (#10139)
- Use the array interface internally.
- Deprecate `XGDMatrixSetDenseInfo`.
- Deprecate `XGDMatrixSetUIntInfo`.
- Move the handling of `DataType` into the deprecated C function.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2024-03-26 23:26:24 +08:00
Jiaming Yuan
06bdc15e9b
[coll] Pass context to various functions. (#9772)
In the future, the `Context` object will be required for collective operations. This PR
passes the context object to some of the functions that need it, in preparation for
swapping out the implementation.
2023-11-08 09:54:05 +08:00
Jiaming Yuan
8c676c889d
Remove internal use of gpu_id. (#9568) 2023-09-20 23:29:51 +08:00
Rong Ou
9bab06cbca
Support column split in gpu hist updater (#9384) 2023-08-31 18:09:35 +08:00
Jiaming Yuan
08ce495b5d
Use Booster context in DMatrix. (#8896)
- Pass context from booster to DMatrix.
- Use context instead of integer for `n_threads`.
- Check that the configuration for `max_bin` is consistent.
- Test for all combinations of initialization options.
2023-04-28 21:47:14 +08:00
Jiaming Yuan
c6a8754c62
Define CUDA Context. (#8604)
We will transition to a non-default, non-blocking CUDA stream.
2022-12-20 15:15:07 +08:00
Rory Mitchell
210915c985
Use integer gradients in gpu_hist split evaluation (#8274) 2022-10-11 12:16:27 +02:00
Rory Mitchell
8f77677193
Use quantised gradients in gpu_hist histograms (#8246) 2022-09-26 17:35:35 +02:00
Jiaming Yuan
142a208a90
Fix compiler warnings. (#8022)
- Remove/fix unused parameters.
- Remove deprecated code in rabit.
- Update dmlc-core.
2022-06-22 21:29:10 +08:00
Rory Mitchell
71d3b2e036
Fuse gpu_hist all-reduce calls where possible (#7867) 2022-05-17 13:27:50 +02:00
Jiaming Yuan
d7d1b6e3a6
CPU evaluation for cat data. (#7393)
* Implementation for one-hot-based evaluation.
* Implementation for partition-based evaluation. (LightGBM)
2021-11-06 14:41:35 +08:00
Jiaming Yuan
bd1f3a38f0
Rewrite sparse dmatrix using callbacks. (#7092)
- Reduce dependency on dmlc parsers and provide an interface for users to load data by themselves.
- Remove use of threaded iterator and IO queue.
- Remove `page_size`.
- Make sure the number of pages in memory is bounded.
- Make sure the cache cannot be violated.
- Provide an interface for internal algorithms to process data asynchronously.
2021-07-16 12:33:31 +08:00
Jiaming Yuan
14afdb4d92
Support categorical data in ellpack. (#6140) 2020-09-24 19:28:57 +08:00
Andy Adinets
ac3f0e78dc
Split Features into Groups to Compute Histograms in Shared Memory (#5795) 2020-07-07 15:04:35 +12:00
Andy Adinets
73142041b9
For histograms, opting into maximum shared memory available per block. (#5491) 2020-04-21 14:56:42 +12:00
Jiaming Yuan
6671b42dd4
Use ellpack for prediction only when sparsepage doesn't exist. (#5504) 2020-04-10 12:15:46 +08:00
Jiaming Yuan
4942da64ae
Refactor tests with data generator. (#5439) 2020-03-27 06:44:44 +08:00
Rory Mitchell
b745b7acce
Fix memory usage of device sketching (#5407) 2020-03-14 13:43:24 +13:00
Rory Mitchell
3ad4333b0e
Partial rewrite EllpackPage (#5352) 2020-03-11 10:15:53 +13:00
Jiaming Yuan
8d06878bf9
Deterministic GPU histogram. (#5361)
* Use a pre-rounding-based method to obtain reproducible floating-point
  summation.
* GPU Hist results for regression and classification are bit-by-bit reproducible.
* Add doc.
* Switch to thrust reduce for `node_sum_gradient`.
2020-03-04 15:13:28 +08:00
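
The pre-rounding idea mentioned in the last entry above can be illustrated with a small sketch. This is not the code from #5361: it is a minimal Python illustration of the general technique, using float64 rather than GPU single precision, and the names `rounding_factor`, `pre_round`, and `reproducible_sum` are hypothetical. Each value is snapped onto a fixed grid chosen from the largest magnitude and the number of terms, so every partial sum is exactly representable and the total does not depend on the summation order. A small amount of precision is traded for that reproducibility.

```python
import math
import random


def rounding_factor(max_abs: float, n: int) -> float:
    """Hypothetical helper: pick R so that (R + x) - R snaps x onto a fixed grid."""
    # frexp gives max_abs = f * 2**m with 0.5 <= f < 1, hence max_abs < 2**m.
    _, m = math.frexp(max_abs)
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1.
    k = m + (n - 1).bit_length() + 1
    # 1.5 * 2**k keeps R + x inside [2**k, 2**(k + 1)], where float spacing is constant.
    return math.ldexp(1.5, k)


def pre_round(x: float, r: float) -> float:
    """Round x to the nearest multiple of the float spacing around r."""
    return (r + x) - r


def reproducible_sum(values):
    """Sum finite floats so that the result is independent of their order."""
    if not values:
        return 0.0
    max_abs = max(abs(v) for v in values)
    r = rounding_factor(max_abs, len(values))
    total = 0.0
    for v in values:
        # Every pre-rounded term and every partial sum lies on the same grid,
        # so each addition here is exact.
        total += pre_round(v, r)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
    shuffled = data[:]
    rng.shuffle(shuffled)
    # Compare the two orderings bit for bit.
    print(reproducible_sum(data) == reproducible_sum(shuffled))
```

The sketch assumes finite inputs and does not guard against overflow for very large magnitudes. In a parallel setting the maximum magnitude and element count would have to be agreed on before the summation starts, since they determine the grid.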