177 Commits

Author SHA1 Message Date
Jiaming Yuan
55aef8f546
[EM] Avoid resizing host cache. (#10734)
* [EM] Avoid resizing host cache.

- Add SAM allocator and resource.
- Use page-based cache instead of stream-based cache.
2024-08-23 06:34:01 +08:00
Jiaming Yuan
142bdc73ec
[EM] Support SHAP contribution with QDM. (#10724)
- Add GPU support.
- Add external memory support.
- Update the GPU tree shap.
2024-08-22 05:25:10 +08:00
Jiaming Yuan
cb54374550
Update clang-tidy. (#10730)
- Install cmake using pip.
- Fix compile command generation.
- Clean up the tidy script and remove the need to load the yaml file.
- Fix modernized type traits.
- Fix span class. Polymorphism support is dropped
2024-08-22 04:12:18 +08:00
Jiaming Yuan
8d7fe262d9
[EM] Enable access to the number of batches. (#10691)
- Expose `NumBatches` in `DMatrix`.
- Small cleanup for removing legacy CUDA stream and ~force CUDA context initialization~.
- Purge old external memory data generation code.
2024-08-17 02:59:45 +08:00
Jiaming Yuan
d414fdf2e7
[EM] Add GPU version of the external memory QDM. (#10689) 2024-08-10 10:49:43 +08:00
Jiaming Yuan
7bccc1ea2c
[EM] CPU implementation for external memory QDM. (#10682)
- A new DMatrix type.
- Extract common code into a new QDM base class.

Not yet working:
- Not exposed to the interface yet, will wait for the GPU implementation.
- ~No meta info yet, still working on the source.~
- Exporting data to CSR is not supported yet.
2024-08-09 09:38:02 +08:00
Jiaming Yuan
292bb677e5
[EM] Support mmap backed ellpack. (#10602)
- Support resource view in ellpack.
- Define the CUDA version of MMAP resource.
- Define the CUDA version of malloc resource.
- Refactor cuda runtime API wrappers, and add memory access related wrappers.
- gather windows macros into a single header.
2024-07-18 08:20:21 +08:00
Jiaming Yuan
1ca4bfd20e
Avoid thrust vector initialization. (#10544)
* Avoid thrust vector initialization.

- Add a wrapper for rmm device uvector.
- Split up the `Resize` method for HDV.
2024-07-11 17:29:27 +08:00
Jiaming Yuan
9cb4c938da
[EM] Move prefetch in reset into the end of the iteration. (#10529) 2024-07-03 03:48:18 +08:00
Jiaming Yuan
e8a962575a
[EM] Allow staging ellpack on host for GPU external memory. (#10488)
- New parameter `on_host`.
- Abstract format creation and stream creation into policy classes.
2024-06-28 04:42:18 +08:00
Jiaming Yuan
e5f1720656
[EM] Avoid writing cut matrix to cache. (#10444) 2024-06-19 18:03:38 +08:00
Jiaming Yuan
d2d01d977a
Remove unnecessary fetch operations in external memory. (#10342) 2024-05-31 13:16:40 +08:00
Jiaming Yuan
a5a58102e5
Revamp the rabit implementation. (#10112)
This PR replaces the original RABIT implementation with a new one, which has already been partially merged into XGBoost. The new one features:
- Federated learning for both CPU and GPU.
- NCCL.
- More data types.
- A unified interface for all the underlying implementations.
- Improved timeout handling for both tracker and workers.
- Exhausted tests with metrics (fixed a couple of bugs along the way).
- A reusable tracker for Python and JVM packages.
2024-05-20 11:56:23 +08:00
Jiaming Yuan
230010d9a0
Cleanup set info. (#10139)
- Use the array interface internally.
- Deprecate `XGDMatrixSetDenseInfo`.
- Deprecate `XGDMatrixSetUIntInfo`.
- Move the handling of `DataType` into the deprecated C function.

---------

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2024-03-26 23:26:24 +08:00
Jiaming Yuan
53fc17578f
Use std::uint64_t for row index. (#10120)
- Use std::uint64_t instead of size_t to avoid implementation-defined type.
- Rename to bst_idx_t, to account for other types of indexing.
- Small cleanup to the base header.
2024-03-15 18:43:49 +08:00
Jiaming Yuan
d07b7fe8c8
Small cleanup for mock tests. (#10085) 2024-03-04 23:32:11 +08:00
Jiaming Yuan
42de9206fc
Support multi-target, fit intercept for hinge. (#9850) 2023-12-08 05:50:41 +08:00
Rong Ou
da6803b75b
Support column-wise data split with in-memory inputs (#9628)
---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-10-17 12:16:39 +08:00
Rong Ou
0ecb4de963
[breaking] Change DMatrix construction to be distributed (#9623)
* Change column-split DMatrix construction to be distributed

* remove splitting code for row split
2023-10-10 23:35:57 +08:00
Jiaming Yuan
8c676c889d
Remove internal use of gpu_id. (#9568) 2023-09-20 23:29:51 +08:00
Jiaming Yuan
ddf2e68821
Use the new DeviceOrd in the linalg module. (#9527) 2023-08-29 13:37:29 +08:00
Jiaming Yuan
044fea1281
Drop support for loading remote files. (#9504) 2023-08-21 23:34:05 +08:00
Jiaming Yuan
f05a23b41c
Use weakref instead of id for DataIter cache. (#9445)
- Fix case where Python reuses id from freed objects.
- Small optimization to column matrix with QDM by using `realloc` instead of copying data.
2023-08-10 00:40:06 +08:00
Jiaming Yuan
04aff3af8e
Define the new device parameter. (#9362) 2023-07-13 19:30:25 +08:00
Jiaming Yuan
20c52f07d2
Support exporting cut values (#9356) 2023-07-08 15:32:41 +08:00
Jiaming Yuan
645037e376
Improve test coverage with predictor configuration. (#9354)
* Improve test coverage with predictor configuration.

- Test with ext memory.
- Test with QDM.
- Test with dart.
2023-07-05 15:17:22 +08:00
Jiaming Yuan
bc267dd729
Use ptr from mmap for GHistIndexMatrix and ColumnMatrix. (#9315)
* Use ptr from mmap for `GHistIndexMatrix` and `ColumnMatrix`.

- Define a resource for holding various types of memory pointers.
- Define ref vector for holding resources.
- Swap the underlying resources for GHist and ColumnM.
- Add documentation for current status.
- s390x support is removed. It should work if you can compile XGBoost, all the old workaround code does is to get GCC to compile.
2023-06-27 19:05:46 +08:00
Jiaming Yuan
152e2fb072
Unify test helpers for creating ctx. (#9274) 2023-06-10 03:35:22 +08:00
Jiaming Yuan
17fd3f55e9
Optimize adapter element counting on GPU. (#9209)
- Implement a simple `IterSpan` for passing iterators with size.
- Use shared memory for column size counts.
- Use one thread for each sample in row count to reduce atomic operations.
2023-05-30 23:28:43 +08:00
Jiaming Yuan
85988a3178
Wait for data CUDA stream instead of sync. (#9144)
---------

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2023-05-09 09:52:21 +08:00
Jiaming Yuan
08ce495b5d
Use Booster context in DMatrix. (#8896)
- Pass context from booster to DMatrix.
- Use context instead of integer for `n_threads`.
- Check the consistency configuration for `max_bin`.
- Test for all combinations of initialization options.
2023-04-28 21:47:14 +08:00
Jiaming Yuan
1f9a57d17b
[Breaking] Require format to be specified in input URI. (#9077)
Previously, we use `libsvm` as default when format is not specified. However, the dmlc
data parser is not particularly robust against errors, and the most common type of error
is undefined format.

Along with which, we will recommend users to use other data loader instead. We will
continue the maintenance of the parsers as it's currently used for many internal tests
including federated learning.
2023-04-28 19:45:15 +08:00
Rong Ou
b240f055d3
Support vertical federated learning (#8932) 2023-03-22 14:25:26 +08:00
Jiaming Yuan
36a7396658
Replace dmlc any with std any. (#8892) 2023-03-11 06:11:04 +08:00
Jiaming Yuan
4d665b3fb0
Restore clang tidy test. (#8861) 2023-03-03 13:47:04 -08:00
Jiaming Yuan
c0afdb6786
Fix CPU bin compression with categorical data. (#8809)
* Fix CPU bin compression with categorical data.

* The bug causes the maximum category to be lesser than 256 or the maximum number of bins when
the input data is dense.
2023-02-16 04:20:34 +08:00
Rong Ou
66191e9926
Support cpu quantile sketch with column-wise data split (#8742) 2023-02-05 14:26:24 +08:00
Jiaming Yuan
3760cede0f
Consistent use of context to specify number of threads. (#8733)
- Use context in all tests.
- Use context in R.
- Use context in C API DMatrix initialization. (0 threads is used as dft).
2023-01-30 15:25:31 +08:00
Jiaming Yuan
31b9cbab3d
Make sure input numpy array is aligned. (#8690)
- use `np.require` to specify that the alignment is required.
- scipy csr as well.
- validate input pointer in `ArrayInterface`.
2023-01-18 08:12:13 +08:00
Jiaming Yuan
07cf3d3e53
Fix threads in DMatrix slice. (#8667) 2023-01-14 07:16:57 +08:00
Rong Ou
3ceeb8c61c
Add data split mode to DMatrix MetaInfo (#8568) 2022-12-25 20:37:37 +08:00
Jiaming Yuan
3e26107a9c
Rename and extract Context. (#8528)
* Rename `GenericParameter` to `Context`.
* Rename header file to reflect the change.
* Rename all references.
2022-12-07 04:58:54 +08:00
Rong Ou
78d65a1928
Initial support for column-wise data split (#8468) 2022-12-04 01:37:51 +08:00
Jiaming Yuan
addaa63732
Support null value in CUDA array interface. (#8486)
* Support null value in CUDA array interface.

- Fix for potential null value in array interface.
- Fix incorrect check on mask stride.

* Simple tests.

* Extract mask.
2022-11-28 17:48:25 -08:00
Rong Ou
8e76f5f595
Use DataSplitMode to configure data loading (#8434)
* Use `DataSplitMode` to configure data loading
2022-11-08 16:21:50 +08:00
Jiaming Yuan
bc818316f2
Prepare for improving Windows networking compatibility. (#8234)
* Prepare for improving Windows networking compatibility.

* Include dmlc filesystem indirectly as dmlc/filesystem.h includes windows.h, which
  conflicts with winsock2.h
* Define `NOMINMAX` conditionally.
* Link the winsock library when mysys32 is used.
* Add config file for read the doc.
2022-09-10 15:16:49 +08:00
Jiaming Yuan
441ffc017a
Copy data from Ellpack to GHist. (#8215) 2022-09-06 23:05:49 +08:00
Jiaming Yuan
16bca5d4a1
Support CPU input for device QuantileDMatrix. (#8136)
- Copy `GHistIndexMatrix` to `Ellpack` when needed.
2022-08-11 21:21:26 +08:00
Jiaming Yuan
2c70751d1e
Implement iterative DMatrix for CPU. (#8116) 2022-07-26 22:34:21 +08:00
Jiaming Yuan
4a4e5c7c18
Prepare gradient index for Quantile DMatrix. (#8103)
* Prepare gradient index for Quantile DMatrix.

- Implement push batch with adapter batch.
- Implement `GetFvalue` for prediction.
2022-07-22 17:26:33 +08:00