202 Commits

Author SHA1 Message Date
david-cortes
caabee2135
[R] remove 'reshape' argument, let shapes be handled by core cpp library (#10330) 2024-08-18 23:31:38 +08:00
Jiaming Yuan
7bccc1ea2c
[EM] CPU implementation for external memory QDM. (#10682)
- A new DMatrix type.
- Extract common code into a new QDM base class.

Not yet working:
- Not exposed to the interface yet, will wait for the GPU implementation.
- ~No meta info yet, still working on the source.~
- Exporting data to CSR is not supported yet.
2024-08-09 09:38:02 +08:00
Jiaming Yuan
292bb677e5
[EM] Support mmap backed ellpack. (#10602)
- Support resource view in ellpack.
- Define the CUDA version of MMAP resource.
- Define the CUDA version of malloc resource.
- Refactor cuda runtime API wrappers, and add memory access related wrappers.
- Gather Windows macros into a single header.
2024-07-18 08:20:21 +08:00
Jiaming Yuan
e8a962575a
[EM] Allow staging ellpack on host for GPU external memory. (#10488)
- New parameter `on_host`.
- Abstract format creation and stream creation into policy classes.
2024-06-28 04:42:18 +08:00
david-cortes
61ac8eec8a
[R] use Rf_ prefix for R C functions. (#10465) 2024-06-21 14:37:18 +08:00
Jiaming Yuan
e6eefea5e2
[coll] Move the rabit poll helper. (#10349) 2024-05-31 08:02:21 +08:00
Jiaming Yuan
a5a58102e5
Revamp the rabit implementation. (#10112)
This PR replaces the original RABIT implementation with a new one, which has already been partially merged into XGBoost. The new one features:
- Federated learning for both CPU and GPU.
- NCCL.
- More data types.
- A unified interface for all the underlying implementations.
- Improved timeout handling for both tracker and workers.
- Exhaustive tests with metrics (fixed a couple of bugs along the way).
- A reusable tracker for Python and JVM packages.
2024-05-20 11:56:23 +08:00
Jiaming Yuan
3f64b4fde3
[coll] Add global functions. (#10203) 2024-04-19 03:17:23 +08:00
Jiaming Yuan
4b10200456
[coll] Improve event loop. (#10199)
- Add a test for blocking calls.
- Do not require the queue to be empty after waking up; this frees up the thread to answer blocking calls.
- Handle EOF in read.
- Improve the error message in the result. Allow concatenation of multiple results.
2024-04-18 03:29:52 +08:00
david-cortes
bc9ea62ec0
[R] Make xgb.cv work with xgb.DMatrix only, adding support for survival and ranking fields (#10031)
---------

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2024-03-31 21:53:00 +08:00
david-cortes
2c12b956da
[R] Refactor callback structure and attributes (#9957) 2024-03-01 15:57:47 +08:00
Jiaming Yuan
0ce4372bd4
Use UBJSON for serializing splits for vertical data split. (#10059) 2024-02-25 00:18:23 +08:00
david-cortes
f7005d32c1
[R] Use inplace predict (#9829)
---------

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2024-02-24 02:03:54 +08:00
david-cortes
4de866211d
[R] switch to URI reader (#10024) 2024-02-05 05:03:38 +08:00
david-cortes
a730c7e67e
[R] allow using seed with regular RNG (#10029) 2024-02-04 16:22:22 +08:00
david-cortes
3abbbe41ac
[R] Add data iterator, quantile dmatrix, external memory, and missing feature_types (#9913) 2024-01-30 19:26:44 +08:00
david-cortes
5062a3ab46
[R] Support booster slicing. (#9948) 2024-01-21 05:11:26 +08:00
david-cortes
60b9d2eeb9
[R] Avoid memory copies in predict (#9902) 2024-01-21 00:53:18 +08:00
david-cortes
d3a8d284ab
[R] On-demand serialization + standardization of attributes (#9924)
---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2024-01-11 05:08:42 +08:00
david-cortes
7ff6d44efa
[R] Use R's error stream for printing warnings (#9965) 2024-01-09 03:43:21 +08:00
david-cortes
db396ee340
[R] make sure output fits into int32 (#9949) 2024-01-04 16:51:22 +08:00
david-cortes
3c004a4145
[R] Add missing DMatrix functions (#9929)
* `XGDMatrixGetQuantileCut`
* `XGDMatrixNumNonMissing`
* `XGDMatrixGetDataAsCSR`

---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2024-01-03 17:29:21 +08:00
david-cortes
a197899161
[R] avoid leaking exception objects (#9916) 2023-12-26 20:29:55 +08:00
david-cortes
ae32936ba2
[R] Catch C++ exceptions (#9903) 2023-12-19 10:45:03 +08:00
Jiaming Yuan
1c6e031c75
[R] Fix clang warning. (#9874) 2023-12-15 01:30:43 +08:00
Jiaming Yuan
faf0f2df10
Support dataframe data format in native XGBoost. (#9828)
- Implement a columnar adapter.
- Refactor Python pandas handling code to avoid converting into a single numpy array.
- Add support in R for transforming columns.
- Support R data.frame and factor type.
2023-12-12 09:56:31 +08:00
david-cortes
562352101d
[R] Move all DMatrix fields to function arguments (#9862) 2023-12-10 02:45:28 +08:00
david-cortes
0716c64ef7
[R] Error out on multidimensional arrays (#9852) 2023-12-06 17:43:51 +08:00
david-cortes
62571b79eb
[R] Enable multi-output objectives (#9839) 2023-12-06 03:13:14 +08:00
david-cortes
9c56916fd7
[R] Very small performance tweaks (#9837) 2023-12-04 18:40:45 +08:00
david-cortes
7196c9d95e
[R] Fix memory safety issues (#9823) 2023-12-02 13:43:50 +08:00
david-cortes
95af5c074b
more usage of array interface, fix potential memory leaks of std::string (#9824) 2023-12-01 00:06:59 +08:00
david-cortes
37da66f865
[R] Use array interface for dense DMatrix creation (#9816)
---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-11-30 04:35:05 +08:00
david-cortes
c0ef2f8dce
[R] Fix potential memory leaks in case of R allocation failures (#9817) 2023-11-29 13:14:17 +08:00
Jiaming Yuan
b771f58453
[coll] Define interface for bridging. (#9695)
* Define the basic interface that will be shared by NCCL, federated, and native.
2023-10-20 16:20:48 +08:00
Jiaming Yuan
48ac9b6cbe
[coll] Allreduce. (#9679) 2023-10-17 13:57:14 +08:00
Jiaming Yuan
53049b16b8
[coll] Broadcast. (#9659) 2023-10-14 09:34:37 +08:00
Jiaming Yuan
946ae1c440
[coll] Implement a new tracker and a communicator. (#9650)
The new tracker and communicators communicate through JSON documents. In addition,
communicators are aware of each other.
2023-10-12 12:49:16 +08:00
Jiaming Yuan
d95be1c38d
Small cleanup to jvm iter adapter. (#9616)
- Remove header dependency on c_api
- Remove remaining code for arrow.
2023-09-29 00:39:07 +08:00
James Lamb
730bc1f688
[R] remove unused headers (#9546) 2023-09-14 17:11:26 +08:00
James Lamb
d159ee8547
[R] reformat build scripts (#9540) 2023-09-04 17:40:46 +08:00
Jiaming Yuan
be6a552956
[R] Support multi-class custom objective. (#9526) 2023-08-29 08:27:13 +08:00
Jiaming Yuan
c3574d932f
[R] Fix integer inputs with NA. (#9522) 2023-08-28 18:36:11 +08:00
Jiaming Yuan
972730cde0
Use matrix for gradient. (#9508)
- Use the `linalg::Matrix` for storing gradients.
- New API for the custom objective.
- Custom objective for multi-class/multi-target is now required to return the correct shape.
- Custom objective for Python can accept arrays with any strides (row-major, column-major).
2023-08-24 05:29:52 +08:00
Jiaming Yuan
bb56183396
Normalize file system path. (#9463) 2023-08-11 21:26:46 +08:00
Jiaming Yuan
54029a59af
Bound the size of the histogram cache. (#9440)
- A new histogram collection with a limit in size.
- Unify histogram building logic between hist, multi-hist, and approx.
2023-08-08 03:21:26 +08:00
Jiaming Yuan
e93a274823
Small cleanup for histogram routines. (#9427)
- Extract hist train param from GPU hist.
- Make histogram const after construction.
- Unify parameter names.
2023-08-02 18:28:26 +08:00
Jiaming Yuan
97ed944209
Unify the hist tree method for different devices. (#9363) 2023-07-11 10:04:39 +08:00
Jiaming Yuan
e206b899ef
Rework MAP and Pairwise for LTR. (#9075) 2023-04-28 02:39:12 +08:00
Jiaming Yuan
ef13dd31b1
Rework the NDCG objective. (#9015) 2023-04-18 21:16:06 +08:00