Jiaming Yuan
4e5a7729c3
Fix lint errors. ( #9634 )
2023-10-09 19:04:31 +08:00
Jiaming Yuan
60526100e3
Support arrow through pandas ext types. ( #9612 )
- Use pandas extension type for pyarrow support.
- Additional support for QDM.
- Additional support for inplace_predict.
2023-09-28 17:00:16 +08:00
Jiaming Yuan
c75a3bc0a9
[breaking] [jvm-packages] Remove rabit checkpoint. ( #9599 )
- Add `numBoostedRound` to jvm packages
- Remove rabit checkpoint version.
- Change the starting version of training continuation in JVM [breaking].
- Redefine the checkpoint version policy in jvm package. [breaking]
- Rename the Python checkpoint callback parameter. [breaking]
- Unify the checkpoint policy between Python and JVM.
2023-09-26 18:06:34 +08:00
Jiaming Yuan
a90d204942
Use array interface for testing numpy arrays. ( #9602 )
2023-09-23 03:13:48 +08:00
Jiaming Yuan
bbf5b9ee57
[dask] Move dask module into directory. ( #9597 )
2023-09-23 01:28:18 +08:00
Jiaming Yuan
9027686cac
Support pandas 2.1.0. ( #9557 )
2023-09-11 17:44:51 +08:00
Bobby Wang
6c791b5b47
[pyspark] support gpu transform ( #9542 )
---------
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-09-07 12:15:50 +08:00
Bobby Wang
419e052314
[pyspark] rework transform to reuse same code ( #9292 )
2023-09-04 15:57:16 +08:00
Jiaming Yuan
ccfc90e4c6
[rabit] Improved connection handling. ( #9531 )
- Enable timeout.
- Report connection error from the system.
- Handle retry for both tracker connection and peer connection.
2023-08-30 13:00:04 +08:00
Jiaming Yuan
1b87a1d8f8
[rabit] Small cleanup to tracker initialization. ( #9524 )
- Remove recover related code.
- Clean startup, no need to consider previously connected nodes.
2023-08-27 05:10:59 +08:00
Jiaming Yuan
209335b18c
Remove the deprecated Python rabit module. ( #9523 )
2023-08-27 03:37:05 +08:00
Jiaming Yuan
aa86bd5207
[dask] Filter models on worker. ( #9518 )
2023-08-25 20:23:47 +08:00
Jiaming Yuan
972730cde0
Use matrix for gradient. ( #9508 )
- Use the `linalg::Matrix` for storing gradients.
- New API for the custom objective.
- Custom objective for multi-class/multi-target is now required to return the correct shape.
- Custom objectives for Python can accept arrays with any strides (row-major or column-major).
2023-08-24 05:29:52 +08:00
Jiaming Yuan
044fea1281
Drop support for loading remote files. ( #9504 )
2023-08-21 23:34:05 +08:00
Jiaming Yuan
7f29a238e6
Return base score as intercept. ( #9486 )
2023-08-19 12:28:02 +08:00
Jiaming Yuan
58530b1bc4
Bump version to 2.1. ( #9498 )
2023-08-18 01:04:04 +08:00
Bobby Wang
68be454cfa
[pyspark] hotfix for GPU setup validation ( #9495 )
* [pyspark] Fix a bug in validating the GPU configuration.
---------
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-08-17 16:01:39 +08:00
Jiaming Yuan
5188e27513
Fix version parsing with rc release. ( #9493 )
2023-08-16 22:44:58 +08:00
Jiaming Yuan
bdc1a3c178
Fix pyspark parameter. ( #9460 )
- Don't pass the `use_gpu` parameter to the learner.
- Fix GPU approx with PySpark.
2023-08-11 19:07:50 +08:00
Jiaming Yuan
1caa93221a
Use realloc for histogram cache and expose the cache limit. ( #9455 )
2023-08-10 14:05:27 +08:00
Jiaming Yuan
f05a23b41c
Use weakref instead of id for DataIter cache. ( #9445 )
- Fix case where Python reuses id from freed objects.
- Small optimization to column matrix with QDM by using `realloc` instead of copying data.
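The id-reuse problem mentioned above can be illustrated with a small sketch. It is not the XGBoost implementation: `Batch` and `WeakrefCache` are hypothetical names, and the sketch only assumes that a cache needs to tell whether it has already seen the same live object.

```python
import weakref


class Batch:
    """Hypothetical stand-in for the data object passed to an iterator."""


class WeakrefCache:
    """Remember the last input through a weak reference rather than id()."""

    def __init__(self) -> None:
        self._ref = None

    def seen(self, obj: object) -> bool:
        # A dead reference returns None, so an id() recycled from a freed
        # object can never be mistaken for a cache hit.
        hit = self._ref is not None and self._ref() is obj
        self._ref = weakref.ref(obj)
        return hit


cache = WeakrefCache()
first = Batch()
first_lookup = cache.seen(first)    # nothing cached yet
second_lookup = cache.seen(first)   # same live object
del first                           # the object may be freed and its id reused
third_lookup = cache.seen(Batch())  # a new object, even if id() matches
```

A cache keyed on `id()` alone could report a hit on the third lookup whenever the interpreter hands the freed address to the new object; comparing identity through the weak reference avoids that.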
2023-08-10 00:40:06 +08:00
Bobby Wang
d495a180d8
[pyspark] add logs for training ( #9449 )
2023-08-09 18:32:23 +08:00
Jiaming Yuan
54029a59af
Bound the size of the histogram cache. ( #9440 )
- A new histogram collection with a limit in size.
- Unify histogram building logic between hist, multi-hist, and approx.
2023-08-08 03:21:26 +08:00
Hendrik Makait
f958e32683
Raise if expected workers are not alive in xgboost.dask.train ( #9421 )
2023-08-03 20:14:07 +08:00
Jiaming Yuan
7129988847
Accept only keyword arguments in data iterator. ( #9431 )
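As a rough illustration of what keyword-only input means for an iterator callback, the sketch below uses a hypothetical `input_data` function. The parameter names `data` and `label` are assumptions chosen for illustration, not a statement of the library's full signature.

```python
def input_data(*, data, label=None):
    """Hypothetical callback: the bare `*` makes every argument keyword-only."""
    return {"data": data, "label": label}


accepted = input_data(data=[[1.0, 2.0]], label=[0])

try:
    input_data([[1.0, 2.0]], [0])  # positional arguments are rejected
    rejected = False
except TypeError:
    rejected = True
```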
2023-08-03 12:44:16 +08:00
Jiaming Yuan
912e341d57
Initial GPU support for the approx tree method. ( #9414 )
2023-07-31 15:50:28 +08:00
Jiaming Yuan
851cba931e
Define best_iteration only if early stopping is used. ( #9403 )
* Define `best_iteration` only if early stopping is used.
This is the behavior specified by the documentation but not honored in the actual code.
- Don't set the attributes if there's no early stopping.
- Clean up the code for callbacks, and replace assertions with proper exceptions.
- Assign the attributes when early stopping `save_best` is used.
- Turn the attributes into Python properties.
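A minimal sketch of the property behavior described above, using a hypothetical `Model` class rather than the library's own code: the attribute is reported as present only after early stopping has recorded a value.

```python
from typing import Optional


class Model:
    """Hypothetical stand-in for a trained model object."""

    def __init__(self) -> None:
        self._best_iteration: Optional[int] = None

    @property
    def best_iteration(self) -> int:
        # Raising AttributeError makes hasattr() report the attribute as absent.
        if self._best_iteration is None:
            raise AttributeError(
                "`best_iteration` is only defined when early stopping is used."
            )
        return self._best_iteration

    @best_iteration.setter
    def best_iteration(self, iteration: int) -> None:
        self._best_iteration = int(iteration)


model = Model()
defined_before = hasattr(model, "best_iteration")  # no early stopping yet
model.best_iteration = 7  # what an early-stopping callback might record
defined_after = hasattr(model, "best_iteration")
```

Raising a proper exception instead of relying on an assertion keeps `hasattr` and `getattr` with a default usable for callers that need to check whether early stopping ran.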
---------
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2023-07-24 12:43:35 +08:00
Jiaming Yuan
01e00efc53
[breaking] Remove support for single string feature info. ( #9401 )
- Input must be a sequence of strings.
- Improve validation error message.
2023-07-24 11:06:30 +08:00
Jiaming Yuan
275da176ba
Document for device ordinal. ( #9398 )
- Rewrite GPU demos. The notebook is converted to a script to avoid committing additional PNG plots.
- Add GPU demos into the sphinx gallery.
- Add RMM demos into the sphinx gallery.
- Test for firing threads with different device ordinals.
2023-07-22 15:26:29 +08:00
Jiaming Yuan
6e18d3a290
[pyspark] Handle the device parameter in pyspark. ( #9390 )
- Handle the new `device` parameter in PySpark.
- Deprecate the old `use_gpu` parameter.
2023-07-18 08:47:03 +08:00
Jiaming Yuan
b342ef951b
Make feature validation immutable. ( #9388 )
2023-07-16 06:52:55 +08:00
Jiaming Yuan
16eb41936d
Handle the new device parameter in dask and demos. ( #9386 )
* Handle the new `device` parameter in dask and demos.
- Check no ordinal is specified in the dask interface.
- Update demos.
- Update dask doc.
- Update the condition for QDM.
2023-07-15 19:11:20 +08:00
Jiaming Yuan
9da5050643
Turn warning messages into Python warnings. ( #9387 )
2023-07-15 07:46:43 +08:00
Jiaming Yuan
04aff3af8e
Define the new device parameter. ( #9362 )
2023-07-13 19:30:25 +08:00
Jiaming Yuan
20c52f07d2
Support exporting cut values ( #9356 )
2023-07-08 15:32:41 +08:00
edumugi
c3124813e8
Support numpy vertical split ( #9365 )
2023-07-08 13:18:12 +08:00
Oliver Holworthy
6c9c8a9001
Enable Installation of Python Package with System lib in a Virtual Environment ( #9349 )
2023-07-05 05:46:17 +08:00
Jiaming Yuan
e964654b8f
[skl] Enable cat feature without specifying tree method. ( #9353 )
2023-07-03 22:06:17 +08:00
Jiaming Yuan
39390cc2ee
[breaking] Remove the predictor param, allow fallback to prediction using DMatrix. ( #9129 )
- A `DeviceOrd` struct is implemented to indicate the device. It will eventually replace the `gpu_id` parameter.
- The `predictor` parameter is removed.
- Fallback to `DMatrix` when `inplace_predict` is not available.
- The heuristic for choosing a predictor is only used during training.
2023-07-03 19:23:54 +08:00
Jiaming Yuan
4066d68261
[doc] Clarify early stopping. ( #9304 )
2023-06-20 17:56:47 +08:00
Jiaming Yuan
ee6809e642
Use mmap for external memory. ( #9282 )
- Have basic infrastructure for mmap.
- Release file write handle.
2023-06-19 18:52:55 +08:00
Jiaming Yuan
ea0deeca68
Disable dense optimization in hist for distributed training. ( #9272 )
2023-06-10 02:31:34 +08:00
Jiaming Yuan
1fcc26a6f8
Set ndcg to default for LTR. ( #8822 )
- Add documentation.
- Add tests.
- Use `ndcg` with `topk` as default.
2023-06-09 23:31:33 +08:00
Jiaming Yuan
9fbde21e9d
Rework the precision metric. ( #9222 )
- Rework the precision metric for both CPU and GPU.
- Mention it in the documentation.
- Clean up old support code for GPU ranking metric.
- Deterministic GPU implementation.
* Drop support for classification.
* type.
* use batch shape.
* lint.
* cpu build.
* Tests.
* Fix.
* Cleanup error message.
2023-06-02 20:49:43 +08:00
Jiaming Yuan
097f11b6e0
Support CUDA f16 without transformation. ( #9207 )
- Support f16 from cupy.
- Include CUDA header explicitly.
- Cleanup cmake nvtx support.
2023-05-30 20:54:31 +08:00
Bobby Wang
320323f533
[pyspark] add parameters in the ctor of all estimators. ( #9202 )
---------
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-05-29 05:58:16 +08:00
michael-gendy-mention-me
c5677a2b2c
Remove type: ignore hints ( #9197 )
2023-05-27 07:48:28 +08:00
Jiaming Yuan
3913ff470f
Import data lazily during tests. ( #9176 )
2023-05-23 03:58:31 +08:00
Bobby Wang
6274fba0a5
[pyspark] support typing ( #9172 )
2023-05-19 14:39:26 +08:00
Bobby Wang
caf326d508
[pyspark] Refactor and typing support for models ( #9156 )
2023-05-17 16:38:51 +08:00