16 Commits

Author SHA1 Message Date
Jiaming Yuan
10bb0a74ef
[backport] [CI] Skip pyspark sparse tests. (#8675) (#8678) 2023-01-14 06:40:17 +08:00
Jiaming Yuan
60a8c8ebba
[pyspark] sort qid for SparkRanker (#8497) (#8555)
* [pyspark] sort qid for SparkRanker

* resolve comments

Co-authored-by: Bobby Wang <wbo4958@gmail.com>
2022-12-07 02:07:37 +08:00
Bobby Wang
76f95a6667
[pyspark] Filter out the unsupported train parameters (#8355) 2022-10-18 23:26:02 +08:00
Jiaming Yuan
2176e511fc
Disable pytest-timeout for now. (#8348) 2022-10-17 23:06:10 +08:00
Jiaming Yuan
97a5b088a5
[pyspark] Use quantile dmatrix. (#8284) 2022-10-12 20:38:53 +08:00
Rory Mitchell
ce0382dcb0
[CI] Refactor tests to reduce CI time. (#8312) 2022-10-12 11:32:06 +02:00
WeichenXu
ff71c69adf
[pyspark] Add validation for param 'early_stopping_rounds' and 'validation_indicator_col' (#8250)
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-09-26 17:43:03 +08:00
Bobby Wang
4f42aa5f12
[pyspark] make the model saved by pyspark compatible (#8219)
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2022-09-20 16:43:49 +08:00
Bobby Wang
520586ffa7
[pyspark] fix empty data issue when constructing DMatrix (#8245)
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-09-20 16:43:20 +08:00
WeichenXu
d03794ce7a
[pyspark] Add param validation for "objective" and "eval_metric" param, and remove invalid booster params (#8173)
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-08-24 15:29:43 +08:00
WeichenXu
f4628c22a4
[pyspark] Implement SparkXGBRanker estimator (#8172)
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-08-23 02:35:19 +08:00
WeichenXu
53d2a733b0
[pyspark] Make Xgboost estimator support using sparse matrix as optimization (#8145)
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-08-19 01:57:28 +08:00
Jiaming Yuan
570f8ae4ba
Use black on more Python files. (#8137) 2022-08-11 01:38:11 +08:00
Jiaming Yuan
546de5efd2
[pyspark] Cleanup data processing. (#8088)
- Use numpy stack for handling list of arrays.
- Reuse concat function from dask.
- Prepare for `QuantileDMatrix`.
- Remove unused code.
- Use iterator for prediction to avoid initializing xgboost model
2022-07-26 15:00:52 +08:00
Bobby Wang
f801d3cf15
[PySpark] change the returning model type to string from binary (#8085)
* [PySpark] change the returning model type to string from binary

XGBoost pyspark can be accelerated by RAPIDS Accelerator seamlessly by
changing the returning model type from binary to string.
2022-07-19 18:39:20 +08:00
WeichenXu
176fec8789
PySpark XGBoost integration (#8020)
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2022-07-13 13:11:18 +08:00