5 Commits

Author SHA1 Message Date
Bobby Wang
520586ffa7
[pyspark] fix empty data issue when constructing DMatrix (#8245)
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-09-20 16:43:20 +08:00
WeichenXu
53d2a733b0
[pyspark] Make Xgboost estimator support using sparse matrix as optimization (#8145)
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-08-19 01:57:28 +08:00
Bobby Wang
03cc3b359c
[pyspark] support a list of feature column names (#8117) 2022-08-08 17:05:27 +08:00
Jiaming Yuan
546de5efd2
[pyspark] Cleanup data processing. (#8088)
- Use numpy stack for handling list of arrays.
- Reuse concat function from dask.
- Prepare for `QuantileDMatrix`.
- Remove unused code.
- Use iterator for prediction to avoid initializing xgboost model
2022-07-26 15:00:52 +08:00
WeichenXu
176fec8789
PySpark XGBoost integration (#8020)
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2022-07-13 13:11:18 +08:00