Jiaming Yuan
3901f5d9db
[pyspark] Cleanup data processing. (#8344)
* Enable additional combinations of ctor parameters.
* Unify procedures for QuantileDMatrix and DMatrix.
2022-10-18 14:56:23 +08:00
Jiaming Yuan
97a5b088a5
[pyspark] Use quantile dmatrix. (#8284)
2022-10-12 20:38:53 +08:00
Bobby Wang
520586ffa7
[pyspark] fix empty data issue when constructing DMatrix (#8245)
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-09-20 16:43:20 +08:00
WeichenXu
53d2a733b0
[pyspark] Make Xgboost estimator support using sparse matrix as optimization (#8145)
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-08-19 01:57:28 +08:00
Bobby Wang
03cc3b359c
[pyspark] support a list of feature column names (#8117)
2022-08-08 17:05:27 +08:00
Jiaming Yuan
546de5efd2
[pyspark] Cleanup data processing. (#8088)
- Use numpy stack for handling list of arrays.
- Reuse concat function from dask.
- Prepare for `QuantileDMatrix`.
- Remove unused code.
- Use iterator for prediction to avoid initializing xgboost model
2022-07-26 15:00:52 +08:00
WeichenXu
176fec8789
PySpark XGBoost integration (#8020)
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2022-07-13 13:11:18 +08:00