[doc] Add missing document for pyspark ranker. [skip ci] (#8692)

Jiaming Yuan authored 2023-01-18 07:52:18 +08:00, committed by GitHub
parent 78396f8a6e
commit 175986b739
3 changed files with 16 additions and 5 deletions


@@ -173,3 +173,13 @@ PySpark API
     :members:
     :inherited-members:
     :show-inheritance:
+
+.. autoclass:: xgboost.spark.SparkXGBRanker
+    :members:
+    :inherited-members:
+    :show-inheritance:
+
+.. autoclass:: xgboost.spark.SparkXGBRankerModel
+    :members:
+    :inherited-members:
+    :show-inheritance:
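
For context, a minimal sketch of how the newly documented ranker classes are used; the data and column names below are illustrative, not part of the commit:

.. code-block:: python

    from pyspark.sql import SparkSession
    from pyspark.ml.linalg import Vectors
    from xgboost.spark import SparkXGBRanker

    spark = SparkSession.builder.getOrCreate()

    # Each row carries a query id ("qid") that groups documents into
    # the result lists the ranker learns to order.
    df_train = spark.createDataFrame(
        [
            (Vectors.dense(1.0, 2.0), 0.0, 0),
            (Vectors.dense(2.0, 3.0), 1.0, 0),
            (Vectors.dense(3.0, 4.0), 0.0, 1),
            (Vectors.dense(4.0, 5.0), 1.0, 1),
        ],
        ["features", "label", "qid"],
    )

    ranker = SparkXGBRanker(qid_col="qid")
    model = ranker.fit(df_train)  # fit() returns a SparkXGBRankerModel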


@@ -45,7 +45,7 @@ such as ``weight_col``, ``validation_indicator_col``, ``use_gpu``, for details p
 The following code snippet shows how to train a spark xgboost regressor model,
 first we need to prepare a training dataset as a spark dataframe contains
-"label" column and "features" column(s), the "features" column(s) must be ``pyspark.ml.linalg.Vector`
+"label" column and "features" column(s), the "features" column(s) must be ``pyspark.ml.linalg.Vector``
 type or spark array type or a list of feature column names.
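
A minimal sketch of the training flow this paragraph describes, with illustrative data:

.. code-block:: python

    from pyspark.sql import SparkSession
    from pyspark.ml.linalg import Vectors
    from xgboost.spark import SparkXGBRegressor

    spark = SparkSession.builder.getOrCreate()

    # "features" holds pyspark.ml.linalg.Vector values; "label" is numeric.
    df_train = spark.createDataFrame(
        [
            (Vectors.dense(1.0, 2.0, 3.0), 0.0),
            (Vectors.dense(4.0, 5.0, 6.0), 1.0),
        ],
        ["features", "label"],
    )

    regressor = SparkXGBRegressor(features_col="features", label_col="label")
    model = regressor.fit(df_train)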
@@ -56,7 +56,7 @@ type or spark array type or a list of feature column names.
 The following code snippet shows how to predict test data using a spark xgboost regressor model,
 first we need to prepare a test dataset as a spark dataframe contains
-"features" and "label" column, the "features" column must be ``pyspark.ml.linalg.Vector`
+"features" and "label" column, the "features" column must be ``pyspark.ml.linalg.Vector``
 type or spark array type.

 .. code-block:: python
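
The hunk cuts off at the ``code-block`` directive; the prediction snippet it introduces might look like this sketch, reusing ``spark`` and the fitted ``model`` from the training sketch above:

.. code-block:: python

    from pyspark.ml.linalg import Vectors

    df_test = spark.createDataFrame(
        [(Vectors.dense(1.0, 2.0, 3.0), 0.0)],
        ["features", "label"],
    )

    # transform() appends a "prediction" column to the input columns.
    model.transform(df_test).show()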
@@ -97,7 +97,7 @@ Aside from the PySpark and XGBoost modules, we also need the `cuDF
 <https://docs.rapids.ai/api/cudf/stable/>`_ package for handling Spark dataframe. We
 recommend using either Conda or Virtualenv to manage python dependencies for PySpark
 jobs. Please refer to `How to Manage Python Dependencies in PySpark
 <https://www.databricks.com/blog/2020/12/22/how-to-manage-python-dependencies-in-pyspark.html>`_
 for more details on PySpark dependency management.
 In short, to create a Python environment that can be sent to a remote cluster using
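
The excerpt ends mid-sentence; one common way to ship such an environment, sketched under the assumption of an archive packed beforehand with conda-pack and a Spark 3.1+ cluster (this is not taken from the linked post):

.. code-block:: python

    import os
    from pyspark.sql import SparkSession

    # Hypothetical archive, created up front with e.g.:
    #   conda pack -f -o environment.tar.gz
    # Executors then run Python from the unpacked environment.
    os.environ["PYSPARK_PYTHON"] = "./environment/bin/python"

    spark = (
        SparkSession.builder
        # "#environment" names the directory the archive is unpacked into.
        .config("spark.archives", "environment.tar.gz#environment")
        .getOrCreate()
    )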


@@ -1,5 +1,4 @@
-"""PySpark XGBoost integration interface
-"""
+"""PySpark XGBoost integration interface"""

 try:
     import pyspark
@@ -10,6 +9,7 @@ from .estimator import (
     SparkXGBClassifier,
     SparkXGBClassifierModel,
     SparkXGBRanker,
+    SparkXGBRankerModel,
     SparkXGBRegressor,
     SparkXGBRegressorModel,
 )
@@ -20,4 +20,5 @@ __all__ = [
     "SparkXGBRegressor",
     "SparkXGBRegressorModel",
     "SparkXGBRanker",
+    "SparkXGBRankerModel",
 ]
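
With ``SparkXGBRankerModel`` now re-exported, a fitted ranker can be persisted and reloaded under that name through the standard ``pyspark.ml`` read/write interface; a minimal sketch, with an illustrative path and ``model`` from the ranker sketch above:

.. code-block:: python

    from xgboost.spark import SparkXGBRankerModel

    model.save("/tmp/xgb_ranker_model")
    loaded = SparkXGBRankerModel.load("/tmp/xgb_ranker_model")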