[dask][doc] Add small example for sklearn interface. (#6970)
This commit is contained in:
parent
7e846bb965
commit
5cb51a191e
@@ -115,8 +115,8 @@ See next section for details.
Alternatively, XGBoost also implements the Scikit-Learn interface with
``DaskXGBClassifier``, ``DaskXGBRegressor``, ``DaskXGBRanker`` and two random forest
variants. This wrapper is similar to the single-node Scikit-Learn interface in xgboost,
-with dask collection as inputs and has an additional ``client`` attribute. See
-``xgboost/demo/dask`` for more examples.
+with dask collections as inputs and has an additional ``client`` attribute. See the
+following sections and ``xgboost/demo/dask`` for more examples.
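
The two random forest variants are ``DaskXGBRFClassifier`` and ``DaskXGBRFRegressor``.
A minimal sketch of the wrapper pattern, assuming a running ``Client`` named ``client``
and dask collections ``X`` and ``y``:

.. code-block:: python

    import xgboost as xgb

    # The random forest variant follows the same pattern as the boosting
    # estimators: dask collections as inputs, plus a ``client`` attribute.
    rf = xgb.dask.DaskXGBRFClassifier(n_estimators=100)
    rf.client = client  # assign the client
    rf.fit(X, y)
    prediction = rf.predict(X)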
******************
@@ -191,6 +191,38 @@ Scikit-Learn wrapper object:
booster = cls.get_booster()
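
The ``Booster`` obtained this way can also be passed back to the functional interface;
a small sketch, assuming ``client``, dask input ``X``, and a fitted wrapper ``cls`` as
above:

.. code-block:: python

    import xgboost as xgb

    booster = cls.get_booster()
    # The raw Booster works directly with the functional dask prediction API.
    prediction = xgb.dask.predict(client, booster, X)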

**********************
Scikit-Learn interface
**********************

As mentioned previously, there's another interface that mimics the scikit-learn
estimators with a higher level of abstraction. This interface is easier to use than the
functional interface but comes with more constraints. It's worth mentioning that,
although the interface mimics scikit-learn estimators, it doesn't work with normal
scikit-learn utilities like ``GridSearchCV``, as scikit-learn doesn't understand
distributed dask data collections.
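
One possible workaround, sketched here under the assumption that the data fits on a
single machine, is to materialize the dask collections into local memory and use the
single-node estimator with the scikit-learn utilities instead:

.. code-block:: python

    from sklearn.model_selection import GridSearchCV
    import xgboost as xgb

    # Materialize the distributed collections locally first; this is only
    # feasible when the data fits on one machine.
    X_local, y_local = X.compute(), y.compute()
    search = GridSearchCV(
        xgb.XGBClassifier(tree_method="hist"),
        param_grid={"n_estimators": [4, 8]},
    )
    search.fit(X_local, y_local)

Basic usage of the distributed interface itself looks like this: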
.. code-block:: python

    from distributed import LocalCluster, Client
    import xgboost as xgb


    def main(client: Client) -> None:
        # ``load_data`` is a placeholder returning dask collections.
        X, y = load_data()
        clf = xgb.dask.DaskXGBClassifier(n_estimators=100, tree_method="hist")
        clf.client = client  # assign the client
        clf.fit(X, y, eval_set=[(X, y)])
        proba = clf.predict_proba(X)


    if __name__ == "__main__":
        with LocalCluster() as cluster:
            with Client(cluster) as client:
                main(client)
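
Note that ``predict_proba`` here returns a lazy dask collection rather than an
in-memory array; a small sketch of materializing the result, assuming it fits in local
memory:

.. code-block:: python

    # Bring the distributed result into local memory explicitly when needed.
    local_proba = proba.compute()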

***************************
Working with other clusters
***************************

@@ -1,6 +1,6 @@
-#########################
+#############################
Random Forests(TM) in XGBoost
-#########################
+#############################

XGBoost is normally used to train gradient-boosted decision trees and other gradient
boosted models. Random Forests use the same model representation and inference, as