[dask] Add scheduler address to dask config. (#7581)

- Add user configuration.
- Bring back the logic of using the scheduler address from dask.  This was removed when we were trying to support GKE; now we bring it back and let xgboost try it if the direct guess or the host IP from user config fails.
Jiaming Yuan
2022-01-22 01:56:32 +08:00
committed by GitHub
parent 5ddd4a9d06
commit ef4dae4c0e
6 changed files with 136 additions and 24 deletions


@@ -475,6 +475,32 @@ interface, including callback functions, custom evaluation metric and objective:
)
.. _tracker-ip:

***************
Tracker Host IP
***************

.. versionadded:: 1.6.0

In some environments, XGBoost might fail to resolve the IP address of the scheduler. A
common symptom is an ``OSError: [Errno 99] Cannot assign requested address`` error raised
during training. A quick workaround is to specify the address explicitly through the
dask configuration:

.. code-block:: python

    import dask
    from distributed import Client
    from xgboost import dask as dxgb

    # let xgboost know the scheduler address
    dask.config.set({"xgboost.scheduler_address": "192.0.0.100"})

    with Client(scheduler_file="sched.json") as client:
        reg = dxgb.DaskXGBRegressor()

XGBoost reads this configuration before training starts.

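
Before hard-coding an address, it can help to confirm that the candidate IP is actually
assignable on the machine that hosts the scheduler. The sketch below is not part of
XGBoost or dask: ``can_bind`` is a hypothetical helper that uses only the standard
``socket`` module to try binding an ephemeral port on a given host, which is typically
the kind of operation that fails with ``[Errno 99]``. It assumes an IPv4 address and is
only meaningful when run on the scheduler node.

.. code-block:: python

    import socket


    def can_bind(host: str) -> bool:
        """Return True if a TCP socket can bind to ``host`` on this machine."""
        # Hypothetical helper for illustration; not an XGBoost or dask API.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                # Port 0 asks the OS for any free ephemeral port.
                sock.bind((host, 0))
            except OSError:
                # Covers Errno 99 (address not assignable) and lookup errors.
                return False
        return True


    if __name__ == "__main__":
        # The loopback address is normally assignable; replace it with the
        # address intended for ``xgboost.scheduler_address``.
        print(can_bind("127.0.0.1"))

An address for which this returns ``False`` on the scheduler node is unlikely to work as
the value of ``xgboost.scheduler_address``.
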
*****************************************************************************
Why is the initialization of ``DaskDMatrix`` so slow and throws weird errors
*****************************************************************************