Jiaming Yuan a5a58102e5
Revamp the rabit implementation. (#10112)
This PR replaces the original RABIT implementation with a new one, which has already been partially merged into XGBoost. The new one features:
- Federated learning for both CPU and GPU.
- NCCL.
- More data types.
- A unified interface for all the underlying implementations.
- Improved timeout handling for both tracker and workers.
- Exhaustive tests with metrics (fixed a couple of bugs along the way).
- A reusable tracker for Python and JVM packages.
2024-05-20 11:56:23 +08:00

XGBoost Plugin for Federated Learning

This folder contains the plugin for federated learning. Follow these steps to build and test it.

Install gRPC

Refer to the installation guide from the gRPC website.

Build the Plugin

# Under xgboost source tree.
mkdir build
cd build
cmake .. -GNinja \
 -DPLUGIN_FEDERATED=ON \
 -DUSE_CUDA=ON \
 -DUSE_NCCL=ON
ninja
cd ../python-package
pip install -e .

If CMake fails to locate gRPC, you may need to pass -DCMAKE_PREFIX_PATH=<grpc path> to CMake.
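As a rough sketch of what that looks like, the snippet below assembles the same configure command with the extra flag appended. The GRPC_PREFIX variable and its default location are hypothetical and only serve as an illustration; point it at wherever gRPC was actually installed on your machine. The snippet only prints the command so it can be reviewed before running it from the build directory.

```shell
# Hypothetical gRPC install prefix (an assumption, not a path defined by XGBoost).
GRPC_PREFIX="${GRPC_PREFIX:-$HOME/.local/grpc}"

# Same flags as the build step above, plus CMAKE_PREFIX_PATH so CMake can find gRPC.
CMAKE_CMD="cmake .. -GNinja -DPLUGIN_FEDERATED=ON -DUSE_CUDA=ON -DUSE_NCCL=ON -DCMAKE_PREFIX_PATH=${GRPC_PREFIX}"

# Print the command for review; run it from the build directory once it looks right.
echo "$CMAKE_CMD"
```

Setting GRPC_PREFIX in the environment before running the snippet overrides the default, which keeps the machine-specific path out of the command itself.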

Test Federated XGBoost

# Under xgboost source tree.
cd tests/distributed
# This tests both CPU training (`hist`) and GPU training (`gpu_hist`).
./runtests-federated.sh