Rabit: Reliable Allreduce and Broadcast Interface
Rabit is a lightweight library that provides a fault-tolerant interface for Allreduce and Broadcast. It is designed to make distributed machine learning programs easy to implement, since many of them fall naturally under the Allreduce abstraction. The goal of rabit is to support portable, scalable and reliable distributed machine learning programs.
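To make the Allreduce abstraction concrete, here is a minimal single-process sketch of its semantics with a sum operator. The function name `SimulateAllreduceSum` and the vector-of-buffers layout are assumptions made for illustration; this is not rabit's API or implementation, only a model of what every worker observes after the call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical in-process stand-in for an Allreduce with a sum operator.
// buffers[r] plays the role of the local buffer held by worker r.
// After the call, every worker's buffer holds the element-wise sum.
inline void SimulateAllreduceSum(std::vector<std::vector<double>>* buffers) {
  if (buffers->empty()) return;
  const std::size_t len = buffers->front().size();
  std::vector<double> total(len, 0.0);
  // Reduce step: combine the contributions of all workers.
  for (const auto& local : *buffers) {
    assert(local.size() == len);  // all workers must pass equally sized buffers
    for (std::size_t i = 0; i < len; ++i) total[i] += local[i];
  }
  // "All" step: every worker receives the same reduced result.
  for (auto& local : *buffers) local = total;
}
```

In a distributed learner, each buffer would typically hold locally computed statistics such as partial gradients, so that after the call every worker can apply the same update.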
- Tutorial
- API Documentation
- You can also directly read the interface header
- XGBoost
- Rabit is one of the backbone libraries that support distributed XGBoost
Features
All these features come from facts about the small rabbit :)
- Portable: rabit is lightweight and runs everywhere
- Rabit is a library instead of a framework; a program only needs to link the library to run
- Rabit only relies on a mechanism to start programs, which is provided by most frameworks
- You can run rabit programs on many platforms, including YARN (Hadoop) and MPI, using the same code
- Scalable and Flexible: rabit runs fast
- Rabit programs use Allreduce to communicate and do not suffer the cost incurred between iterations under the MapReduce abstraction.
- Programs can call rabit functions in any order, as opposed to frameworks where callbacks are offered and called by the framework, i.e. the inversion of control principle.
- Programs persist over all the iterations, unless they fail and recover.
- Reliable: rabit digs burrows to avoid disasters
- Rabit programs can recover the model and results using synchronous function calls.
- Rabit programs can set rabit_bootstrap_cache=1 to support Allreduce/Broadcast operations before the checkpoint is loaded, as in the call sequence:
rabit::Init(); -> rabit::AllReduce(); -> rabit::loadCheckpoint(); -> for () { rabit::AllReduce(); rabit::Checkpoint();} -> rabit::Shutdown();
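The load, iterate, checkpoint pattern in the call sequence above can be sketched without the library. The following is a minimal single-process model, assuming a hypothetical in-memory `CheckpointStore` and hypothetical helpers `LoadCheckpoint`, `SaveCheckpoint` and `RunWorker`; none of these are rabit functions. It assumes that loading returns the number of completed iterations, with 0 meaning no checkpoint exists yet, so a restarted worker resumes where the last checkpoint left off.

```cpp
#include <cassert>

// Hypothetical in-memory checkpoint store (illustration only).
struct CheckpointStore {
  int version = 0;     // number of checkpoints saved; 0 means none yet
  double model = 0.0;  // last saved model state
};

// Restores the model if a checkpoint exists and returns its version.
inline int LoadCheckpoint(const CheckpointStore& store, double* model) {
  if (store.version > 0) *model = store.model;
  return store.version;
}

// Saves the model and advances the checkpoint version by one.
inline void SaveCheckpoint(CheckpointStore* store, double model) {
  store->model = model;
  store->version += 1;
}

// Runs iterations up to max_iter, resuming from the last checkpoint.
// Returns false if a simulated crash happens at iteration fail_at
// (pass a negative fail_at to run without a crash).
inline bool RunWorker(CheckpointStore* store, int max_iter, int fail_at,
                      double* model_out) {
  double model = 0.0;
  const int start = LoadCheckpoint(*store, &model);
  for (int iter = start; iter < max_iter; ++iter) {
    if (iter == fail_at) return false;  // simulated worker failure
    model += 1.0;                       // stands in for an Allreduce-based update
    SaveCheckpoint(store, model);
  }
  *model_out = model;
  return true;
}
```

Calling `RunWorker` once with a failure and then again without one on the same store reproduces the recovery path: the second call skips the iterations that were already checkpointed instead of starting from scratch.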
Use Rabit
- Typing make in the root folder compiles the rabit library into the lib folder
- Add lib to the library path and include to the include path of the compiler
- Languages: You can use rabit in C++ and Python
- It is also possible to port the library to other languages
Contributing
Rabit is an open-source library; contributions are welcome, including:
- The rabit core library.
- Customized tracker scripts for new platforms and interfaces for new languages.
- Tutorials and examples for the library.