rabit: Robust Allreduce and Broadcast Interface

rabit is a lightweight library that provides a fault-tolerant interface for Allreduce and Broadcast. It is designed to support easy implementation of distributed machine learning programs, many of which fit naturally under the Allreduce abstraction.

Contributors: https://github.com/tqchen/rabit/graphs/contributors

Design Note

  • Rabit is designed for algorithms that replicate the same global model across nodes, while each node operates on a local partition of the data.
  • Global statistics are collected using Allreduce.

Design Goal

  • rabit should run fast
  • rabit is lightweight
  • rabit digs safe burrows to avoid disasters (i.e., it is fault tolerant)

Features

  • Portable library
    • Rabit is a library instead of a framework: a program only needs to link against the library to run, without being restricted to a single framework.
  • Flexibility in programming
    • Programs can call rabit functions in any sequence, as opposed to defining a limited set of callback functions and having the framework call them.
    • A program persists over all iterations, unless it fails and recovers.
  • Fault tolerance
    • A rabit program can recover the model and the results of synchronization function calls.
  • MPI compatible
    • Code using the rabit interface naturally compiles with existing MPI compilers.
    • Users can fall back to MPI Allreduce, with no code modification, if they prefer.