tqchen 57b5d7873f Squashed 'subtree/rabit/' changes from d4ec037..28ca7be
28ca7be add linear readme
ca4b20f add linear readme
1133628 add linear readme
6a11676 update docs
a607047 Update build.sh
2c1cfd8 complete yarn
4f28e32 change formater
2fbda81 fix stdin input
3258bcf checkin yarn master
67ebf81 allow setup from env variables
9b6bf57 fix hdfs
395d5c2 add make system
88ce767 refactor io, initial hdfs file access need test
19be870 chgs
a1bd3c6 Merge branch 'master' of ssh://github.com/tqchen/rabit
1a573f9 introduce input split
29476f1 fix timer issue

git-subtree-dir: subtree/rabit
git-subtree-split: 28ca7becbdf6503e6b1398588a969efb164c9701
2015-03-09 13:28:38 -07:00

rabit: Reliable Allreduce and Broadcast Interface

rabit is a lightweight library that provides a fault-tolerant interface for Allreduce and Broadcast. It is designed to make distributed machine learning programs easy to implement, since many of them fall naturally under the Allreduce abstraction. The goal of rabit is to support portable, scalable and reliable distributed machine learning programs.
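
To make the two primitives concrete, here is a minimal single-process sketch of what Allreduce and Broadcast compute. It does not use rabit or any network communication; `simulate_allreduce`, `simulate_broadcast` and the three-worker data are hypothetical names chosen only for illustration.

```python
from functools import reduce
from typing import Callable, List, Sequence


def simulate_allreduce(
    local_buffers: Sequence[Sequence[float]],
    op: Callable[[float, float], float],
) -> List[List[float]]:
    """Combine the workers' buffers element-wise and give every worker the result."""
    # zip(*...) groups the i-th element of every worker's buffer together.
    combined = [reduce(op, column) for column in zip(*local_buffers)]
    # Every worker ends up with its own copy of the same combined buffer.
    return [list(combined) for _ in local_buffers]


def simulate_broadcast(local_values: Sequence[object], root: int) -> List[object]:
    """Copy the root worker's value to every worker."""
    return [local_values[root] for _ in local_values]


# Three hypothetical workers, each holding a partial gradient of length 2.
partial_gradients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
summed = simulate_allreduce(partial_gradients, lambda a, b: a + b)
largest = simulate_allreduce(partial_gradients, max)
shared = simulate_broadcast(["model-from-0", None, None], root=0)
```

After the call, every simulated worker holds the same reduced buffer, which is why an iterative learner can sum per-worker statistics in one step and continue with the next iteration in the same process.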

Features

All these features come from facts about small rabbits :)

  • Portable: rabit is lightweight and runs everywhere
    • Rabit is a library instead of a framework; a program only needs to link the library to run
    • Rabit only relies on a mechanism to start programs, which is provided by most frameworks
    • You can run rabit programs on many platforms, including YARN (Hadoop) and MPI, using the same code
  • Scalable and Flexible: rabit runs fast
    • Rabit programs use Allreduce to communicate, and do not pay the cost that the MapReduce abstraction incurs between iterations.
    • Programs can call rabit functions in any order, as opposed to frameworks where callbacks are offered and called by the framework, i.e. the inversion of control principle.
    • Programs persist over all the iterations, unless they fail and recover.
  • Reliable: rabit digs burrows to avoid disasters
    • Rabit programs can recover the model and results using synchronous function calls.
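
The idea of a program that persists across iterations and resumes after a failure can be sketched as follows. This is an illustration under stated assumptions, not rabit's API: `InMemoryCheckpointStore`, `run_worker` and the `fail_at` parameter are hypothetical, and the "training" step is a placeholder.

```python
from typing import Dict, Optional, Tuple


class InMemoryCheckpointStore:
    """Hypothetical stand-in for a checkpoint mechanism; not rabit's API."""

    def __init__(self) -> None:
        self._version = 0
        self._model: Dict[str, float] = {"weight": 0.0}

    def load(self) -> Tuple[int, Dict[str, float]]:
        # Version 0 means no checkpoint has been saved yet.
        return self._version, dict(self._model)

    def save(self, model: Dict[str, float]) -> None:
        # Each save marks one more completed iteration.
        self._model = dict(model)
        self._version += 1


def run_worker(
    store: InMemoryCheckpointStore,
    max_iter: int,
    fail_at: Optional[int] = None,
) -> Dict[str, float]:
    """Run the remaining iterations, resuming from the last checkpoint."""
    version, model = store.load()
    for iteration in range(version, max_iter):
        if fail_at is not None and iteration == fail_at:
            raise RuntimeError(f"simulated crash in iteration {iteration}")
        model["weight"] += 1.0  # placeholder for one round of training
        store.save(model)
    return model


store = InMemoryCheckpointStore()
try:
    run_worker(store, max_iter=5, fail_at=3)
except RuntimeError:
    pass  # the worker died after three completed iterations

# A restarted worker picks up at iteration 3 instead of starting over.
resumed_version, _ = store.load()
final_model = run_worker(store, max_iter=5)
```

The loop body is ordinary sequential code: the program decides when to load and save state, rather than handing control to a framework between iterations.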

Use Rabit

  • Typing make in the root folder compiles the rabit library into the lib folder
  • Add lib to the library path and include to the include path of the compiler
  • Languages: You can use rabit in C++ and Python
    • It is also possible to port the library to other languages

Contributing

Rabit is an open-source library; contributions are welcome, including:

  • The rabit core library.
  • Customized tracker scripts for new platforms and interfaces for new languages.
  • Toolkits, benchmarks, resources (links to related repos).
  • Tutorials and examples about the library.