tqchen 2014-12-03 22:32:30 -08:00
parent 0e9b64649a
commit 656a8fa3a2

@@ -10,16 +10,21 @@ Design Note
* The global statistics collection is done by using Allreduce
* Currently, Rabit is not good at problems where the model is distributed across nodes; other abstractions might suit the purpose (for example [parameter server](https://github.com/mli/parameter_server))
+ Design Goal
+ ====
+ * rabit should run fast
+ * rabit is lightweight
+ * rabit digs safe burrows to avoid disasters
Features
====
* Portable library
- Rabit is a library instead of a framework; a program only needs to link the library to run, without being restricted to a single framework.
* Flexibility in programming
- - Many frameworks only allow the user to define a limited set of functions (map, reduce), and the framework calls these operations
- - When programming with rabit, programs call rabit functions (Allreduce, CheckPoint) in any sequence
+ - Programs call rabit functions (Allreduce, CheckPoint) in any sequence, as opposed to defining a limited set of functions that the framework calls.
- The program persists over all iterations, unless it fails and recovers
* Fault tolerance
- - A rabit program can recover the model and the results of synchronization function calls (Allreduce and Broadcast)
+ - A rabit program can recover the model and the results of synchronization function calls
* MPI compatible
- Code using the rabit interface naturally compiles with existing MPI compilers
- Users can fall back to MPI Allreduce if they like, with no code modification
@@ -28,9 +33,4 @@ Persistence of Program
====
Many complicated machine learning algorithms involve things like temporary memory allocation and result caching. It is good to have a persistent program that runs over iterations and keeps these resources, instead of re-allocating memory and re-computing the cache every time. Rabit allows the process to persist over all iterations.
- Design Goal
- ====
- * rabit should run fast
- * rabit is lightweight
- * rabit digs safe burrows to avoid disasters
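
The programming model described in the Features and Persistence of Program sections of this diff (a persistent program that calls Allreduce and CheckPoint itself, and recovers the model after a failure) can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: it does not use rabit's actual API, and the names `Worker`, `allreduce_sum`, `run`, and `fail_at` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


def allreduce_sum(local_values):
    """Toy Allreduce: every worker receives the element-wise sum of all inputs."""
    total = [sum(column) for column in zip(*local_values)]
    return [list(total) for _ in local_values]


@dataclass
class Worker:
    """Hypothetical stand-in for one node of a distributed job."""
    data: list                                  # this worker's shard of the data
    model: float = 0.0                          # replicated model (a global mean)
    version: int = 0                            # number of completed iterations
    cache: dict = field(default_factory=dict)   # resources kept across iterations

    def local_stats(self):
        # Computed on first use, then reused on every later iteration,
        # which is what a persistent process makes possible.
        if "stats" not in self.cache:
            self.cache["stats"] = [sum(self.data), len(self.data)]
        return self.cache["stats"]


def run(shards, num_iters, fail_at=None):
    """Run num_iters iterations; optionally simulate worker 0 failing once."""
    workers = [Worker(data=shard) for shard in shards]
    checkpoint = (0, 0.0)   # (version, model) saved after each iteration
    recovered = False
    it = 0
    while it < num_iters:
        if fail_at == it and not recovered:
            # Simulated failure: worker 0 restarts with an empty cache and
            # reloads the last checkpointed model instead of starting over.
            workers[0] = Worker(data=shards[0])
            workers[0].version, workers[0].model = checkpoint
            recovered = True
        # The program decides when to synchronize; no framework callback.
        results = allreduce_sum([w.local_stats() for w in workers])
        for worker, (total, count) in zip(workers, results):
            worker.model = total / count
            worker.version += 1
        checkpoint = (workers[0].version, workers[0].model)
        it += 1
    return workers
```

Calling `run([[1, 2, 3], [4, 5], [6]], num_iters=3, fail_at=2)` restarts worker 0 before the third iteration; it resumes from the checkpointed version rather than from iteration zero, while the other workers keep their cached statistics. A real deployment would exchange data over the network and would also need to replay the results of synchronization calls, which this toy version leaves out.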