diff --git a/guide/README.md b/guide/README.md
index b209c2ec6..a094592c6 100644
--- a/guide/README.md
+++ b/guide/README.md
@@ -58,6 +58,24 @@ Rabit provides different reduction operators, for example, if you change ```op:
 the reduction operation will be a summation, and the result will become ```a = {1, 3, 5}```.
 You can also run the example with different processes by setting -n to different values.
 
+In order to make the library available for a wider range of developers, we decided to provide a python wrapper to our C++ code. Developers can now program rabit applications in Python! The same example as before can be found in [basic.py](basic.py):
+
+```python
+import numpy as np
+import rabit
+rabit.init()
+n = 3
+rank = rabit.get_rank()
+a = np.zeros(n)
+for i in xrange(n):
+    a[i] = rank + i
+
+print '@node[%d] before-allreduce: a=%s' % (rank, str(a))
+a = rabit.allreduce(a, rabit.MAX)
+print '@node[%d] after-allreduce: a=%s' % (rank, str(a))
+rabit.finalize()
+```
+
 Broadcast is another method provided by rabit besides Allreduce. This function allows one node to broadcast its
 local data to all other nodes. The following code in [broadcast.cc](broadcast.cc) broadcasts a string from
 node 0 to all other nodes.
@@ -85,6 +103,22 @@ The following command starts the program with three worker processes.
 ```
 Besides strings, rabit also allows to broadcast constant size array and vectors.
 
+The counterpart in python can be found in [broadcast.py](broadcast.py). Here is a snippet so that you can get a better sense of how simple is to use the wrapper:
+
+```python
+import rabit
+rabit.init()
+n = 3
+rank = rabit.get_rank()
+s = None
+if rank == 0:
+    s = {'hello world':100, 2:3}
+print '@node[%d] before-broadcast: s=\"%s\"' % (rank, str(s))
+s = rabit.broadcast(s, 0)
+print '@node[%d] after-broadcast: s=\"%s\"' % (rank, str(s))
+rabit.finalize()
+```
+
 Common Use Case
 =====
 Many distributed machine learning algorithms involve splitting the data into different nodes,
@@ -266,13 +300,4 @@ recovered node fetches its latest checkpoint and the results of
 Allreduce/Broadcast calls after the checkpoint from some alive nodes.
 
 This is just a conceptual introduction to rabit's fault tolerance model. The actual implementation is more sophisticated,
-and can deal with more complicated cases such as multiple nodes failure and node failure during recovery phase.
-
-Python Wrapper
-=====
-In order to make the library available for a wider range of developers, we decided to provide a python wrapper to our C++ code.
-
-Developers can now program rabit applications in Python! We provide a couple of examples:
-
-* [./basic.py](./basic.py) : [./basic.cc](./basic.cc) counterpart, explained above.
-* [./broadcast.py](./broadcast.py) : [./broadcast.cc](./broadcast.cc) counterpart, explained above.
+and can deal with more complicated cases such as multiple nodes failure and node failure during recovery phase.
\ No newline at end of file
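
Note for readers of this patch: the two added snippets can only run under the rabit tracker, and they use Python 2 syntax (`xrange`, the `print` statement). The sketch below is a local illustration, written for Python 3, of what the snippets are expected to compute. It does not import rabit or do any networking; `local_array`, `simulate_allreduce` and `simulate_broadcast` are hypothetical helpers invented for this note, not part of the rabit API. It assumes two workers, which is the setup that gives the `{1, 3, 5}` summation result quoted in the README context.

```python
# Local simulation only: no rabit import, no networking.
# All helper names here are hypothetical and exist only for illustration.
from operator import add


def local_array(rank, n=3):
    """Per-worker input used in basic.py: a[i] = rank + i."""
    return [rank + i for i in range(n)]


def simulate_allreduce(arrays, op):
    """Combine the workers' arrays element-wise with op.

    Every worker receives the same reduced array, so one copy per worker
    is returned.
    """
    reduced = list(arrays[0])
    for arr in arrays[1:]:
        reduced = [op(x, y) for x, y in zip(reduced, arr)]
    return [list(reduced) for _ in arrays]


def simulate_broadcast(values, root):
    """Every worker ends up with the value held by the root worker."""
    return [values[root] for _ in values]


# Two workers hold [0, 1, 2] and [1, 2, 3] before the reduction.
arrays = [local_array(rank) for rank in range(2)]
max_result = simulate_allreduce(arrays, max)
sum_result = simulate_allreduce(arrays, add)

# Only worker 0 holds the dictionary before the broadcast.
bcast_result = simulate_broadcast([{'hello world': 100, 2: 3}, None], root=0)

print('max:', max_result[0])
print('sum:', sum_result[0])
print('broadcast on worker 1:', bcast_result[1])
```

With two workers the element-wise maximum is `[1, 2, 3]` and the sum is `[1, 3, 5]`, and after the broadcast worker 1 holds the same dictionary as worker 0.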