update docs
parent: a607047aa1 · commit: 6a1167611c
@ -13,7 +13,7 @@ All these features comes from the facts about small rabbit:)
* Portable: rabit is lightweight and runs everywhere
  - Rabit is a library instead of a framework; a program only needs to link against the library to run
  - Rabit only relies on a mechanism to start programs, which is provided by most frameworks
  - You can run rabit programs on many platforms, including Yarn (Hadoop) and MPI, using the same code
* Scalable and Flexible: rabit runs fast
  * Rabit programs use Allreduce to communicate, and do not suffer the cost between iterations that the MapReduce abstraction incurs
  - Programs can call rabit functions in any order, as opposed to frameworks where callbacks are offered and called by the framework, i.e. the inversion of control principle
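To make the contrast with MapReduce concrete, the Allreduce pattern can be sketched in plain Python (an illustrative single-process simulation only, not the rabit API; `allreduce_sum` and the worker data are made up for this example):

```python
def allreduce_sum(worker_values):
    # Simulates Allreduce with a sum reduction: every worker contributes
    # a local list of numbers, and every worker receives the elementwise total.
    total = [sum(col) for col in zip(*worker_values)]
    return [list(total) for _ in worker_values]

# Three hypothetical workers, each holding a local partial result.
local = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
results = allreduce_sum(local)
# Every worker now holds the same aggregate: [9.0, 12.0]
```

In rabit the call is collective: each process invokes Allreduce on its local buffer and continues with the reduced result, which is what lets iterative programs avoid re-launching jobs between rounds.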
@ -341,12 +341,11 @@ Rabit is a portable library that can run on multiple platforms.
* This script will restart the program when it exits with -2, so it can be used for [mock test](#link-against-mock-test-library)
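The restart-on-exit-code behaviour can be sketched as a small wrapper loop (a sketch only; `launch` is a stand-in for however the script actually starts the program and collects its exit code):

```python
RESTART_CODE = -2  # exit code the mock test library uses to request a restart

def run_with_restart(launch):
    # Re-launch the program as long as it exits with -2; any other
    # exit code (success or a real failure) ends the loop.
    while True:
        code = launch()
        if code != RESTART_CODE:
            return code

# Example: a fake launcher that "crashes" twice, then exits cleanly.
attempts = iter([-2, -2, 0])
final = run_with_restart(lambda: next(attempts))
```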
#### Running Rabit on Hadoop
* You can use [../tracker/rabit_hadoop.py](../tracker/rabit_hadoop.py) to run rabit programs on hadoop
  * This will start n rabit programs as mappers of MapReduce
  * Each program can read its portion of data from stdin
* Yarn (Hadoop 2.0 or higher) is highly recommended, since Yarn allows specifying the number of CPUs and the amount of memory for each mapper:
  * You can use [../tracker/rabit_yarn.py](../tracker/rabit_yarn.py) to run rabit programs as Yarn applications
  * This will start rabit programs as yarn applications
    - This allows multi-threaded programs on each node, which can be more efficient
    - An easy multi-threading solution is to use OpenMP together with rabit code
* It is also possible to run rabit programs via hadoop streaming; however, YARN is highly recommended.

#### Running Rabit using MPI
* You can submit rabit programs to an MPI cluster using [../tracker/rabit_mpi.py](../tracker/rabit_mpi.py).
@ -358,15 +357,15 @@ tracker scripts, such as [../tracker/rabit_hadoop.py](../tracker/rabit_hadoop.py
You will need to implement a platform-dependent submission function with the following definition:
```python
def fun_submit(nworkers, worker_args, worker_envs):
    """
    Customized submission function that submits nworkers jobs;
    each job must receive worker_args as part of its arguments.
    Note: this can be a lambda closure.

    Parameters
    ----------
    nworkers: number of worker processes to start
    worker_args: additional arguments that need to be passed to the workers;
        this usually includes tracker information such as master_uri and port
    worker_envs: environment variables that need to be set for the workers
    """
```
The submission function should start nworkers processes on the platform, appending worker_args to the end of the other program arguments.
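As a concrete illustration, a local submission function might look like the sketch below (the worker command `worker.py` and the use of `subprocess` are assumptions for this example, not part of rabit):

```python
import os
import subprocess
import sys

def make_command(base_cmd, worker_args):
    # Tracker arguments go after the program's own arguments.
    return base_cmd + worker_args

def local_submit(nworkers, worker_args, worker_envs):
    """Hypothetical fun_submit: start nworkers processes on the local machine."""
    base_cmd = [sys.executable, 'worker.py']  # placeholder worker program
    procs = []
    for _ in range(nworkers):
        env = dict(os.environ)
        env.update({k: str(v) for k, v in worker_envs.items()})
        procs.append(subprocess.Popen(make_command(base_cmd, worker_args), env=env))
    for p in procs:
        p.wait()
```

A real implementation for Hadoop or MPI would replace the Popen call with the platform's own job submission, but the argument handling stays the same.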
@ -374,7 +373,7 @@ Then you can simply call ```tracker.submit``` with fun_submit to submit jobs to
Note that the current rabit tracker does not restart a worker when it dies; restarting a node is handled by the platform, otherwise the fail-restart logic would have to be written into the custom script.
* Fail-restart is usually provided by most platforms.
  * For example, mapreduce will restart a mapper when it fails
  - rabit-yarn provides such functionality in YARN

Fault Tolerance
=====
@ -5,6 +5,7 @@ It also contain links to the Machine Learning packages that uses rabit.

* Contributions of toolkits, examples, and benchmarks are more than welcome!

Toolkits
====
* [KMeans Clustering](kmeans)
@ -14,5 +15,3 @@ Toolkits
  10 times faster than existing packages
- Rabit carries xgboost to distributed environments, inheriting all the benefits of the xgboost
  single node version, and scaling it to even larger problems
12 tracker/README.md Normal file
@ -0,0 +1,12 @@
Trackers
=====
This folder contains tracker scripts that can be used to submit rabit jobs to different platforms;
the usage guidelines are in the scripts themselves.

***Supported Platforms***
* Local demo: [rabit_demo.py](rabit_demo.py)
* MPI: [rabit_mpi.py](rabit_mpi.py)
* Yarn (Hadoop): [rabit_yarn.py](rabit_yarn.py)
  - It is also possible to submit via hadoop streaming with rabit_hadoop_streaming.py
  - However, it is highly recommended to use rabit_yarn.py, because it allocates resources more precisely and better fits machine learning scenarios

@ -1,7 +1,11 @@
#!/usr/bin/python
"""
Deprecated

This is a script to submit rabit job using hadoop streaming.
It will submit the rabit process as mappers of MapReduce.

This script is deprecated, it is highly recommended to use rabit_yarn.py instead
"""
import argparse
import sys
@ -91,6 +95,8 @@ out = out.split('\n')[0].split()
assert out[0] == 'Hadoop', 'cannot parse hadoop version string'
hadoop_version = out[1].split('.')
use_yarn = int(hadoop_version[0]) >= 2
if use_yarn:
    warnings.warn('It is highly recommended to use rabit_yarn.py to submit jobs to yarn instead', stacklevel = 2)

print 'Current Hadoop Version is %s' % out[1]
@ -1,7 +1,7 @@
#!/usr/bin/python
"""
This is a script to submit rabit job via Yarn
rabit will run as a Yarn application
"""
import argparse
import sys
5 yarn/README.md Normal file
@ -0,0 +1,5 @@
rabit-yarn
=====
* This folder contains the application code that allows rabit to run on Yarn.
* You can use [../tracker/rabit_yarn.py](../tracker/rabit_yarn.py) to submit the job
  - run ```./build.sh``` to build the jar before using the script