update docs
parent: a607047aa1 · commit: 6a1167611c
@ -13,7 +13,7 @@ All these features comes from the facts about small rabbit:)
* Portable: rabit is lightweight and runs everywhere
  - Rabit is a library instead of a framework; a program only needs to link against the library to run
  - Rabit only relies on a mechanism to start programs, which is provided by most frameworks
  - You can run rabit programs on many platforms, including Yarn (Hadoop) and MPI, using the same code
* Scalable and Flexible: rabit runs fast
  * Rabit programs use Allreduce to communicate, and do not suffer the cost between iterations that the MapReduce abstraction incurs
  - Programs can call rabit functions in any order, as opposed to frameworks where callbacks are offered and called by the framework, i.e. the inversion of control principle
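To make the contrast with MapReduce concrete, the Allreduce pattern can be sketched in plain Python (an illustrative single-process simulation only, not the rabit API; `allreduce_sum` and the worker data are made up for this example):

```python
def allreduce_sum(worker_values):
    # Simulates Allreduce with a sum reduction: every worker contributes
    # a local list of numbers, and every worker receives the elementwise total.
    total = [sum(col) for col in zip(*worker_values)]
    return [list(total) for _ in worker_values]

# Three hypothetical workers, each holding a local partial result.
local = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
results = allreduce_sum(local)
# Every worker now holds the same aggregate: [9.0, 12.0]
```

In rabit the call is collective: each process invokes Allreduce on its local buffer and continues with the reduced result, which is what lets iterative programs avoid re-launching jobs between rounds.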
@ -341,12 +341,11 @@ Rabit is a portable library that can run on multiple platforms.
* This script will restart the program when it exits with -2, so it can be used for [mock test](#link-against-mock-test-library)
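The restart-on-exit-code behaviour can be sketched as a small wrapper loop (a sketch only; `launch` is a stand-in for however the script actually starts the program and collects its exit code):

```python
RESTART_CODE = -2  # exit code the mock test library uses to request a restart

def run_with_restart(launch):
    # Re-launch the program as long as it exits with -2; any other
    # exit code (success or a real failure) ends the loop.
    while True:
        code = launch()
        if code != RESTART_CODE:
            return code

# Example: a fake launcher that "crashes" twice, then exits cleanly.
attempts = iter([-2, -2, 0])
final = run_with_restart(lambda: next(attempts))
```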
#### Running Rabit on Hadoop
* You can use [../tracker/rabit_hadoop.py](../tracker/rabit_hadoop.py) to run rabit programs on hadoop
  * This will start n rabit programs as mappers of MapReduce
  * Each program can read its portion of data from stdin
* Yarn (Hadoop 2.0 or higher) is highly recommended, since Yarn allows specifying the number of CPUs and the amount of memory for each mapper:
  * You can use [../tracker/rabit_yarn.py](../tracker/rabit_yarn.py) to run rabit programs as Yarn applications
  * This will start rabit programs as yarn applications
    - This allows multi-threaded programs on each node, which can be more efficient
    - An easy multi-threading solution is to use OpenMP together with rabit code
* It is also possible to run rabit programs via hadoop streaming; however, YARN is highly recommended.

#### Running Rabit using MPI
* You can submit rabit programs to an MPI cluster using [../tracker/rabit_mpi.py](../tracker/rabit_mpi.py).
@ -358,15 +357,15 @@ tracker scripts, such as [../tracker/rabit_hadoop.py](../tracker/rabit_hadoop.py
You will need to implement a platform-dependent submission function with the following definition:
```python
def fun_submit(nworkers, worker_args, worker_envs):
    """
    Customized submission function that submits nworkers jobs;
    each job must receive worker_args as part of its arguments.
    Note: this can be a lambda closure.

    Parameters
    ----------
    nworkers: number of worker processes to start
    worker_args: additional arguments that need to be passed to the workers;
        this usually includes tracker information such as master_uri and port
    worker_envs: environment variables that need to be set for the workers
    """
```
The submission function should start nworkers processes on the platform, appending worker_args to the end of the other program arguments.
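As a concrete illustration, a local submission function might look like the sketch below (the worker command `worker.py` and the use of `subprocess` are assumptions for this example, not part of rabit):

```python
import os
import subprocess
import sys

def make_command(base_cmd, worker_args):
    # Tracker arguments go after the program's own arguments.
    return base_cmd + worker_args

def local_submit(nworkers, worker_args, worker_envs):
    """Hypothetical fun_submit: start nworkers processes on the local machine."""
    base_cmd = [sys.executable, 'worker.py']  # placeholder worker program
    procs = []
    for _ in range(nworkers):
        env = dict(os.environ)
        env.update({k: str(v) for k, v in worker_envs.items()})
        procs.append(subprocess.Popen(make_command(base_cmd, worker_args), env=env))
    for p in procs:
        p.wait()
```

A real implementation for Hadoop or MPI would replace the Popen call with the platform's own job submission, but the argument handling stays the same.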
@ -374,7 +373,7 @@ Then you can simply call ```tracker.submit``` with fun_submit to submit jobs to
Note that the current rabit tracker does not restart a worker when it dies; restarting a node is handled by the platform, otherwise the fail-restart logic would have to be written into the custom script.
* Fail-restart is usually provided by most platforms.
  * For example, mapreduce will restart a mapper when it fails
  - rabit-yarn provides such functionality in YARN

Fault Tolerance
=====
@ -5,6 +5,7 @@ It also contain links to the Machine Learning packages that uses rabit.

* Contributions of toolkits, examples, and benchmarks are more than welcome!

Toolkits
====
* [KMeans Clustering](kmeans)
@ -14,5 +15,3 @@ Toolkits
  10 times faster than existing packages
- Rabit carries xgboost to distributed environments, inheriting all the benefits of the xgboost
  single node version, and scaling it to even larger problems
12 tracker/README.md Normal file
@ -0,0 +1,12 @@
Trackers
=====
This folder contains tracker scripts that can be used to submit rabit jobs to different platforms;
the usage guidelines are in the scripts themselves.

***Supported Platforms***
* Local demo: [rabit_demo.py](rabit_demo.py)
* MPI: [rabit_mpi.py](rabit_mpi.py)
* Yarn (Hadoop): [rabit_yarn.py](rabit_yarn.py)
  - It is also possible to submit via hadoop streaming with rabit_hadoop_streaming.py
  - However, it is highly recommended to use rabit_yarn.py, because it allocates resources more precisely and better fits machine learning scenarios

@ -1,7 +1,11 @@
#!/usr/bin/python
"""
Deprecated

This is a script to submit rabit job using hadoop streaming.
It will submit the rabit process as mappers of MapReduce.

This script is deprecated, it is highly recommended to use rabit_yarn.py instead
"""
import argparse
import sys
@ -91,6 +95,8 @@ out = out.split('\n')[0].split()
assert out[0] == 'Hadoop', 'cannot parse hadoop version string'
hadoop_version = out[1].split('.')
use_yarn = int(hadoop_version[0]) >= 2
if use_yarn:
    warnings.warn('It is highly recommended to use rabit_yarn.py to submit jobs to yarn instead', stacklevel = 2)

print 'Current Hadoop Version is %s' % out[1]
@ -1,7 +1,7 @@
#!/usr/bin/python
"""
This is a script to submit rabit job via Yarn
rabit will run as a Yarn application
"""
import argparse
import sys
5 yarn/README.md Normal file
@ -0,0 +1,5 @@
rabit-yarn
=====
* This folder contains the application code that allows rabit to run on Yarn.
* You can use [../tracker/rabit_yarn.py](../tracker/rabit_yarn.py) to submit the job
  - run ```./build.sh``` to build the jar before using the script