yarn is part of hadoop script
parent a120edc56e
commit 6b651176a3
@@ -197,13 +197,10 @@ Rabit is a portable library that can run on multiple platforms.
 * You can use [../tracker/rabit_hadoop.py](../tracker/rabit_hadoop.py) to run rabit programs on hadoop
 * This will start n rabit programs as mappers of MapReduce
 * Each program can read its portion of data from stdin
-* Yarn is highly recommended, since Yarn allows specifying number of cpus and memory of each mapper:
+* Yarn(Hadoop 2.0 or higher) is highly recommended, since Yarn allows specifying number of cpus and memory of each mapper:
   - This allows multi-threading programs in each node, which can be more efficient
   - An easy multi-threading solution could be to use OpenMP with rabit code
 
-#### Running Rabit on Yarn
-* To Be modified from [../tracker/rabit_hadoop.py](../tracker/rabit_hadoop.py)
-
 #### Running Rabit using MPI
 * You can submit rabit programs to an MPI cluster using [../tracker/rabit_mpi.py](../tracker/rabit_mpi.py).
 * If you linked your code against librabit_mpi.a, then you can directly use mpirun to submit the job
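The documentation change above says that Yarn allows specifying the number of cpus and the memory of each mapper. As a rough illustration only, the sketch below builds the generic `-D` options that could carry such a request on a Hadoop 2 command line. The helper name `yarn_resource_flags` is hypothetical and is not part of this commit. The property names `mapreduce.map.cpu.vcores` and `mapreduce.map.memory.mb` are the standard Hadoop 2 MapReduce settings, and it is an assumption that the tracker script would use them in this form.

```python
def yarn_resource_flags(vcores, memory_mb):
    """Return generic -D options requesting per-mapper cpus and memory.

    Hypothetical helper for illustration; not taken from rabit_hadoop.py.
    """
    if vcores < 1 or memory_mb < 1:
        raise ValueError("vcores and memory_mb must be positive")
    return [
        # number of virtual cores requested for each map task
        "-D", "mapreduce.map.cpu.vcores=%d" % vcores,
        # memory in megabytes requested for each map task
        "-D", "mapreduce.map.memory.mb=%d" % memory_mb,
    ]


if __name__ == "__main__":
    # Example: ask for 4 cores and 4 GB per mapper
    print(" ".join(yarn_resource_flags(4, 4096)))
```

With several cores per mapper, each rabit worker could then run a multi-threaded program (for example with OpenMP) inside its own container.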
@@ -89,8 +89,7 @@ assert out[0] == 'Hadoop', 'cannot parse hadoop version string'
 hadoop_version = out[1].split('.')
 use_yarn = int(hadoop_version[0]) >= 2
 
-if not use_yarn:
-    print 'Current Hadoop Version is %s' % out[1]
+print 'Current Hadoop Version is %s' % out[1]
 
 def hadoop_streaming(nworker, worker_args, use_yarn):
     fset = set()
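The version check in this hunk can be read as a small standalone function. The sketch below is one way to express it, assuming the input is the text printed by `hadoop version`, whose first line has the form `Hadoop X.Y.Z`. The function name `parse_hadoop_version` is hypothetical; the commit itself keeps this logic at module level and does not show how `out` is obtained.

```python
def parse_hadoop_version(text):
    """Return (version, use_yarn) from the output of `hadoop version`.

    Illustrative only: mirrors the check in the hunk above, where a major
    version of 2 or higher is taken to mean that Yarn is available.
    """
    # First line, split on whitespace, e.g. ['Hadoop', '2.6.0']
    out = text.strip().split("\n")[0].split()
    if len(out) < 2 or out[0] != "Hadoop":
        raise ValueError("cannot parse hadoop version string")
    version = out[1]
    use_yarn = int(version.split(".")[0]) >= 2
    return version, use_yarn


if __name__ == "__main__":
    version, use_yarn = parse_hadoop_version("Hadoop 2.6.0\n")
    print("Current Hadoop Version is %s" % version)
    print("use_yarn = %s" % use_yarn)
```

After this commit the version message is printed unconditionally, so the user sees the detected version whether or not Yarn is selected.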