doc
parent f332750359
commit c7282acb2a
@ -7,3 +7,22 @@ Rabit Documentation

Parameters
====
This section lists all the parameters that can be passed to the rabit::Init function as argv.
All parameters are passed in as strings in the format ```parameter-name=parameter-value```.
In most settings these parameters have default values or are detected automatically,
and do not need to be configured manually.

* rabit_tracker_uri [passed in automatically by tracker]
  - The URI/IP address of the rabit tracker
* rabit_tracker_port [passed in automatically by tracker]
  - The port of the rabit tracker
* rabit_task_id [automatically detected]
  - The unique identifier of the computing process
  - When running on Hadoop, this is automatically extracted from an environment variable
* rabit_reduce_buffer [default = 256MB]
  - The memory buffer used to store intermediate results of reduction
  - Format "digits + unit", for example 128M or 1G
* rabit_global_replica [default = 5]
  - Number of replica copies of the result kept for each Allreduce/Broadcast call
* rabit_local_replica [default = 2]
  - Number of replicas of the local model kept in each checkpoint
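
As a rough illustration of the ```parameter-name=parameter-value``` format and the "digits + unit" buffer size convention, the sketch below formats one argument string and converts a size string to bytes. `MakeParam` and `ParseSizeBytes` are hypothetical helpers written for this example and are not part of rabit. The unit handling (K, M, G as powers of 1024, with an optional trailing B as in 256MB) is an assumption, not a description of how rabit parses the value.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of rabit): formats one argument as name=value.
std::string MakeParam(const std::string& name, const std::string& value) {
  return name + "=" + value;
}

// Hypothetical helper (not part of rabit): converts "digits + unit" text such
// as "128M" or "1G" into a byte count. Assumption: K, M and G are powers of
// 1024, an optional trailing 'B' is allowed, and no unit means plain bytes.
std::uint64_t ParseSizeBytes(const std::string& text) {
  std::size_t pos = 0;
  std::uint64_t digits = 0;
  // Read the leading decimal digits.
  while (pos < text.size() &&
         std::isdigit(static_cast<unsigned char>(text[pos]))) {
    digits = digits * 10 + static_cast<std::uint64_t>(text[pos] - '0');
    ++pos;
  }
  if (pos == 0) {
    throw std::invalid_argument("size must start with digits: " + text);
  }
  std::uint64_t scale = 1;
  if (pos < text.size()) {
    // Read the unit letter.
    switch (std::toupper(static_cast<unsigned char>(text[pos]))) {
      case 'K': scale = 1ULL << 10; break;
      case 'M': scale = 1ULL << 20; break;
      case 'G': scale = 1ULL << 30; break;
      default: throw std::invalid_argument("unknown unit in: " + text);
    }
    ++pos;
    // Accept an optional trailing 'B' ("MB", "GB").
    if (pos < text.size() &&
        std::toupper(static_cast<unsigned char>(text[pos])) == 'B') {
      ++pos;
    }
  }
  if (pos != text.size()) {
    throw std::invalid_argument("trailing characters in: " + text);
  }
  return digits * scale;
}
```

For instance, `MakeParam("rabit_reduce_buffer", "128M")` yields the string `rabit_reduce_buffer=128M`, which could then be placed in the argv passed to rabit::Init.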
@ -268,6 +268,9 @@ Because the different nature of the two types of models, different strategy will
nodes (selected using a ring replication strategy). The checkpoint is saved only in memory, without touching the disk, which makes rabit programs more efficient.
Users are encouraged to use only ```global_model``` when it is sufficient, for better efficiency.

To enable a model class to be checkpointed, the user can implement a [serialization interface](../include/rabit_serialization.h). The serialization interface already
provides serialization functions for STL vector and string. For the Python API, the user can checkpoint any Python object that can be pickled.
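
A minimal sketch of what a checkpointable model class might look like in C++. To keep the example self-contained, `Stream`, `Serializable`, and `MemoryStream` are simplified stand-ins defined here. They are assumptions made for illustration and do not reproduce the declarations in rabit_serialization.h, which should be consulted for the real interface. `LinearModel` and `RoundTrip` are likewise hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Stand-in byte stream (hypothetical, not rabit's stream type).
struct Stream {
  virtual ~Stream() {}
  virtual std::size_t Read(void* ptr, std::size_t size) = 0;
  virtual void Write(const void* ptr, std::size_t size) = 0;
};

// Stand-in serialization interface (hypothetical, not rabit's declaration).
struct Serializable {
  virtual ~Serializable() {}
  virtual void Load(Stream& fi) = 0;
  virtual void Save(Stream& fo) const = 0;
};

// In-memory stream, used here only to exercise the round trip.
struct MemoryStream : Stream {
  std::string buffer;
  std::size_t cursor = 0;
  std::size_t Read(void* ptr, std::size_t size) override {
    // Copy at most the number of bytes that remain in the buffer.
    const std::size_t n = std::min(size, buffer.size() - cursor);
    if (n != 0) std::memcpy(ptr, buffer.data() + cursor, n);
    cursor += n;
    return n;
  }
  void Write(const void* ptr, std::size_t size) override {
    buffer.append(static_cast<const char*>(ptr), size);
  }
};

// Example model: a weight vector stored as a length followed by raw floats.
struct LinearModel : Serializable {
  std::vector<float> weights;
  void Save(Stream& fo) const override {
    const std::uint64_t n = weights.size();
    fo.Write(&n, sizeof(n));
    if (n != 0) fo.Write(weights.data(), n * sizeof(float));
  }
  void Load(Stream& fi) override {
    std::uint64_t n = 0;
    fi.Read(&n, sizeof(n));
    weights.resize(static_cast<std::size_t>(n));
    if (n != 0) fi.Read(weights.data(), n * sizeof(float));
  }
};

// Saves a model to memory and loads it back into a fresh object.
LinearModel RoundTrip(const LinearModel& model) {
  MemoryStream stream;
  model.Save(stream);
  LinearModel restored;
  restored.Load(stream);
  return restored;
}
```

The point of the design is that the model decides its own byte layout, so `Save` and `Load` must read and write the same fields in the same order.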

There is a special Checkpoint function called [LazyCheckpoint](http://homes.cs.washington.edu/~tqchen/rabit/doc/namespacerabit.html#a99f74c357afa5fba2c80cc0363e4e459),
which can be used for ```global_model```-only cases under certain conditions.
When LazyCheckpoint is called, no action is taken and the rabit engine only remembers the pointer to the model.
@ -282,6 +285,7 @@ LazyCheckPoint, code1, Allreduce, code2, Broadcast, code3, LazyCheckPoint
The user must only change the model in code3. This condition can be satisfied in many scenarios, and the user can use LazyCheckpoint to further
improve the efficiency of the program.
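
To illustrate why the model may only be changed at a safe point, here is a toy stand-in for an engine that remembers a pointer instead of copying. `ToyEngine` and its methods are hypothetical and do not reflect rabit's implementation. In particular, when the real engine actually reads the model is not modeled here, so `Materialize` is an assumption of this sketch.

```cpp
#include <vector>

// Toy stand-in (hypothetical; not rabit's engine).
class ToyEngine {
 public:
  // Eager checkpoint: copies the model immediately.
  void CheckPoint(const std::vector<double>& model) {
    saved_ = model;
    pending_ = nullptr;
  }
  // Lazy checkpoint: takes no copy, only remembers where the model lives.
  void LazyCheckPoint(const std::vector<double>* model) { pending_ = model; }
  // Assumed hook for this toy: takes the deferred copy when one is needed.
  const std::vector<double>& Materialize() {
    if (pending_ != nullptr) {
      saved_ = *pending_;
      pending_ = nullptr;
    }
    return saved_;
  }

 private:
  std::vector<double> saved_;
  const std::vector<double>* pending_ = nullptr;
};
```

In this toy, a model that is modified after `LazyCheckPoint` but before `Materialize` ends up saved in its modified state, whereas `CheckPoint` keeps the state from the time of the call. That difference is the reason a lazy scheme has to restrict where the model can change.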

Compile Programs with Rabit
====
Rabit is a portable library. To use it, you only need to include the rabit header file.