tqchen 2015-01-18 19:55:04 -08:00
parent f332750359
commit c7282acb2a
2 changed files with 23 additions and 0 deletions


@ -7,3 +7,22 @@ Rabit Documentation
Parameters
====
This section lists all the parameters that can be passed to the rabit::Init function as argv.
All parameters are passed in as strings in the format ```parameter-name=parameter-value```.
In most settings these parameters have default values or are detected automatically,
and do not need to be configured manually.
* rabit_tracker_uri [passed in automatically by tracker]
  - The URI/IP of the rabit tracker
* rabit_tracker_port [passed in automatically by tracker]
  - The port of the rabit tracker
* rabit_task_id [automatically detected]
  - The unique identifier of the computing process
  - When running on Hadoop, this is automatically extracted from an environment variable
* rabit_reduce_buffer [default = 256MB]
  - The memory buffer used to store intermediate results of reduction
  - Format "digits + unit", for example 128M or 1G
* rabit_global_replica [default = 5]
  - Number of replicated copies of the result kept for each Allreduce/Broadcast call
* rabit_local_replica [default = 2]
  - Number of replicas of the local model kept in a checkpoint
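
To make the ```parameter-name=parameter-value``` format and the "digits + unit" size format concrete, here is a minimal sketch of how such strings could be interpreted. `SplitParam` and `ParseSize` are hypothetical helpers written for this illustration and are not part of rabit; the sketch also assumes that K, M and G are powers of 1024 and that a trailing `B` is optional, which the text above does not specify.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper (not part of rabit): split "name=value" at the first '='.
std::pair<std::string, std::string> SplitParam(const std::string &arg) {
  const std::string::size_type pos = arg.find('=');
  if (pos == std::string::npos || pos == 0) {
    throw std::invalid_argument("expected name=value, got: " + arg);
  }
  return {arg.substr(0, pos), arg.substr(pos + 1)};
}

// Hypothetical helper (not part of rabit): convert "digits + unit" to bytes.
// Assumption: K, M, G are powers of 1024 and a trailing 'B' is optional.
// Overflow of very large values is not checked in this sketch.
std::uint64_t ParseSize(const std::string &text) {
  std::size_t i = 0;
  std::uint64_t value = 0;
  while (i < text.size() && std::isdigit(static_cast<unsigned char>(text[i]))) {
    value = value * 10 + static_cast<std::uint64_t>(text[i] - '0');
    ++i;
  }
  if (i == 0) {
    throw std::invalid_argument("size must start with digits: " + text);
  }
  std::uint64_t scale = 1;
  if (i < text.size()) {
    const int unit = std::toupper(static_cast<unsigned char>(text[i]));
    if (unit == 'K') { scale = 1ULL << 10; ++i; }
    else if (unit == 'M') { scale = 1ULL << 20; ++i; }
    else if (unit == 'G') { scale = 1ULL << 30; ++i; }
  }
  // Accept an optional trailing 'B', as in "256MB".
  if (i < text.size() && std::toupper(static_cast<unsigned char>(text[i])) == 'B') {
    ++i;
  }
  if (i != text.size()) {
    throw std::invalid_argument("unrecognized unit in: " + text);
  }
  return value * scale;
}
```

Under these assumptions, `SplitParam("rabit_reduce_buffer=128M")` yields the name `rabit_reduce_buffer` and the value `128M`, and `ParseSize("128M")` gives 128 * 1024 * 1024 bytes.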


@ -268,6 +268,9 @@ Because the different nature of the two types of models, different strategy will
nodes (selected using a ring replication strategy). The checkpoint is saved only in memory, without touching the disk, which makes rabit programs more efficient.
Users are encouraged to use only ```global_model``` whenever that is sufficient, for better efficiency.
To enable a model class to be checkpointed, the user can implement a [serialization interface](../include/rabit_serialization.h). The serialization interface already
provides serialization functions for STL vector and string. With the Python API, the user can checkpoint any Python object that can be pickled.
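
As a rough illustration of what implementing such an interface involves, the sketch below gives a model class a `Save`/`Load` pair over a byte stream. `MemoryStream` and `LinearModel` are hypothetical stand-ins invented for this example; the actual base class, stream type and method signatures are the ones declared in `rabit_serialization.h` and may differ from what is shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory byte stream, standing in for the stream type
// that the real serialization header provides.
class MemoryStream {
 public:
  void Write(const void *data, std::size_t size) {
    const char *p = static_cast<const char *>(data);
    buffer_.insert(buffer_.end(), p, p + size);
  }
  void Read(void *out, std::size_t size) {
    if (size == 0) return;
    if (size > buffer_.size() - pos_) {
      throw std::runtime_error("read past end of stream");
    }
    std::memcpy(out, buffer_.data() + pos_, size);
    pos_ += size;
  }

 private:
  std::vector<char> buffer_;
  std::size_t pos_ = 0;
};

// Hypothetical model class: each field is written as a length followed by
// its raw contents, and read back in the same order.
struct LinearModel {
  std::vector<float> weights;
  std::string name;

  void Save(MemoryStream *fo) const {
    const std::uint64_t nweight = weights.size();
    fo->Write(&nweight, sizeof(nweight));
    if (nweight != 0) fo->Write(weights.data(), nweight * sizeof(float));
    const std::uint64_t nname = name.size();
    fo->Write(&nname, sizeof(nname));
    if (nname != 0) fo->Write(name.data(), nname);
  }

  void Load(MemoryStream *fi) {
    std::uint64_t nweight = 0;
    fi->Read(&nweight, sizeof(nweight));
    weights.resize(nweight);
    if (nweight != 0) fi->Read(weights.data(), nweight * sizeof(float));
    std::uint64_t nname = 0;
    fi->Read(&nname, sizeof(nname));
    name.resize(nname);
    if (nname != 0) fi->Read(&name[0], nname);
  }
};
```

Saving a `LinearModel` into a `MemoryStream` and loading it into a fresh instance should reproduce the same weights and name, which is the property a checkpointable model needs.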
There is a special Checkpoint function called [LazyCheckpoint](http://homes.cs.washington.edu/~tqchen/rabit/doc/namespacerabit.html#a99f74c357afa5fba2c80cc0363e4e459),
which can be used in ```global_model```-only cases under certain conditions.
When LazyCheckpoint is called, no action is taken and the rabit engine only remembers the pointer to the model.
@ -282,6 +285,7 @@ LazyCheckPoint, code1, Allreduce, code2, Broadcast, code3, LazyCheckPoint
The user must change the model only in code3. This condition can be satisfied in many scenarios, and the user can use LazyCheckpoint to further
improve the efficiency of the program.
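
The restriction follows from the engine keeping only a pointer. The toy sketch below is not rabit's implementation: `ToyEngine` and its methods are hypothetical, and it assumes that the remembered pointer is read only later, when the checkpointed state is actually needed. Under that assumption, any change made to the model before the collective calls have finished would leak into the state that gets recovered.

```cpp
#include <string>

// Toy stand-in for an engine, used only to illustrate timing.
// This is not rabit's implementation; all names here are hypothetical.
class ToyEngine {
 public:
  // Eager checkpoint: copy the model state at call time.
  void CheckPoint(const std::string *model) {
    snapshot_ = *model;
    lazy_model_ = nullptr;
  }
  // Lazy checkpoint: take no action, only remember where the model lives.
  void LazyCheckPoint(const std::string *model) { lazy_model_ = model; }
  // Assumed recovery path: the state is read at the time it is requested.
  std::string RecoverState() const {
    return lazy_model_ != nullptr ? *lazy_model_ : snapshot_;
  }

 private:
  std::string snapshot_;
  const std::string *lazy_model_ = nullptr;
};

// Returns the state a recovering peer would see if the model were modified
// before the collective calls completed.
std::string StateAfterEarlyChange(bool lazy) {
  ToyEngine engine;
  std::string model = "v1";
  if (lazy) {
    engine.LazyCheckPoint(&model);
  } else {
    engine.CheckPoint(&model);
  }
  model = "v2";  // a change made too early, e.g. in code1 or code2
  return engine.RecoverState();
}
```

In this toy, the eager path still reports the checkpointed state after the early change, while the lazy path reports the modified one, which is why the model should stay untouched until code3.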
Compile Programs with Rabit
====
Rabit is a portable library; to use it, you only need to include the rabit header file.