We now pass the folder where the round instances are saved. The problem is that calling utils::Check or utils::Assert on one or two nodes shuts down all of them. If only those nodes were shut down, this would work. There may be some other mechanism to shut down a particular node. Tianqi?
# Test Case example config
# You configure which methods should fail
# Format <round>_<rank> = <operation>
# <operation> can be one of the following = allreduce, broadcast, loadcheckpoint, checkpoint

1_0 = allreduce
1_1 = broadcast

2_2 = allreduce
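
To make the format above concrete, here is a minimal sketch of how such a file could be read. It is not the actual test harness: the function name `parse_failure_config` and the choice to return a dict keyed by `(round, rank)` are assumptions made for illustration. Only the `<round>_<rank> = <operation>` layout and the four operation names come from the config itself.

```python
import re

# Operation names listed in the config comments.
VALID_OPERATIONS = {"allreduce", "broadcast", "loadcheckpoint", "checkpoint"}

# Matches entries such as "1_0 = allreduce".
_ENTRY = re.compile(r"^(\d+)_(\d+)\s*=\s*(\w+)$")


def parse_failure_config(text):
    """Hypothetical helper: map (round, rank) to the operation that should fail."""
    failures = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and '#' comments.
        if not line or line.startswith("#"):
            continue
        match = _ENTRY.match(line)
        if match is None:
            raise ValueError(
                f"line {lineno}: expected <round>_<rank> = <operation>, got {line!r}"
            )
        round_no = int(match.group(1))
        rank = int(match.group(2))
        operation = match.group(3)
        if operation not in VALID_OPERATIONS:
            raise ValueError(f"line {lineno}: unknown operation {operation!r}")
        failures[(round_no, rank)] = operation
    return failures


if __name__ == "__main__":
    example = """
    # Format <round>_<rank> = <operation>
    1_0 = allreduce
    1_1 = broadcast

    2_2 = allreduce
    """
    for (round_no, rank), operation in sorted(parse_failure_config(example).items()):
        print(f"round {round_no}, rank {rank}: fail on {operation}")
```

Keying on `(round, rank)` would let each node look up only its own entry, which fits the goal of failing specific nodes rather than all of them.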