56aad86231 adding incomplete kmeans. I'm having a problem with the broadcast, and still need to implement the logic
nachocano
2014-12-03 01:16:13 -08:00
ed1de6df80 change AllReduce to Allreduce
tqchen
2014-12-02 21:11:48 -08:00
2e536eda29 check in the recover strategy
tqchen
2014-11-30 11:42:59 -08:00
155ed3a814 seems an OK version of reset, start to work on decide exec
tqchen
2014-11-29 22:22:51 -08:00
5b0bb53184 refactor code style, reset link still needs thought
tqchen
2014-11-29 20:15:27 -08:00
42505f473d finish reset link log
tqchen
2014-11-29 15:14:43 -08:00
98756c068a livelock in oob send recv
tqchen
2014-11-28 21:58:15 -08:00
aa54a038f2 livelock in oob send recv
tqchen
2014-11-28 21:56:58 -08:00
a30075794b initial version of robust engine, add discard link, need more random mock tests, next milestone will be recovery
tqchen
2014-11-28 15:56:12 -08:00
a8128493c2 execute it like this: ./test.sh 4 4000 testcase0.conf ./
nachocano
2014-11-28 01:48:26 -08:00
faed8285cd execute it like this: ./test.sh 4 4000 testcase0.conf to obtain a successful execution
nachocano
2014-11-28 00:16:35 -08:00
21f3f3eec4 adding const to variable to comply with google code convention... may need to change more stuff though. Tianqi, what else do you mean? Spaces, tabs, names?
nachocano
2014-11-27 17:03:31 -08:00
2f1ba40786 change in socket, to pass out error code
tqchen
2014-11-27 16:17:07 -08:00
c565104491 adding some references to mock inside TEST preprocessor directive. It shouldn't be an assert because that shuts down the process. Instead it should check the value and return some sort of error, so that we can recover. The mock contains queues, indexed by the rank of the process. For each node, you can configure the behavior you expect (success or failure for now) when you call any of the methods (AllReduce, Broadcast, LoadCheckPoint and CheckPoint)... If you call AllReduce several times, the outputs will pop from the queue, i.e., first you can retrieve a success, then a failure and so on. Pretty basic for now, need to tune it better
nachocano
2014-11-26 17:24:29 -08:00
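The queue-per-rank mock described in the commit above could look roughly like the sketch below. This is not the code from that commit: `MockEngine`, `Op`, `Expect`, `Call`, and `DemoScriptedFailure` are hypothetical names, outcomes are modeled as plain booleans (true for success, false for failure), and the choice to treat an empty queue as success is an assumption made only for illustration.

```cpp
#include <cassert>
#include <map>
#include <queue>

// Hypothetical identifiers for the four methods named in the commit message.
enum class Op { kAllReduce, kBroadcast, kLoadCheckPoint, kCheckPoint };

// Minimal sketch of a mock whose scripted outcomes are stored in queues,
// indexed first by process rank and then by method.
class MockEngine {
 public:
  // Queue the outcome (true = success, false = failure) that the next
  // unconsumed call to `op` on process `rank` should report.
  void Expect(int rank, Op op, bool success) {
    outcomes_[rank][op].push(success);
  }

  // Pop and return the next scripted outcome for (rank, op).
  // Assumption: when nothing is scripted, the call reports success.
  bool Call(int rank, Op op) {
    auto rank_it = outcomes_.find(rank);
    if (rank_it == outcomes_.end()) return true;
    auto op_it = rank_it->second.find(op);
    if (op_it == rank_it->second.end() || op_it->second.empty()) return true;
    bool result = op_it->second.front();
    op_it->second.pop();
    return result;
  }

 private:
  std::map<int, std::map<Op, std::queue<bool>>> outcomes_;
};

// Usage: rank 1 succeeds on its first AllReduce, then fails on the second.
// The caller checks the returned value instead of asserting, so a failure
// can be handled (for example by triggering recovery) rather than aborting.
inline bool DemoScriptedFailure() {
  MockEngine mock;
  mock.Expect(1, Op::kAllReduce, true);
  mock.Expect(1, Op::kAllReduce, false);
  bool first = mock.Call(1, Op::kAllReduce);   // scripted success
  bool second = mock.Call(1, Op::kAllReduce);  // scripted failure
  bool third = mock.Call(1, Op::kAllReduce);   // queue empty, default success
  return first && !second && third;
}
```

Returning the outcome rather than asserting on it mirrors the point made in the commit message: an assert would terminate the process, while a returned error lets the caller attempt recovery.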
54fcff189f dummy mock for now
nachocano
2014-11-26 16:37:23 -08:00