7091 Commits

Author SHA1 Message Date
tqchen
155ed3a814 seems a OK version of reset, start to work on decide exec 2014-11-29 22:22:51 -08:00
tqchen
5b0bb53184 refactor code style, reset link still need thoughts 2014-11-29 20:15:27 -08:00
tqchen
42505f473d finish reset link log 2014-11-29 15:14:43 -08:00
tqchen
98756c068a livelock in oob send recv 2014-11-28 21:58:15 -08:00
tqchen
aa54a038f2 livelock in oob send recv 2014-11-28 21:56:58 -08:00
tqchen
a30075794b initial version of robust engine, add discard link, need more random mock test, next milestone will be recovery 2014-11-28 15:56:12 -08:00
nachocano
a8128493c2 execute it like this: ./test.sh 4 4000 testcase0.conf ./
Now we are passing the folder where the round instances are saved.
The problem is that calling utils::Check or utils::Assert on 1 or 2 nodes shuts down all of them. Only those nodes should be shut down for this to work. There may be some other mechanism to shut down a particular node. Tianqi?
2014-11-28 01:48:26 -08:00
nachocano
faed8285cd execute it like ./test.sh 4 4000 testcase0.conf to obtain a successful execution
updating mock. It now wraps the calls to sync and reads config from configuration file.
I believe it's better not to use the preprocessor directive, i.e. not to put any test code in the engine_tcp. I just call the mock in the test_allreduce file. It's a file purely for testing purposes, so it's fine to use the mock there.
2014-11-28 00:16:35 -08:00
nachocano
21f3f3eec4 adding const to variable to comply with Google code convention...
may need to change more stuff though. Tianqi, what else do you mean? Spaces, tabs, names?
2014-11-27 17:03:31 -08:00
tqchen
2f1ba40786 change in socket, to pass out error code 2014-11-27 16:17:07 -08:00
nachocano
c565104491 adding some references to mock inside TEST preprocessor directive.
It shouldn't be an assert because that shuts down the process. Instead, it should check the value and return some sort of error so that we can recover.
The mock contains queues, indexed by the rank of the process. For each node, you can configure the behavior you expect (success or failure for now) when you call any of the methods (AllReduce, Broadcast, LoadCheckPoint and CheckPoint). If you call AllReduce several times, the results are popped from the queue in order: first you can retrieve a success, then a failure, and so on.
Pretty basic for now; needs more tuning.
2014-11-26 17:24:29 -08:00
nachocano
54fcff189f dummy mock for now 2014-11-26 16:37:23 -08:00
Tianqi Chen
5ae99372d6 Update simple_dmatrix-inl.hpp 2014-11-26 09:13:49 -08:00
Tianqi Chen
be5fb800d5 Merge pull request #112 from tfgit/master
Fixed README
2014-11-25 19:29:41 -08:00
Ted Fujimoto
baf41d589d Fixed README 2014-11-25 22:17:36 -05:00
Tianqi Chen
8d7dbc65b3 Merge pull request #111 from tfgit/master
OS X OpenMP support instructions
2014-11-25 19:12:42 -08:00
Ted Fujimoto
198489438f Added OS X OpenMP instructions 2014-11-25 21:42:13 -05:00
Ted Fujimoto
c356a0acc2 Remove tools folder 2014-11-25 21:27:50 -05:00
tqchen
d37f38c455 initial version of allreduce 2014-11-25 16:15:56 -08:00
Tianqi Chen
5e5bdda491 Initial commit 2014-11-25 14:37:18 -08:00
Tianqi Chen
cdcfa5687a Update socket.h 2014-11-23 22:46:57 -08:00
tqchen
f53be2884a ok 2014-11-23 22:42:44 -08:00
Tianqi Chen
f805ecb5f3 fix a bug in node sindex set 2014-11-23 22:35:30 -08:00
tqchen
3e162ceda6 windows strange 2014-11-23 22:21:15 -08:00
tqchen
35bf2101fe seems a prob in win 2014-11-23 22:18:28 -08:00
Tianqi Chen
fde580b08e fix windows run 2014-11-23 22:12:55 -08:00
tqchen
77ffd0465b ok 2014-11-23 21:36:22 -08:00
tqchen
78ca72b9c7 start work on win 2014-11-23 21:34:15 -08:00
tqchen
d2f151ef5a bring it back alive again 2014-11-23 21:27:16 -08:00
Tianqi Chen
7f3dc967cf changes in socket, a bit work in linux side first 2014-11-23 21:21:52 -08:00
tqchen
db2adb6885 start check windows compatibility 2014-11-23 20:59:10 -08:00
Tianqi Chen
2e444f8338 remove warning from MSVC, need another round of check 2014-11-23 20:52:13 -08:00
tqchen
b55fe80350 add row map example 2014-11-23 18:15:42 -08:00
tqchen
372de9f968 check in conf 2014-11-23 17:35:21 -08:00
tqchen
373620503a ok 2014-11-23 14:08:34 -08:00
tqchen
5f08313cb2 make wrapper ok 2014-11-23 14:03:59 -08:00
tqchen
69b2f31098 bugfix in allreduce 2014-11-23 11:31:34 -08:00
tqchen
115424826b basic test pass 2014-11-23 11:15:48 -08:00
tqchen
c499dd0f0c start testing allreduce 2014-11-22 22:55:43 -08:00
tqchen
cb1c34aef0 add nonblocking mode 2014-11-22 17:15:05 -08:00
tqchen
67c5d8a2e6 allreduce server side ok, need to add master 2014-11-22 17:12:19 -08:00
tqchen
4864220702 have the function, ready, need initializer 2014-11-22 12:15:30 -08:00
tqchen
7ec3fc936a check in allreduce tcp, check if there could be more concise form 2014-11-21 22:54:11 -08:00
tqchen
b6e1b19205 checkin socket module 2014-11-21 16:09:28 -08:00
tqchen
84dcab6795 checkin socket module 2014-11-21 16:09:26 -08:00
Tianqi Chen
c29a600d46 Update README.md 2014-11-21 09:48:59 -08:00
tqchen
168bb0d0c9 add predict leaf indices 2014-11-21 09:32:09 -08:00
Tianqi Chen
6ed82edad7 Merge pull request #106 from tqchen/master
pull master into unity
2014-11-21 08:56:01 -08:00
Tianqi Chen
d4103ea7ea Update README.md 2014-11-20 22:01:26 -08:00
Tong He
c16e0f6809 Update predict.xgb.Booster.R
add parameter missing
2014-11-20 15:19:53 -08:00