57 Commits

Author SHA1 Message Date
tqchen
6dbaddd2b9 ok 2015-01-14 22:11:00 -08:00
tqchen
968b33ec79 set all tracker threads to daemon 2015-01-14 12:05:00 -08:00
tqchen
87c7817124 add lazy check, need test, find a race condition 2015-01-14 11:58:43 -08:00
tqchen
348a1e7619 change default behavior to behave normal 2015-01-13 22:21:15 -08:00
tqchen
532575b752 ok 2015-01-13 14:41:37 -08:00
tqchen
3419cf9aa7 add auto caching of python in hadoop script, mock test module to python, with checkpt 2015-01-13 14:29:10 -08:00
tqchen
1b4921977f update doc 2015-01-03 05:20:18 -08:00
tqchen
bfb9aa3d77 add native script 2014-12-30 04:37:50 -08:00
tqchen
d64d0ef1dc cleanup submission script 2014-12-29 06:11:58 -08:00
tqchen
12399a1d42 add more mocktest 2014-12-21 17:59:12 -08:00
tqchen
e40047f9c2 new mock test 2014-12-20 18:38:54 -08:00
tqchen
925d014271 change file structure 2014-12-20 16:19:54 -08:00
tqchen
6151899ce2 add tracker print 2014-12-19 18:40:06 -08:00
tqchen
6bf282c6c2 isolate iserializable 2014-12-19 17:36:42 -08:00
tqchen
8c35cff02c improve script 2014-12-19 04:21:16 -08:00
tqchen
9f42b78a18 improve tracker script 2014-12-19 04:20:45 -08:00
tqchen
1754fdbf4e enable support for lambda preprocessing function, and c++11 2014-12-19 02:00:43 -08:00
tqchen
58331067f8 cleanup testcases 2014-12-18 23:50:59 -08:00
tqchen
c8faed0b54 pass local model recover test 2014-12-18 18:53:58 -08:00
tqchen
dbd05a65b5 nice fix, start check local check 2014-12-18 18:39:24 -08:00
tqchen
3f22596e3c check in license 2014-12-09 20:57:54 -08:00
tqchen
2750679270 normal state running ok 2014-12-07 20:57:29 -08:00
nachocano
20b03e781c to run all executables 2014-12-06 15:37:09 -08:00
nachocano
fcf2f0a03d to stderr 2014-12-06 15:22:29 -08:00
nachocano
659b9cd517 changing number of repetitions 2014-12-06 15:14:14 -08:00
nachocano
9ed59e71f6 speed runner 2014-12-06 12:09:40 -08:00
nachocano
e0053c62e1 adding executable 2014-12-06 12:05:08 -08:00
nachocano
8f0d7d1d3e changing to -ho not to conflict with help 2014-12-06 12:01:05 -08:00
nachocano
771891491c Merge branch 'master' of https://github.com/tqchen/allreduce 2014-12-06 11:59:22 -08:00
nachocano
f203d13efc speed runner 2014-12-06 11:59:16 -08:00
tqchen
4a7d84e861 chg string bcast 2014-12-06 11:25:08 -08:00
tqchen
1519f74f3c ok 2014-12-06 11:20:52 -08:00
tqchen
0e012cb05e add speed test 2014-12-06 11:05:24 -08:00
tqchen
19631ecef6 more tracker renaming 2014-12-06 09:24:12 -08:00
nachocano
bb7d6814a7 creating initial version of hadoop submit script. Not working. Not sure how to get the master uri and port. I believe I cannot do it before I launch the job. Updating the name from submit_job to submit_job_mpi 2014-12-05 03:27:02 -08:00
tqchen
90b9f1a98a add keepalive script 2014-12-03 15:04:30 -08:00
tqchen
7a983a4079 add keepalive 2014-12-03 13:21:30 -08:00
tqchen
8a6768763d bug fixed ver 2014-12-03 11:51:39 -08:00
tqchen
ed1de6df80 change AllReduce to Allreduce 2014-12-02 21:11:48 -08:00
tqchen
0a3300d773 rabit run on MPI 2014-12-02 11:20:19 -08:00
tqchen
dcea64c838 check in model recover 2014-12-01 21:41:37 -08:00
tqchen
255218a2f3 change in interface, seems resetlink is still bad 2014-12-01 21:39:51 -08:00
tqchen
b76cd5858c seems ok version 2014-12-01 20:18:25 -08:00
tqchen
46b5d46111 fix one bug, another comes 2014-12-01 19:53:41 -08:00
tqchen
993ff8bb91 find one bug, continue to next one 2014-12-01 19:34:27 -08:00
tqchen
337840d29b recover not yet working 2014-12-01 16:57:26 -08:00
tqchen
eb2ca06d67 fresh name fresh start 2014-12-01 09:17:05 -08:00
tqchen
8cef2086f5 smarter select for allreduce and bcast 2014-11-30 21:31:45 -08:00
tqchen
5b0bb53184 refactor code style, reset link still need thoughts 2014-11-29 20:15:27 -08:00
tqchen
42505f473d finish reset link log 2014-11-29 15:14:43 -08:00