128 Commits

Author SHA1 Message Date
tqchen
6b18ee9edb Merge branch 'master' of ssh://github.com/tqchen/rabit 2014-12-18 19:02:05 -08:00
tqchen
c8faed0b54 pass local model recover test 2014-12-18 18:53:58 -08:00
tqchen
dbd05a65b5 nice fix, start check local check 2014-12-18 18:39:24 -08:00
Tianqi Chen
31403a41cd Update rabit.h 2014-12-09 21:03:41 -08:00
tqchen
3f22596e3c check in license 2014-12-09 20:57:54 -08:00
tqchen
cc5efb8d81 Merge branch 'master' of ssh://github.com/tqchen/rabit 2014-12-09 20:56:33 -08:00
root
5aff7fab29 adding : 2014-12-08 17:15:49 +00:00
root
dfb3961eea changing port 2014-12-08 17:13:42 +00:00
Tianqi Chen
39f2dcdfef Update rabit_tracker.py 2014-12-08 08:36:55 -08:00
tqchen
2750679270 normal state running ok 2014-12-07 20:57:29 -08:00
tqchen
b38fa40fa6 fix ring passing 2014-12-07 20:25:42 -08:00
tqchen
8d570b54c7 add code to help link reuse, start test numreplica 2014-12-07 16:22:02 -08:00
tqchen
e2adce1cc1 add ring setup version 2014-12-07 16:09:28 -08:00
tqchen
322e40c72e Merge branch 'master' of ssh://github.com/tqchen/rabit 2014-12-06 23:00:18 -08:00
tqchen
328cf187ba check in the ring passing 2014-12-06 23:00:10 -08:00
nachocano
20b03e781c to run all executables 2014-12-06 15:37:09 -08:00
nachocano
fcf2f0a03d to stderr 2014-12-06 15:22:29 -08:00
nachocano
cd8ab469ff Merge branch 'master' of https://github.com/tqchen/allreduce 2014-12-06 15:14:19 -08:00
nachocano
659b9cd517 changing number of repetitions 2014-12-06 15:14:14 -08:00
root
52d472c209 using hostfile 2014-12-06 20:30:35 +00:00
nachocano
9ed59e71f6 speed runner 2014-12-06 12:09:40 -08:00
nachocano
e0053c62e1 adding executable 2014-12-06 12:05:08 -08:00
nachocano
8f0d7d1d3e changing to -ho not to conflict with help 2014-12-06 12:01:05 -08:00
nachocano
771891491c Merge branch 'master' of https://github.com/tqchen/allreduce 2014-12-06 11:59:22 -08:00
nachocano
f203d13efc speed runner 2014-12-06 11:59:16 -08:00
nachocano
14e400226a submit mpi to include machine file 2014-12-06 11:33:05 -08:00
tqchen
58f80c5675 Merge branch 'master' of ssh://github.com/tqchen/rabit 2014-12-06 11:25:18 -08:00
tqchen
4a7d84e861 chg string bcast 2014-12-06 11:25:08 -08:00
tqchen
1519f74f3c ok 2014-12-06 11:20:52 -08:00
tqchen
0e012cb05e add speed test 2014-12-06 11:05:24 -08:00
tqchen
19631ecef6 more tracker renaming 2014-12-06 09:24:12 -08:00
tqchen
a569bf2698 change gitignore 2014-12-06 09:19:08 -08:00
tqchen
dc12958fc7 rename master to tracker, to emphasie rabit is p2p in computing 2014-12-06 09:15:31 -08:00
nachocano
67b68ceae6 adding timing 2014-12-05 16:00:47 -08:00
nachocano
54eb5623cb worked on my machine !!! finally 2014-12-05 15:24:00 -08:00
nachocano
d9c22e54de closer, but still does not work... stays in map 100%. I think an exception is being thrown 2014-12-05 13:28:42 -08:00
tqchen
7765e2dc55 add status report 2014-12-05 09:49:26 -08:00
tqchen
ab278513ab ok 2014-12-05 09:39:51 -08:00
Tianqi Chen
e7a22792ac Update submit_job_hadoop.py 2014-12-05 09:14:44 -08:00
Tianqi Chen
e05098cacb Update submit_job_hadoop.py 2014-12-05 09:10:26 -08:00
Tianqi Chen
f9e95ab522 Update submit_job_hadoop.py 2014-12-05 09:09:20 -08:00
nachocano
bb7d6814a7 creating initial version of hadoop submit script. Not working. Not sure how to get the master uri and port. I believe I cannot do it before I launch the job. Updating the name from submit_job to submit_job_mpi 2014-12-05 03:27:02 -08:00
nachocano
e00fb99e7b cosmetic 2014-12-04 19:02:11 -08:00
nachocano
e9a3f5169e cosmetic changes 2014-12-04 18:02:07 -08:00
tqchen
1af3e81ada chg robust to reliable 2014-12-04 17:32:22 -08:00
tqchen
7cd5474f1a chg interface 2014-12-04 17:31:40 -08:00
tqchen
821eb21ae2 before make rabit public 2014-12-04 17:30:58 -08:00
tqchen
cc410b8c90 add local model in checkpoint interface, a new goal 2014-12-04 11:09:15 -08:00
tqchen
79e7862583 change note 2014-12-04 09:09:56 -08:00
tqchen
f9d634ce06 change notes 2014-12-04 09:09:29 -08:00