Files

tqchen 75bf97b575 Squashed 'subtree/rabit/' changes from 091634b..59e63bc

59e63bc minor
6233050 ok
14477f9 add namenode
75a6d34 add libhdfs opts
e3c76bf minmum fix
8b3c435 chg
2035799 test code
7751b2b add debug
7690313 ok
bd346b4 ok
faba1dc add testload
6f7783e add testload
e5f0340 ok
3ed9ec8 chg
e552ac4 ask for more ram in am
b2505e3 only stop nm when sucess
bc696c9 add queue info
f3e867e add option queue
5dc843c refactor fileio
cd9c81b quick fix
1e23af2 add virtual destructor to iseekstream
f165ffb fix hdfs
8cc6508 allow demo to pass in env
fad4d69 ok
0fd6197 fix more
7423837 fix more
d25de54 add temporal solution, run_yarn_prog.py
e5a9e31 final attempt
ed3bee8 add command back
0774000 add hdfs to resource
9b66e7e fix hadoop
6812f14 ok
08e1c16 change hadoop prefix back to hadoop home
d6b6828 Update build.sh
146e069 bugfix: logical boundary for ring buffer
19cb685 ok
4cf3c13 Merge branch 'master' of ssh://github.com/tqchen/rabit
20daddb add tracker
c57dad8 add ringbased passing and batch schedule
295d8a1 update
994cb02 add sge
014c866 OK

git-subtree-dir: subtree/rabit
git-subtree-split: 59e63bc135

2015-03-21 00:44:31 -07:00

.gitignore

Squashed 'subtree/rabit/' changes from d4ec037..28ca7be

2015-03-09 13:28:38 -07:00

linear.cc

Squashed 'subtree/rabit/' changes from 091634b..59e63bc

2015-03-21 00:44:31 -07:00

linear.h

Squashed 'subtree/rabit/' changes from 091634b..59e63bc

2015-03-21 00:44:31 -07:00

Makefile

Squashed 'subtree/rabit/' changes from 091634b..59e63bc

2015-03-21 00:44:31 -07:00

README.md

Squashed 'subtree/rabit/' changes from d4ec037..28ca7be

2015-03-09 13:28:38 -07:00

run-hadoop-old.sh

Squashed 'subtree/rabit/' changes from d4ec037..28ca7be

2015-03-09 13:28:38 -07:00

run-linear-mock.sh

Squashed 'subtree/rabit/' changes from d4ec037..28ca7be

2015-03-09 13:28:38 -07:00

run-linear.sh

Squashed 'subtree/rabit/' changes from d4ec037..28ca7be

2015-03-09 13:28:38 -07:00

run-yarn.sh

Squashed 'subtree/rabit/' changes from 091634b..59e63bc

2015-03-21 00:44:31 -07:00

README.md

Linear and Logistic Regression

Input format: LibSVM
Local example: run-linear.sh
Running on YARN: run-yarn.sh
- You will need to have YARN available
- Modify ../make/config.mk to set USE_HDFS=1 to compile with HDFS support
- Run build.sh in ../../yarn to build the YARN jar file
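
For readers unfamiliar with the LibSVM input format mentioned above, each line holds a label followed by sparse index:value pairs. The following is a minimal Python sketch of reading one such line. The helper name parse_libsvm_line is hypothetical and is not part of this code base, and the sketch does not say anything about how the C++ reader here treats index bases or malformed lines.

```python
def parse_libsvm_line(line):
    """Parse one LibSVM-format line into (label, {feature_index: value}).

    Hypothetical helper for illustration only.
    """
    tokens = line.split()
    if not tokens:
        raise ValueError("empty line")
    # The first token is the label; the rest are sparse index:value pairs.
    label = float(tokens[0])
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return label, features


label, features = parse_libsvm_line("1 3:0.5 10:1.25 42:-2")
print(label, features)
```

Features that do not appear on a line are treated as zero, which is what makes the format compact for sparse data.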

Multi-Threading Optimization

The code supports multi-threading, and we encourage you to use it
- Simply add nthread=k, where k is the number of threads you want to use
If you submit with YARN
- Use --vcores and -mem to request CPU and memory resources
- Some YARN schedulers do not honor CPU requests; in that case you can request more memory to grab working slots
Multi-threading usually improves speed
- You can use fewer workers and assign more resources to each worker
- This usually means less communication overhead and a shorter running time

Parameters

All parameters can be set in the form param=value
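
To make the param=value convention concrete, here is a small Python sketch that splits such strings into names and values. It is illustrative only: parse_params is a hypothetical helper, and the README does not describe how the program itself parses its arguments.

```python
def parse_params(args):
    """Split 'name=value' strings into a dict; values are kept as strings.

    Hypothetical helper for illustration only.
    """
    params = {}
    for arg in args:
        name, sep, value = arg.partition("=")
        if not sep or not name:
            raise ValueError("expected param=value, got %r" % arg)
        params[name] = value
    return params


params = parse_params(["objective=linear", "reg_L2=0.1", "nthread=4"])
print(params)
```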

Important Parameters

objective [default = logistic]
- can be linear or logistic
base_score [default = 0.5]
- global bias; setting it to the mean value of the labels is recommended
reg_L1 [default = 0]
- L1 regularization coefficient
reg_L2 [default = 1]
- L2 regularization coefficient
lbfgs_stop_tol [default = 1e-5]
- relative tolerance level of loss reduction with respect to initial loss
max_lbfgs_iter [default = 500]
- maximum number of L-BFGS iterations
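
One way to read lbfgs_stop_tol and max_lbfgs_iter together is sketched below in Python. This is an assumption about how such a rule is commonly written, not a copy of the solver's code: the function should_stop and its argument names are hypothetical, and the min_iter argument mirrors the min_lbfgs_iter parameter from this README.

```python
def should_stop(iteration, init_loss, old_loss, new_loss,
                stop_tol=1e-5, min_iter=5, max_iter=500):
    """Decide whether to end the optimization loop (illustrative rule).

    iteration is the number of iterations completed so far.
    """
    # Always stop once the iteration budget is used up.
    if iteration >= max_iter:
        return True
    # Never stop on tolerance before the minimum number of iterations.
    if iteration < min_iter:
        return False
    # Stop when this step's loss reduction is small relative to the initial loss.
    return (old_loss - new_loss) < stop_tol * abs(init_loss)


print(should_stop(10, init_loss=100.0, old_loss=50.0, new_loss=49.9999))
```

Measuring the reduction against the initial loss, rather than the current loss, keeps the threshold fixed over the whole run.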

Optimization Related parameters

min_lbfgs_iter [default = 5]
- minimum number of L-BFGS iterations
max_linesearch_iter [default = 100]
- maximum number of iterations in line search
linesearch_c1 [default = 1e-4]
- c1 coefficient in backoff line search
linesearch_backoff [default = 0.5]
- backoff ratio in line search
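
The three line search parameters above fit the usual backtracking scheme with a sufficient-decrease (Armijo) test. The Python sketch below shows that textbook scheme under stated assumptions: the initial step of 1.0, the function name backtracking_line_search and the toy quadratic objective are all hypothetical choices for illustration, and the solver in this repository may differ in its details.

```python
def backtracking_line_search(f, x, direction, grad_dot_dir,
                             c1=1e-4, backoff=0.5, max_iter=100):
    """Shrink the step until the sufficient-decrease (Armijo) test passes.

    grad_dot_dir is the dot product of the gradient at x with direction;
    it must be negative for a descent direction.
    """
    f0 = f(x)
    step = 1.0  # assumed initial step size
    for _ in range(max_iter):
        candidate = [xi + step * di for xi, di in zip(x, direction)]
        # Accept the step if the loss dropped by at least c1 * step * |slope|.
        if f(candidate) <= f0 + c1 * step * grad_dot_dir:
            return step, candidate
        step *= backoff  # otherwise shrink the step by the backoff ratio
    raise RuntimeError("line search did not find an acceptable step")


def quadratic(x):
    return sum(v * v for v in x)


x = [3.0, -4.0]
grad = [2.0 * v for v in x]
direction = [-g for g in grad]
grad_dot_dir = sum(g * d for g, d in zip(grad, direction))
step, new_x = backtracking_line_search(quadratic, x, direction, grad_dot_dir)
print(step, new_x)
```

A smaller backoff ratio shrinks the step faster but may settle on a step that is shorter than necessary; a larger c1 demands more decrease before a step is accepted.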
