add glm

2014-09-04 21:09:52 -07:00
parent f9f982a7aa
commit 512a0f69fd
4 changed files with 39 additions and 4 deletions
--- a/demo/README.md
+++ b/demo/README.md
@@ -8,10 +8,11 @@ This folder contains the all example codes using xgboost.
 Features Walkthrough
 ====
 This is a list of short codes introducing different functionalities of xgboost and its wrapper.
-* Basic walkthrough of wrappers. [python](guide-python/basic_walkthrough.py)
-* Cutomize loss function, and evaluation metric. [python](guide-python/custom_objective.py)
-* Boosting from existing prediction. [python](guide-python/boost_from_prediction.py)
-* Predicting using first n trees. [python](guide-python/predict_first_ntree.py)
+* Basic walkthrough of wrappers [python](guide-python/basic_walkthrough.py)
+* Customize loss function and evaluation metric [python](guide-python/custom_objective.py)
+* Boosting from existing prediction [python](guide-python/boost_from_prediction.py)
+* Predicting using first n trees [python](guide-python/predict_first_ntree.py)
+* Generalized Linear Model [python](guide-python/generalized_linear_model.py)
 * Cross validation [python](guide-python/cross_validation.py)

 Basic Examples by Tasks
--- a/demo/guide-python/README.md
+++ b/demo/guide-python/README.md
@@ -4,4 +4,5 @@ XGBoost Python Feature Walkthrough
 * [Cutomize loss function, and evaluation metric](custom_objective.py)
 * [Boosting from existing prediction](boost_from_prediction.py)
 * [Predicting using first n trees](predict_first_ntree.py)
+* [Generalized Linear Model](generalized_linear_model.py)
 * [Cross validation](cross_validation.py)
--- a/demo/guide-python/generalized_linear_model.py
+++ b/demo/guide-python/generalized_linear_model.py
@@ -0,0 +1,32 @@
+#!/usr/bin/python
+import sys
+sys.path.append('../../wrapper')
+import xgboost as xgb
+##
+#  this script demonstrates how to fit a generalized linear model in xgboost
+#  basically, we use a linear model instead of trees as the booster
+##
+dtrain = xgb.DMatrix('../data/agaricus.txt.train')
+dtest = xgb.DMatrix('../data/agaricus.txt.test')
+# change booster to gblinear, so that we are fitting a linear model
+# alpha is the L1 regularizer
+# lambda is the L2 regularizer
+# you can also set lambda_bias, which is the L2 regularizer on the bias term
+param = {'silent':1, 'objective':'binary:logistic', 'booster':'gblinear',
+         'alpha': 0.0001, 'lambda': 1 }
+
+# normally, you do not need to set eta (step_size)
+# XGBoost uses a parallel coordinate descent algorithm (shotgun);
+# in certain cases, parallelization can affect convergence
+# setting eta to a smaller value, e.g. 0.5, can make the optimization more stable
+# param['eta'] = 1
+
+##
+# the rest of the settings are the same
+##
+watchlist  = [(dtest,'eval'), (dtrain,'train')]
+num_round = 4
+bst = xgb.train(param, dtrain, num_round, watchlist)
+preds = bst.predict(dtest)
+labels = dtest.get_label()
+print('error=%f' % (sum(1 for i in range(len(preds)) if int(preds[i] > 0.5) != labels[i]) / float(len(preds))))
--- a/demo/guide-python/runall.sh
+++ b/demo/guide-python/runall.sh
@@ -2,5 +2,6 @@
 python basic_walkthrough.py
 python custom_objective.py
 python boost_from_prediction.py
+python generalized_linear_model.py
 python cross_validation.py
 rm -rf *~ *.model *.buffer
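
The last line of `generalized_linear_model.py` packs the error computation into a single expression. As a rough illustration of what it computes, here is a minimal standalone sketch that needs no xgboost install. The helper name `error_rate`, its `threshold` parameter, and the sample values are hypothetical and not part of this commit; the predictions are made up and are not output from the agaricus data.

```python
def error_rate(preds, labels, threshold=0.5):
    """Fraction of predictions whose thresholded class differs from the label.

    Hypothetical helper mirroring the one-line expression in the demo script:
    a prediction above `threshold` counts as class 1, otherwise class 0.
    """
    if len(preds) != len(labels):
        raise ValueError("preds and labels must have the same length")
    if len(preds) == 0:
        raise ValueError("cannot compute an error rate on empty input")
    wrong = sum(1 for p, y in zip(preds, labels) if int(p > threshold) != y)
    return wrong / float(len(preds))


# Made-up values for illustration only.
example_preds = [0.9, 0.2, 0.6, 0.4]
example_labels = [1, 0, 0, 1]
print('error=%f' % error_rate(example_preds, example_labels))
```

In the demo itself, the same helper could presumably be called as `error_rate(bst.predict(dtest), dtest.get_label())`, assuming both return sequences of equal length.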