add Dart tutorial
parent f14c160f4f
commit c332eb5a2b
@@ -1 +1 @@
-Subproject commit 9fd3b48462a7a651e12a197679f71e043dcb25a2
+Subproject commit c39001019e443c7a061789bd1180f58ce85fc3e6
@@ -13,8 +13,7 @@ In R-package, you can use .(dot) to replace under score in the parameters, for e
 General Parameters
 ------------------
 * booster [default=gbtree]
-  - which booster to use, can be gbtree, gblinear or dart.
-    gbtree and dart use tree based model while gblinear uses linear function.
+  - which booster to use, can be gbtree, gblinear or dart. gbtree and dart use tree based model while gblinear uses linear function.
 * silent [default=0]
   - 0 means printing running messages, 1 means silent mode.
 * nthread [default to maximum number of threads available if not set]
@@ -81,20 +80,20 @@ Additional parameters for Dart Booster
   - type of sampling algorithm.
     - "uniform": dropped trees are selected uniformly.
     - "weighted": dropped trees are selected in proportion to weight.
-* normalize_type [default="tree]
+* normalize_type [default="tree"]
   - type of normalization algorithm.
-  - "tree": New trees have the same weight of each of dropped trees.
-    weight of new trees are 1 / (k + learning_rate)
-    dropped trees are scaled by a factor of k / (k + learning_rate)
-  - "forest": New trees have the same weight of sum of dropped trees (forest).
-    weight of new trees are 1 / (1 + learning_rate)
-    dropped trees are scaled by a factor of 1 / (1 + learning_rate)
+  - "tree": new trees have the same weight as each of the dropped trees.
+    - weight of new trees is 1 / (k + learning_rate)
+    - dropped trees are scaled by a factor of k / (k + learning_rate)
+  - "forest": new trees have the same weight as the sum of the dropped trees (forest).
+    - weight of new trees is 1 / (1 + learning_rate)
+    - dropped trees are scaled by a factor of 1 / (1 + learning_rate)
 * rate_drop [default=0.0]
   - dropout rate.
   - range: [0.0, 1.0]
 * skip_drop [default=0.0]
   - probability of skipping dropout.
-  If a dropout is skipped, new trees are added in the same manner as gbtree.
+    - If a dropout is skipped, new trees are added in the same manner as gbtree.
   - range: [0.0, 1.0]
 
 Parameters for Linear Booster
doc/tutorials/dart.md (new file, 101 lines)
@@ -0,0 +1,101 @@
DART booster
============
[XGBoost](https://github.com/dmlc/xgboost) mostly combines a huge number of regression trees with a small learning rate.
In this situation, trees added early are significant and trees added late are unimportant.

Vinayak and Gilad-Bachrach proposed a new method that adds dropout techniques from the deep neural net community to boosted trees, and reported better results in some situations.

This is an introduction to the new tree booster `dart`.

Original paper
--------------
Rashmi Korlakai Vinayak, Ran Gilad-Bachrach. "DART: Dropouts meet Multiple Additive Regression Trees." [JMLR](http://www.jmlr.org/proceedings/papers/v38/korlakaivinayak15.pdf)

Features
--------
- Drop trees in order to solve the over-fitting.
  - Trivial trees (to correct trivial errors) may be prevented.

Because of the randomness introduced in the training, expect the following few differences.
- Training can be slower than `gbtree` because the random dropout prevents use of the prediction buffer.
- The early stop might not be stable, due to the randomness.

How it works
------------
- In the ``$ m $``-th training round, suppose ``$ k $`` trees are selected to be dropped.
- Let ``$ D = \sum_{i \in \mathbf{K}} F_i $`` be the leaf scores of the dropped trees and ``$ F_m = \eta \tilde{F}_m $`` be the leaf scores of a new tree.
- The objective function is the following:

```math
\mathrm{Obj}
= \sum_{j=1}^n L \left( y_j, \hat{y}_j^{m-1} - D_j + \tilde{F}_m \right)
+ \Omega \left( \tilde{F}_m \right) .
```

- ``$ D $`` and ``$ F_m $`` are overshooting, so a scale factor is used:

```math
\hat{y}_j^m = \sum_{i \not\in \mathbf{K}} F_i + a \left( \sum_{i \in \mathbf{K}} F_i + b F_m \right) .
```

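To make the update above concrete, the following is a small hand-worked sketch in plain Python. It is not xgboost's internal code; every variable name is invented for the illustration, and it only traces the arithmetic of a single round under the `tree` normalization described below (``$ b = 1/k $``, ``$ a = k / (k + \eta) $``).

```python
# Toy, single-data-point sketch of one DART round (illustration only,
# not xgboost internals; all names here are made up for the example).
eta = 0.1                                      # learning rate
tree_scores = [0.30, 0.25, 0.15, 0.40, 0.10]   # leaf scores F_i of the existing trees
drop_idx = {0, 2}                              # suppose these k = 2 trees were sampled for dropping

k = len(drop_idx)
D = sum(tree_scores[i] for i in drop_idx)                                  # score of the dropped group
kept_sum = sum(s for i, s in enumerate(tree_scores) if i not in drop_idx)  # untouched trees

# The new tree is fitted against the residual left by removing D; for the
# sketch we simply pretend its leaf score roughly matches D.
F_tilde = D
F_m = eta * F_tilde

# "tree" normalization: the new tree is weighted by 1/k and the dropped
# group plus the new tree are rescaled by a = k / (k + eta).
a = k / (k + eta)
prediction = kept_sum + a * (D + F_m / k)
print(prediction)   # equals kept_sum + D here, i.e. no overshoot
```

Because `F_tilde` was set equal to `D`, the rescaled group contributes exactly `D`, which is the behaviour the scale factor is meant to guarantee.
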
Parameters
----------
### booster
* `dart`

This booster inherits `gbtree`, so `dart` also has `eta`, `gamma`, `max_depth` and so on.

Additional parameters are noted below.

### sample_type
type of sampling algorithm.
* `uniform`: (default) dropped trees are selected uniformly.
* `weighted`: dropped trees are selected in proportion to weight.

### normalize_type
type of normalization algorithm.
* `tree`: (default) New trees have the same weight as each of the dropped trees.

```math
a \left( \sum_{i \in \mathbf{K}} F_i + \frac{1}{k} F_m \right)
&= a \left( \sum_{i \in \mathbf{K}} F_i + \frac{\eta}{k} \tilde{F}_m \right) \\
&\sim a \left( 1 + \frac{\eta}{k} \right) D \\
&= a \frac{k + \eta}{k} D = D , \\
&\quad a = \frac{k}{k + \eta} .
```

* `forest`: New trees have the same weight as the sum of the dropped trees (forest).

```math
a \left( \sum_{i \in \mathbf{K}} F_i + F_m \right)
&= a \left( \sum_{i \in \mathbf{K}} F_i + \eta \tilde{F}_m \right) \\
&\sim a \left( 1 + \eta \right) D \\
&= a (1 + \eta) D = D , \\
&\quad a = \frac{1}{1 + \eta} .
```

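As a quick numeric check on the two derivations above (plain Python, not library code; the values are arbitrary), both choices of ``$ a $`` keep the combined contribution of the dropped group and the new tree at roughly ``$ D $``, under the same assumption used in the derivations that ``$ \tilde{F}_m $`` is about the size of ``$ D $``.

```python
# Compare the "tree" and "forest" normalizations for one dropped group.
# Arbitrary illustrative values, not part of the xgboost API.
eta, k, D = 0.1, 3, 0.70
F_tilde = D                      # assumption from the derivations above

# "tree": new tree weighted by 1/k, group rescaled by a = k / (k + eta)
tree_total = (k / (k + eta)) * (D + (eta / k) * F_tilde)

# "forest": new tree weighted like the whole group, rescaled by a = 1 / (1 + eta)
forest_total = (1.0 / (1.0 + eta)) * (D + eta * F_tilde)

print(tree_total, forest_total)  # both come out to D = 0.70 (up to rounding)
```
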
### rate_drop
dropout rate.
- range: [0.0, 1.0]

### skip_drop
probability of skipping dropout.
- If a dropout is skipped, new trees are added in the same manner as gbtree.
- range: [0.0, 1.0]

Sample Script
-------------
```python
import xgboost as xgb
# read in data
dtrain = xgb.DMatrix('demo/data/agaricus.txt.train')
dtest = xgb.DMatrix('demo/data/agaricus.txt.test')
# specify parameters via map
param = {'booster': 'dart',
         'max_depth': 5, 'learning_rate': 0.1,
         'objective': 'binary:logistic', 'silent': True,
         'sample_type': 'uniform',
         'normalize_type': 'tree',
         'rate_drop': 0.1,
         'skip_drop': 0.5}
num_round = 50
bst = xgb.train(param, dtrain, num_round)
# make prediction
# ntree_limit must not be 0
preds = bst.predict(dtest, ntree_limit=num_round)
```
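The Features section above warns that early stopping can be unstable with `dart` because each round drops a random subset of trees. A minimal sketch of using it anyway is shown below; it assumes the `param`, `dtrain` and `dtest` objects from the sample script, and relies only on the standard `evals` / `early_stopping_rounds` arguments of `xgb.train`.

```python
# Early stopping with dart: the evaluation metric is noisier than with
# gbtree, so the chosen stopping round may differ between runs.
watchlist = [(dtrain, 'train'), (dtest, 'eval')]
num_round = 200
bst = xgb.train(param, dtrain, num_round,
                evals=watchlist, early_stopping_rounds=10)

# As in the sample above, pass an explicit non-zero ntree_limit when predicting.
preds = bst.predict(dtest, ntree_limit=num_round)
```
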
@@ -6,3 +6,4 @@ See [Awesome XGBoost](https://github.com/dmlc/xgboost/tree/master/demo) for link
 ## Contents
 - [Introduction to Boosted Trees](../model.md)
 - [Distributed XGBoost YARN on AWS](aws_yarn.md)
+- [DART booster](dart.md)