fix documentation
parent 85739c537d
commit 75f205b0b1
@@ -8,30 +8,30 @@
 #' if data is a local data file or \code{xgb.DMatrix}.
 #' @param params the list of parameters.
 #'
-#' General Parameters
+#' 1. General Parameters
 #'
 #' \itemize{
-#' \item \code{booster} which booster to use, can be gbtree or gblinear. (default=gbtree)
-#' \item \code{silent} 0 (default) means printing running messages, 1 means silent mode.
+#' \item \code{booster} which booster to use, can be gbtree or gblinear. Default: gbtree
+#' \item \code{silent} 0 means printing running messages, 1 means silent mode. Default: 0
 #' \item \code{nthread} number of parallel threads used to run xgboost. Defaults to the maximum number of threads available if not set.
 #' \item \code{num_pbuffer} size of the prediction buffer, normally set to the number of training instances. The buffers are used to save the prediction results of the last boosting step. Default: set automatically by xgboost, no need to be set by the user.
 #' \item \code{num_feature} feature dimension used in boosting, set to the maximum dimension of the features. Default: set automatically by xgboost, no need to be set by the user.
 #' }
 #'
-#' Booster Parameters
+#' 2. Booster Parameters
 #'
-#' 1. Parameters for Tree Booster
+#' 2.1. Parameters for Tree Booster
 #'
 #' \itemize{
 #' \item \code{eta} step size shrinkage used in updates to prevent overfitting. After each boosting step, we can directly get the weights of new features, and eta shrinks the feature weights to make the boosting process more conservative. Default: 0.3
 #' \item \code{gamma} minimum loss reduction required to make a further partition on a leaf node of the tree. The larger, the more conservative the algorithm will be.
 #' \item \code{max_depth} maximum depth of a tree. Default: 6
 #' \item \code{min_child_weight} minimum sum of instance weight (hessian) needed in a child. If the tree partition step results in a leaf node with a sum of instance weight less than min_child_weight, the building process will give up further partitioning. In linear regression mode, this simply corresponds to the minimum number of instances needed in each node. The larger, the more conservative the algorithm will be. Default: 1
-#' \item \code{subsample} subsample ratio of the training instances. Setting it to 0.5 means that XGBoost randomly collects half of the data instances to grow trees, which will prevent overfitting. Default: 1.
-#' \item \code{colsample_bytree} subsample ratio of columns when constructing each tree. Default: 1.
+#' \item \code{subsample} subsample ratio of the training instances. Setting it to 0.5 means that XGBoost randomly collects half of the data instances to grow trees, which will prevent overfitting. Default: 1
+#' \item \code{colsample_bytree} subsample ratio of columns when constructing each tree. Default: 1
 #' }
 #'
-#' 2. Parameters for Linear Booster
+#' 2.2. Parameters for Linear Booster
 #'
 #' \itemize{
 #' \item \code{lambda} L2 regularization term on weights. Default: 0
@@ -39,8 +39,7 @@
 #' \item \code{alpha} L1 regularization term on weights. (There is no L1 regularization on the bias because it is not important.) Default: 0
 #' }
 #'
-#' Task Parameters
-#'
+#' 3. Task Parameters
 #'
 #' \itemize{
 #' \item \code{objective} specify the learning task and the corresponding learning objective. The objective options are listed below:
@@ -51,21 +50,21 @@
 #' \item \code{binary:logitraw} logistic regression for binary classification; outputs the score before the logistic transformation.
 #' \item \code{multi:softmax} set XGBoost to do multiclass classification using the softmax objective; you also need to set num_class (the number of classes).
 #' \item \code{multi:softprob} same as softmax, but outputs a vector of ndata * nclass, which can be further reshaped to an ndata x nclass matrix. The result contains the predicted probability of each data point belonging to each class.
-#' \item \code{rank:pairwise} set XGBoost to do a ranking task by minimizing the pairwise loss
+#' \item \code{rank:pairwise} set XGBoost to do a ranking task by minimizing the pairwise loss.
 #' }
 #' \item \code{base_score} the initial prediction score of all instances, global bias. Default: 0.5
 #' \item \code{eval_metric} evaluation metrics for validation data. A default metric will be assigned according to the objective (rmse for regression, error for classification, mean average precision for ranking). The choices are listed below:
 #' \itemize{
 #' \item \code{rmse} root mean square error. \url{http://en.wikipedia.org/wiki/Root_mean_square_error}
 #' \item \code{logloss} negative log-likelihood. \url{http://en.wikipedia.org/wiki/Log-likelihood}
-#' \item \code{error} binary classification error rate. It is calculated as (wrong cases)/(all cases). For the predictions, the evaluation regards instances with a prediction value larger than 0.5 as positive instances, and the others as negative instances.
-#' \item \code{merror} multiclass classification error rate. It is calculated as (wrong cases)/(all cases).
+#' \item \code{error} binary classification error rate. It is calculated as \code{(wrong cases) / (all cases)}. For the predictions, the evaluation regards instances with a prediction value larger than 0.5 as positive instances, and the others as negative instances.
+#' \item \code{merror} multiclass classification error rate. It is calculated as \code{(wrong cases) / (all cases)}.
 #' \item \code{auc} area under the curve. \url{http://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_curve} for ranking evaluation.
 #' \item \code{ndcg} Normalized Discounted Cumulative Gain. \url{http://en.wikipedia.org/wiki/NDCG}
 #' }
 #' \item \code{map} Mean Average Precision. \url{http://en.wikipedia.org/wiki/Mean_average_precision}
-#' \item \code{ndcg@n} and \code{map@n} n can be assigned as an integer to cut off the top positions in the lists for evaluation.
-#' \item \code{ndcg-}, \code{map-}, \code{ndcg@n-}, \code{map@n-} In XGBoost, NDCG and MAP evaluate the score of a list without any positive samples as 1. By adding "-" to the evaluation metric, XGBoost will evaluate these scores as 0 to be consistent under some conditions, such as when training repetitively.
+#' \item \code{ndcg@@n} and \code{map@@n} n can be assigned as an integer to cut off the top positions in the lists for evaluation.
+#' \item \code{ndcg-}, \code{map-}, \code{ndcg@@n-}, \code{map@@n-} In XGBoost, NDCG and MAP evaluate the score of a list without any positive samples as 1. By adding "-" to the evaluation metric, XGBoost will evaluate these scores as 0 to be consistent under some conditions, such as when training repetitively.
 #' \item \code{seed} random number seed. Default: 0
 #' }
 #'
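For reference, the three parameter groups documented in the hunks above are typically collected into one named list and handed to xgb.train. The sketch below is a minimal illustration under stated assumptions: the iris data stands in for a real dataset, and the specific values (max_depth = 4, subsample = 0.8, nrounds = 10) are arbitrary choices for the example, not recommendations from the documentation.

# A minimal sketch combining general, booster, and task parameters.
# Data setup (iris) and parameter values are illustrative assumptions.
library(xgboost)

x <- as.matrix(iris[, 1:4])
y <- as.integer(iris$Species) - 1          # class labels must start at 0
dtrain <- xgb.DMatrix(data = x, label = y)

param <- list(booster = "gbtree",          # 1. general parameter
              eta = 0.3,                   # 2.1. tree booster: shrinkage
              max_depth = 4,               # 2.1. tree booster: depth
              subsample = 0.8,             # 2.1. row subsampling
              objective = "multi:softmax", # 3. task parameter
              num_class = 3,               # required by multi:softmax
              eval_metric = "merror",      # multiclass error rate
              seed = 0)                    # random number seed

bst <- xgb.train(params = param, data = dtrain, nrounds = 10,
                 watchlist = list(train = dtrain))
pred <- predict(bst, x)                    # class indices 0, 1, 2

With the watchlist supplied, the chosen eval_metric (train-merror here) is printed after each boosting round, which is how the metric choices above are usually exercised.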
@@ -40,7 +40,6 @@ The content of the \code{data.table} is organised that way:
 \item \code{Cover}: metric to measure the number of observations affected by the split;
-\item \code{Tree}: ID of the tree. It is included in the main ID;
 \item \code{Yes.X} or \code{No.X}: data related to the pointer in the \code{Yes} or \code{No} column;
 \item \code{Included}: \code{boolean} value which indicates whether this feature was reached through a Yes branch (\code{True}) or a No branch (\code{False}). By convention the stem feature is always included;
 }
 }
 \examples{
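A short sketch of how the columns described in this hunk can be inspected in practice. It assumes a recent xgb.model.dt.tree() signature taking feature names plus a fitted model; older versions of the function took a dump file instead, so treat the exact call as an assumption.

# Hedged sketch: turn a fitted model into the documented data.table.
library(xgboost)
library(data.table)
data(agaricus.train, package = "xgboost")

bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label,
               max_depth = 2, eta = 1, nrounds = 2,
               objective = "binary:logistic")

dt <- xgb.model.dt.tree(feature_names = colnames(agaricus.train$data),
                        model = bst)
head(dt[, c("Tree", "Feature", "Gain", "Cover")])  # documented columns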
@@ -17,20 +17,67 @@ if data is local data file or \code{xgb.DMatrix}.}
 \item{missing}{Missing is only used when the input is a dense matrix; pick a float
 value that represents missing values. Sometimes a dataset uses 0 or another extreme value to represent missing values.}
 
-\item{params}{the list of parameters. Commonly used ones are:
+\item{params}{the list of parameters.
+
+1. General Parameters
+
 \itemize{
-\item \code{objective} objective function, common ones are
-\itemize{
-\item \code{reg:linear} linear regression
-\item \code{binary:logistic} logistic regression for classification
-}
-\item \code{eta} step size of each boosting step
-\item \code{max.depth} maximum depth of the tree
-\item \code{nthread} number of thread used in training, if not set, all threads are used
+\item \code{booster} which booster to use, can be gbtree or gblinear. Default: gbtree
+\item \code{silent} 0 means printing running messages, 1 means silent mode. Default: 0
+\item \code{nthread} number of parallel threads used to run xgboost. Defaults to the maximum number of threads available if not set.
+\item \code{num_pbuffer} size of the prediction buffer, normally set to the number of training instances. The buffers are used to save the prediction results of the last boosting step. Default: set automatically by xgboost, no need to be set by the user.
+\item \code{num_feature} feature dimension used in boosting, set to the maximum dimension of the features. Default: set automatically by xgboost, no need to be set by the user.
 }
 
-See \url{https://github.com/tqchen/xgboost/wiki/Parameters} for
-further details. See also demo/ for walkthrough example in R.}
+2. Booster Parameters
+
+2.1. Parameters for Tree Booster
+
+\itemize{
+\item \code{eta} step size shrinkage used in updates to prevent overfitting. After each boosting step, we can directly get the weights of new features, and eta shrinks the feature weights to make the boosting process more conservative. Default: 0.3
+\item \code{gamma} minimum loss reduction required to make a further partition on a leaf node of the tree. The larger, the more conservative the algorithm will be.
+\item \code{max_depth} maximum depth of a tree. Default: 6
+\item \code{min_child_weight} minimum sum of instance weight (hessian) needed in a child. If the tree partition step results in a leaf node with a sum of instance weight less than min_child_weight, the building process will give up further partitioning. In linear regression mode, this simply corresponds to the minimum number of instances needed in each node. The larger, the more conservative the algorithm will be. Default: 1
+\item \code{subsample} subsample ratio of the training instances. Setting it to 0.5 means that XGBoost randomly collects half of the data instances to grow trees, which will prevent overfitting. Default: 1
+\item \code{colsample_bytree} subsample ratio of columns when constructing each tree. Default: 1
+}
+
+2.2. Parameters for Linear Booster
+
+\itemize{
+\item \code{lambda} L2 regularization term on weights. Default: 0
+\item \code{lambda_bias} L2 regularization term on bias. Default: 0
+\item \code{alpha} L1 regularization term on weights. (There is no L1 regularization on the bias because it is not important.) Default: 0
+}
+
+3. Task Parameters
+
+\itemize{
+\item \code{objective} specify the learning task and the corresponding learning objective. The objective options are listed below:
+\itemize{
+\item \code{reg:linear} linear regression (Default).
+\item \code{reg:logistic} logistic regression.
+\item \code{binary:logistic} logistic regression for binary classification; outputs probability.
+\item \code{binary:logitraw} logistic regression for binary classification; outputs the score before the logistic transformation.
+\item \code{multi:softmax} set XGBoost to do multiclass classification using the softmax objective; you also need to set num_class (the number of classes).
+\item \code{multi:softprob} same as softmax, but outputs a vector of ndata * nclass, which can be further reshaped to an ndata x nclass matrix. The result contains the predicted probability of each data point belonging to each class.
+\item \code{rank:pairwise} set XGBoost to do a ranking task by minimizing the pairwise loss.
+}
+\item \code{base_score} the initial prediction score of all instances, global bias. Default: 0.5
+\item \code{eval_metric} evaluation metrics for validation data. A default metric will be assigned according to the objective (rmse for regression, error for classification, mean average precision for ranking). The choices are listed below:
+\itemize{
+\item \code{rmse} root mean square error. \url{http://en.wikipedia.org/wiki/Root_mean_square_error}
+\item \code{logloss} negative log-likelihood. \url{http://en.wikipedia.org/wiki/Log-likelihood}
+\item \code{error} binary classification error rate. It is calculated as \code{(wrong cases) / (all cases)}. For the predictions, the evaluation regards instances with a prediction value larger than 0.5 as positive instances, and the others as negative instances.
+\item \code{merror} multiclass classification error rate. It is calculated as \code{(wrong cases) / (all cases)}.
+\item \code{auc} area under the curve. \url{http://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_curve} for ranking evaluation.
+\item \code{ndcg} Normalized Discounted Cumulative Gain. \url{http://en.wikipedia.org/wiki/NDCG}
+}
+\item \code{map} Mean Average Precision. \url{http://en.wikipedia.org/wiki/Mean_average_precision}
+\item \code{ndcg@n} and \code{map@n} n can be assigned as an integer to cut off the top positions in the lists for evaluation.
+\item \code{ndcg-}, \code{map-}, \code{ndcg@n-}, \code{map@n-} In XGBoost, NDCG and MAP evaluate the score of a list without any positive samples as 1. By adding "-" to the evaluation metric, XGBoost will evaluate these scores as 0 to be consistent under some conditions, such as when training repetitively.
+\item \code{seed} random number seed. Default: 0
+}}
 
 \item{nrounds}{the maximum number of iterations}
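Finally, a sketch of the wrapper interface this .Rd file documents, tying together data, missing, params, and nrounds. The synthetic matrix and the -999 sentinel are assumptions made up for the example.

# Hedged sketch: dense input where an extreme sentinel marks missing
# values, declared through the `missing` argument.
library(xgboost)
set.seed(0)

x <- matrix(rnorm(400), ncol = 4)
x[sample(length(x), 20)] <- -999       # sentinel standing in for NA
y <- rbinom(nrow(x), 1, 0.5)           # synthetic binary labels

bst <- xgboost(data = x, label = y, missing = -999,
               params = list(objective = "binary:logistic",
                             eta = 0.3, max_depth = 3),
               nrounds = 5)            # the maximum number of iterations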