diff --git a/doc/parameter.md b/doc/parameter.md
index a549c25c9..77d42678e 100644
--- a/doc/parameter.md
+++ b/doc/parameter.md
@@ -54,22 +54,22 @@ Parameters for Tree Booster
 * alpha [default=0]
   - L1 regularization term on weights, increase this value will make model more conservative.
 * tree_method, string [default='auto']
-  - The tree constructtion algorithm used in XGBoost(see description in the [reference paper](http://arxiv.org/abs/1603.02754))
+  - The tree construction algorithm used in XGBoost (see description in the [reference paper](http://arxiv.org/abs/1603.02754))
   - Distributed and external memory version only support approximate algorithm.
   - Choices: {'auto', 'exact', 'approx'}
     - 'auto': Use heuristic to choose faster one.
       - For small to medium dataset, exact greedy will be used.
-      - For very large-dataset, approximate algorithm will be choosed.
+      - For very large datasets, the approximate algorithm will be chosen.
       - Because old behavior is always use exact greedy in single machine,
-        user will get a message when approximate algorithm is choosed to notify this choice.
+        the user will get a message when the approximate algorithm is chosen to notify this choice.
     - 'exact': Exact greedy algorithm.
     - 'approx': Approximate greedy algorithm using sketching and histogram.
 * sketch_eps, [default=0.03]
   - This is only used for approximate greedy algorithm.
   - This roughly translated into ```O(1 / sketch_eps)``` number of bins.
-    Compared to directly select number of bins, this comes with theoretical ganrantee with sketch accuracy.
-  - Usuaully user do not have to tune this.
-    but consider set to lower number for more accurate enumeration.
+    Compared to directly selecting the number of bins, this comes with a theoretical guarantee of sketch accuracy.
+  - Usually the user does not have to tune this,
+    but consider setting it to a lower number for more accurate enumeration.
   - range: (0, 1)
 * scale_pos_weight, [default=0]
   - Control the balance of positive and negative weights, useful for unbalanced classes. A typical value to consider: sum(negative cases) / sum(positive cases) See [Parameters Tuning](how_to/param_tuning.md) for more discussion. Also see Higgs Kaggle competition demo for examples: [R](../demo/kaggle-higgs/higgs-train.R ), [py1](../demo/kaggle-higgs/higgs-numpy.py ), [py2](../demo/kaggle-higgs/higgs-cv.py ), [py3](../demo/guide-python/cross_validation.py)
@@ -121,7 +121,7 @@ Specify the learning task and the corresponding learning objective. The objectiv
   - "reg:gamma" --gamma regression for severity data, output mean of gamma distribution
 * base_score [ default=0.5 ]
   - the initial prediction score of all instances, global bias
-  - for sufficent number of iterations, changing this value will not have too much effect.
+  - for a sufficient number of iterations, changing this value will not have too much effect.
 * eval_metric [ default according to objective ]
   - evaluation metrics for validation data, a default metric will be assigned according to objective( rmse for regression, and error for classification, mean average precision for ranking )
   - User can add multiple evaluation metrics, for python user, remember to pass the metrics in as list of parameters pairs instead of map, so that latter 'eval_metric' won't override previous one
@@ -137,7 +137,7 @@ Specify the learning task and the corresponding learning objective. The objectiv
   - "map":[Mean average precision](http://en.wikipedia.org/wiki/Mean_average_precision#Mean_average_precision)
   - "ndcg@n","map@n": n can be assigned as an integer to cut off the top positions in the lists for evaluation.
   - "ndcg-","map-","ndcg@n-","map@n-": In XGBoost, NDCG and MAP will evaluate the score of a list without any positive samples as 1. By adding "-" in the evaluation metric XGBoost will evaluate these score as 0 to be consistent under some conditions.
-training repeatively
+training repeatedly
   - "gamma-deviance": [residual deviance for gamma regression]
 * seed [ default=0 ]
   - random number seed.
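
For context on how the parameters touched by this patch fit together, here is a minimal Python sketch (not part of the diff above) showing `tree_method`, `sketch_eps`, `scale_pos_weight`, `eval_metric`, and `seed` being passed to `xgboost.train`. The toy data, the choice of `'approx'`, and the specific metrics are illustrative placeholders; only the parameter names and the list-of-pairs idiom for repeated `eval_metric` entries come from the documentation itself.

```python
import numpy as np
import xgboost as xgb

# Toy binary-classification data (placeholder; any numeric feature matrix works).
rng = np.random.RandomState(0)
X = rng.rand(500, 10)
y = rng.randint(2, size=500)
dtrain = xgb.DMatrix(X, label=y)

# Pass the parameters as a list of pairs rather than a dict, so that the
# second 'eval_metric' entry does not override the first one, as the doc advises.
params = [
    ('objective', 'binary:logistic'),
    ('tree_method', 'approx'),   # force the approximate algorithm so sketch_eps applies
    ('sketch_eps', 0.03),        # roughly O(1 / sketch_eps) bins; the default is usually fine
    # scale_pos_weight ~ sum(negative cases) / sum(positive cases)
    ('scale_pos_weight', float((y == 0).sum()) / max((y == 1).sum(), 1)),
    ('eval_metric', 'error'),
    ('eval_metric', 'auc'),
    ('seed', 0),
]

bst = xgb.train(params, dtrain, num_boost_round=20, evals=[(dtrain, 'train')])
```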