fixed typos and sentence structure
parent b1f489fd8b
commit 897180b2c6
@@ -1,8 +1,8 @@
 Notes on Parameter Tuning
 ====
-Parmaeter tuning is a dark art in machine learning, the optimal parameters
+Parameter tuning is a dark art in machine learning, the optimal parameters
 of a model can depend on many scenarios. So it is impossible to create a
-comprehensive guides for doing so.
+comprehensive guide for doing so.
 
 This document tries to provide some guideline for parameters in xgboost.
 
@@ -11,11 +11,11 @@ Understanding Bias-Variance Tradeoff
 ====
 If you take a machine learning or statistics course, this is likely to be one
 of the most important concepts.
-When we allow the model to get more complicated(e.g. more depth), the model
-have better ability to fit the training data, resulting a less biased model.
-However, such complicated more requires more data to fit.
+When we allow the model to get more complicated (e.g. more depth), the model
+has better ability to fit the training data, resulting in a less biased model.
+However, such complicated model requires more data to fit.
 
-Most of parameters in xgboost is about bias variance tradeoff. The best model
+Most of parameters in xgboost are about bias variance tradeoff. The best model
 should trade the model complexity with its predictive power carefully.
 [Parameters Documentation](parameter.md) will tell you whether each parameter
 will make the model more conservative or not. This can be used to help you
@@ -23,8 +23,7 @@ turn the knob between complicated model and simple model.
 
 Control Overfitting
 ====
-When you observe high training accuracy, but low tests accuracy.
-It is likely that you encounter overfitting problem.
+When you observe high training accuracy, but low tests accuracy, it is likely that you encounter overfitting problem.
 
 There are in general two ways that you can control overfitting in xgboost
 * The first way is to directly control model complexity
@@ -35,11 +34,11 @@ There are in general two ways that you can control overfitting in xgboost
 
 Handle Imbalanced Dataset
 ===
-For common caes such as ads clickthrough log. The dataset is extremely imbalanced.
+For common cases such as ads clickthrough log, the dataset is extremely imbalanced.
 This can affect the training of xgboost model, and there are two ways to improve it.
 * If you care only about the ranking order (AUC) of your prediction
-- Balance the positive and negative weight, via ```scale_pos_weight```
+- Balance the positive and negative weights, via ```scale_pos_weight```
 - Use AUC for evaluation
 * If you care about predicting the right probability
-- In such case, yuo cannot re-balance the dataset
-- In such case, set parameter ```max_delta_step``` to a finite number (say 1) will help convergence
+- In such a case, you cannot re-balance the dataset
+- In such a case, set parameter ```max_delta_step``` to a finite number (say 1) will help convergence
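
Editor's sketch (not part of the commit): the two imbalanced-dataset options in the hunk above could be turned into parameter dictionaries roughly as follows. The helper name `imbalance_params` and the toy label list are hypothetical. Using the negative-to-positive count ratio for ```scale_pos_weight``` is a common starting point rather than something this diff states, and `binary:logistic` is assumed as the objective. The block only builds plain dictionaries and does not import xgboost.

```python
def imbalance_params(labels, want_probabilities=False):
    """Build an illustrative xgboost parameter dict for a list of 0/1 labels.

    Hypothetical helper: it only assembles a plain dict and never calls xgboost.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    params = {"objective": "binary:logistic"}  # assumed objective
    if want_probabilities:
        # Leave the class balance untouched and cap each update step instead.
        params["max_delta_step"] = 1
    else:
        # Only the ranking order matters: re-weight positives, evaluate with AUC.
        params["scale_pos_weight"] = n_neg / n_pos
        params["eval_metric"] = "auc"
    return params


# Toy click log: 10 clicks among 1000 impressions (made-up numbers).
labels = [1] * 10 + [0] * 990
ranking_params = imbalance_params(labels)
probability_params = imbalance_params(labels, want_probabilities=True)
```

Either dictionary would then be passed as the parameter argument when training a booster. The ratio is only a first guess and is usually tuned further.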