Use bst_float consistently throughout (#1824)

* Fix various typos

* Add override to functions that are overridden

gcc gives warnings about functions that override a virtual function
without being marked as override. This fixes those warnings.

* Use bst_float consistently

Use bst_float for all variables that involve weights, leaf values,
gradients, hessians, gain, loss_chg, predictions, base_margin, and
feature values.

In some cases, where a value can grow larger through accumulation
(repeated additions and the like), double is used instead.

This keeps type conversions to a minimum and reduces loss of
precision.
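To make the rationale concrete, here is a small standalone numpy sketch (not code from this commit) of why accumulators are widened: single precision stops registering small increments once the running sum grows large.

```python
import numpy as np

# Standalone illustration (not from the commit): float32 -- the width of
# bst_float -- has a 24-bit mantissa, so once an accumulator reaches 2**24,
# adding 1.0 is rounded away entirely.
acc32 = np.float32(2**24)
print(acc32 + np.float32(1.0) == acc32)   # True: the increment is lost

# A float64 accumulator (the width of double) still represents the sum.
acc64 = np.float64(2**24)
print(acc64 + np.float64(1.0) == acc64)   # False: 16777217.0 is exact
```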
AbdealiJK authored on 2016-11-30 23:32:10 +05:30, committed by Tianqi Chen
parent da2556f58a, commit 6f16f0ef58
50 changed files with 392 additions and 389 deletions


@@ -6,7 +6,7 @@
 "source": [
 "# XGBoost Model Analysis\n",
 "\n",
-"This notebook can be used to load and anlysis model learnt from all xgboost bindings, including distributed training. "
+"This notebook can be used to load and analysis model learnt from all xgboost bindings, including distributed training. "
 ]
 },
 {


@@ -27,9 +27,9 @@ def logregobj(preds, dtrain):
 # user defined evaluation function, return a pair metric_name, result
 # NOTE: when you do customized loss function, the default prediction value is margin
-# this may make buildin evalution metric not function properly
+# this may make builtin evaluation metric not function properly
 # for example, we are doing logistic loss, the prediction is score before logistic transformation
-# the buildin evaluation error assumes input is after logistic transformation
+# the builtin evaluation error assumes input is after logistic transformation
 # Take this in mind when you use the customization, and maybe you need write customized evaluation function
 def evalerror(preds, dtrain):
     labels = dtrain.get_label()
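For context on what that hunk's comments describe, here is a hedged sketch (in the style of the snippet above; the sigmoid helper is added here for illustration) of a customized evaluation function that applies the logistic transformation itself, since with a custom objective the preds it receives are raw margins:

```python
import numpy as np

def sigmoid(x):
    # logistic transformation; with a custom objective, preds are raw margins
    return 1.0 / (1.0 + np.exp(-x))

def evalerror(preds, dtrain):
    # transform margins to probabilities before computing the error rate;
    # a builtin metric would wrongly treat the raw margins as probabilities
    labels = dtrain.get_label()
    prob = sigmoid(preds)
    return 'error', float(np.mean((prob > 0.5) != labels))
```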


@@ -44,7 +44,7 @@ param['nthread'] = 16
 plst = list(param.items())+[('eval_metric', 'ams@0.15')]
 watchlist = [ (xgmat,'train') ]
-# boost 120 tres
+# boost 120 trees
 num_round = 120
 print ('loading data end, start to boost trees')
 bst = xgb.train( plst, xgmat, num_round, watchlist );


@@ -42,7 +42,7 @@ param['nthread'] = 4
 plst = param.items()+[('eval_metric', 'ams@0.15')]
 watchlist = [ (xgmat,'train') ]
-# boost 10 tres
+# boost 10 trees
 num_round = 10
 print ('loading data end, start to boost trees')
 print ("training GBM from sklearn")


@@ -8,7 +8,7 @@ test = test[,-1]
 y = train[,ncol(train)]
 y = gsub('Class_','',y)
-y = as.integer(y)-1 #xgboost take features in [0,numOfClass)
+y = as.integer(y)-1 # xgboost take features in [0,numOfClass)
 x = rbind(train[,-ncol(train)],test)
 x = as.matrix(x)
@@ -22,7 +22,7 @@ param <- list("objective" = "multi:softprob",
 "num_class" = 9,
 "nthread" = 8)
-# Run Cross Valication
+# Run Cross Validation
 cv.nround = 50
 bst.cv = xgb.cv(param=param, data = x[trind,], label = y,
 nfold = 3, nrounds=cv.nround)


@@ -16,7 +16,7 @@ Introduction
 While XGBoost is known for its fast speed and accurate predictive power, it also comes with various functions to help you understand the model.
 The purpose of this RMarkdown document is to demonstrate how easily we can leverage the functions already implemented in **XGBoost R** package. Of course, everything showed below can be applied to the dataset you may have to manipulate at work or wherever!
-First we will prepare the **Otto** dataset and train a model, then we will generate two vizualisations to get a clue of what is important to the model, finally, we will see how we can leverage these information.
+First we will prepare the **Otto** dataset and train a model, then we will generate two visualisations to get a clue of what is important to the model, finally, we will see how we can leverage these information.
 Preparation of the data
 =======================