Use bst_float consistently throughout (#1824)

* Fix various typos

* Add override to functions that are overridden

gcc gives warnings about functions that override a virtual function
without being marked as override. This fixes those warnings.

* Use bst_float consistently

Use bst_float for all variables that involve weights, leaf values,
gradients, hessians, gain, loss_chg, predictions, base_margin, and
feature values.

In some cases, where a value can grow larger through accumulation
(repeated additions and the like), double is used instead.

This keeps type conversions to a minimum and reduces loss of
precision.
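To make the rationale concrete, here is a small standalone numpy sketch (not code from this commit) of why accumulators are widened: single precision stops registering small increments once the running sum grows large.

```python
import numpy as np

# Standalone illustration (not from the commit): float32 -- the width of
# bst_float -- has a 24-bit mantissa, so once an accumulator reaches 2**24,
# adding 1.0 is rounded away entirely.
acc32 = np.float32(2**24)
print(acc32 + np.float32(1.0) == acc32)   # True: the increment is lost

# A float64 accumulator (the width of double) still represents the sum.
acc64 = np.float64(2**24)
print(acc64 + np.float64(1.0) == acc64)   # False: 16777217.0 is exact
```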
AbdealiJK authored on 2016-11-30 23:32:10 +05:30, committed by Tianqi Chen
parent da2556f58a, commit 6f16f0ef58
50 changed files with 392 additions and 389 deletions


@@ -6,7 +6,7 @@
 "source": [
 "# XGBoost Model Analysis\n",
 "\n",
-"This notebook can be used to load and anlysis model learnt from all xgboost bindings, including distributed training. "
+"This notebook can be used to load and analysis model learnt from all xgboost bindings, including distributed training. "
 ]
 },
 {


@@ -27,9 +27,9 @@ def logregobj(preds, dtrain):
 # user defined evaluation function, return a pair metric_name, result
 # NOTE: when you do customized loss function, the default prediction value is margin
-# this may make buildin evalution metric not function properly
+# this may make builtin evaluation metric not function properly
 # for example, we are doing logistic loss, the prediction is score before logistic transformation
-# the buildin evaluation error assumes input is after logistic transformation
+# the builtin evaluation error assumes input is after logistic transformation
 # Take this in mind when you use the customization, and maybe you need write customized evaluation function
 def evalerror(preds, dtrain):
     labels = dtrain.get_label()
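For context on what that hunk's comments describe, here is a hedged sketch (in the style of the snippet above; the sigmoid helper is added here for illustration) of a customized evaluation function that applies the logistic transformation itself, since with a custom objective the preds it receives are raw margins:

```python
import numpy as np

def sigmoid(x):
    # logistic transformation; with a custom objective, preds are raw margins
    return 1.0 / (1.0 + np.exp(-x))

def evalerror(preds, dtrain):
    # transform margins to probabilities before computing the error rate;
    # a builtin metric would wrongly treat the raw margins as probabilities
    labels = dtrain.get_label()
    prob = sigmoid(preds)
    return 'error', float(np.mean((prob > 0.5) != labels))
```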


@@ -44,7 +44,7 @@ param['nthread'] = 16
 plst = list(param.items())+[('eval_metric', 'ams@0.15')]
 watchlist = [ (xgmat,'train') ]
-# boost 120 tres
+# boost 120 trees
 num_round = 120
 print ('loading data end, start to boost trees')
 bst = xgb.train( plst, xgmat, num_round, watchlist );


@@ -42,7 +42,7 @@ param['nthread'] = 4
 plst = param.items()+[('eval_metric', 'ams@0.15')]
 watchlist = [ (xgmat,'train') ]
-# boost 10 tres
+# boost 10 trees
 num_round = 10
 print ('loading data end, start to boost trees')
 print ("training GBM from sklearn")


@@ -8,7 +8,7 @@ test = test[,-1]
 y = train[,ncol(train)]
 y = gsub('Class_','',y)
-y = as.integer(y)-1 #xgboost take features in [0,numOfClass)
+y = as.integer(y)-1 # xgboost take features in [0,numOfClass)
 x = rbind(train[,-ncol(train)],test)
 x = as.matrix(x)
@@ -22,7 +22,7 @@ param <- list("objective" = "multi:softprob",
 "num_class" = 9,
 "nthread" = 8)
-# Run Cross Valication
+# Run Cross Validation
 cv.nround = 50
 bst.cv = xgb.cv(param=param, data = x[trind,], label = y,
 nfold = 3, nrounds=cv.nround)


@@ -16,7 +16,7 @@ Introduction
 While XGBoost is known for its fast speed and accurate predictive power, it also comes with various functions to help you understand the model.
 The purpose of this RMarkdown document is to demonstrate how easily we can leverage the functions already implemented in **XGBoost R** package. Of course, everything showed below can be applied to the dataset you may have to manipulate at work or wherever!
-First we will prepare the **Otto** dataset and train a model, then we will generate two vizualisations to get a clue of what is important to the model, finally, we will see how we can leverage these information.
+First we will prepare the **Otto** dataset and train a model, then we will generate two visualisations to get a clue of what is important to the model, finally, we will see how we can leverage these information.
 Preparation of the data
 =======================