To submit to CRAN we cannot use more than 2 threads in our examples/vignettes
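
A minimal sketch of the same idea, for anyone who would rather compute the cap than hard-code it. `example_nthread` is a hypothetical helper, not part of xgboost. The limit of 2 is the one stated in this commit message, and the sketch assumes the base `parallel` package is available.

```r
# Hypothetical helper (not part of xgboost): pick a thread count for
# examples/vignettes that never exceeds the given limit.
example_nthread <- function(limit = 2L) {
  available <- parallel::detectCores()
  # detectCores() can return NA on some platforms; fall back to one thread.
  if (is.na(available)) available <- 1L
  as.integer(min(limit, available))
}

nthread <- example_nthread()
# The value would then be passed on as in the vignettes, e.g.
# xgboost(data = ..., label = ..., nthread = nthread, ...)
```

The vignettes in this commit simply pass `nthread = 2` directly, which is easier to read in documentation.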

2015-03-03 00:21:24 -08:00
parent 87ec48c1d3
commit 41b080e35f
36 changed files with 61 additions and 59 deletions
--- a/R-package/vignettes/discoverYourData.Rmd
+++ b/R-package/vignettes/discoverYourData.Rmd
@@ -153,7 +153,7 @@ The code below is very usual. For more information, you can look at the document

 ```{r}
 bst <- xgboost(data = sparse_matrix, label = output_vector, max.depth = 4,
-               eta = 1, nround = 10,objective = "binary:logistic")
+               eta = 1, nthread = 2, nround = 10,objective = "binary:logistic")

 ```

--- a/R-package/vignettes/xgboostPresentation.Rmd
+++ b/R-package/vignettes/xgboostPresentation.Rmd
@@ -141,10 +141,11 @@ We will train decision tree model using the following parameters:

 * `objective = "binary:logistic"`: we will train a binary classification model ;
 * `max.depth = 2`: the trees won't be deep, because our case is very simple ;
+* `nthread = 2`: the number of CPU threads we are going to use;
 * `nround = 2`: there will be two passes on the data, the second one will enhance the model by further reducing the difference between ground truth and prediction.

 ```{r trainingSparse, message=F, warning=F}
-bstSparse <- xgboost(data = train$data, label = train$label, max.depth = 2, eta = 1, nround = 2, objective = "binary:logistic")
+bstSparse <- xgboost(data = train$data, label = train$label, max.depth = 2, eta = 1, nthread = 2, nround = 2, objective = "binary:logistic")
 ```

 > The more complex the relationship between your features and your `label`, the more passes you need.
@@ -156,7 +157,7 @@ bstSparse <- xgboost(data = train$data, label = train$label, max.depth = 2, eta
 Alternatively, you can put your dataset in a *dense* matrix, i.e. a basic **R** matrix.

 ```{r trainingDense, message=F, warning=F}
-bstDense <- xgboost(data = as.matrix(train$data), label = train$label, max.depth = 2, eta = 1, nround = 2, objective = "binary:logistic")
+bstDense <- xgboost(data = as.matrix(train$data), label = train$label, max.depth = 2, eta = 1, nthread = 2, nround = 2, objective = "binary:logistic")
 ```

 #### xgb.DMatrix
@@ -165,7 +166,7 @@ bstDense <- xgboost(data = as.matrix(train$data), label = train$label, max.depth

 ```{r trainingDmatrix, message=F, warning=F}
 dtrain <- xgb.DMatrix(data = train$data, label = train$label)
-bstDMatrix <- xgboost(data = dtrain, max.depth = 2, eta = 1, nround = 2, objective = "binary:logistic")
+bstDMatrix <- xgboost(data = dtrain, max.depth = 2, eta = 1, nthread = 2, nround = 2, objective = "binary:logistic")
 ```

 #### Verbose option
@@ -176,17 +177,17 @@ One of the simplest way to see the training progress is to set the `verbose` opt

 ```{r trainingVerbose0, message=T, warning=F}
 # verbose = 0, no message
-bst <- xgboost(data = dtrain, max.depth = 2, eta = 1, nround = 2, objective = "binary:logistic", verbose = 0)
+bst <- xgboost(data = dtrain, max.depth = 2, eta = 1, nthread = 2, nround = 2, objective = "binary:logistic", verbose = 0)
 ```

 ```{r trainingVerbose1, message=T, warning=F}
 # verbose = 1, print evaluation metric
-bst <- xgboost(data = dtrain, max.depth = 2, eta = 1, nround = 2, objective = "binary:logistic", verbose = 1)
+bst <- xgboost(data = dtrain, max.depth = 2, eta = 1, nthread = 2, nround = 2, objective = "binary:logistic", verbose = 1)
 ```

 ```{r trainingVerbose2, message=T, warning=F}
 # verbose = 2, also print information about tree
-bst <- xgboost(data = dtrain, max.depth = 2, eta = 1, nround = 2, objective = "binary:logistic", verbose = 2)
+bst <- xgboost(data = dtrain, max.depth = 2, eta = 1, nthread = 2, nround = 2, objective = "binary:logistic", verbose = 2)
 ```

 Basic prediction using Xgboost
@@ -279,7 +280,7 @@ For the purpose of this example, we use `watchlist` parameter. It is a list of `
 ```{r watchlist, message=F, warning=F}
 watchlist <- list(train=dtrain, test=dtest)

-bst <- xgb.train(data=dtrain, max.depth=2, eta=1, nround=2, watchlist=watchlist, objective = "binary:logistic")
+bst <- xgb.train(data=dtrain, max.depth=2, eta=1, nthread = 2, nround=2, watchlist=watchlist, objective = "binary:logistic")
 ```

 **Xgboost** has computed at each round the same average error metric as seen above (we set `nround` to 2, which is why we have two lines). The `train-error` number relates to the training dataset (the one the algorithm learns from) and the `test-error` number to the test dataset.
@@ -291,7 +292,7 @@ If with your own dataset you have not such results, you should think about how y
 For a better understanding of the learning progression, you may want to have some specific metric or even use multiple evaluation metrics.

 ```{r watchlist2, message=F, warning=F}
-bst <- xgb.train(data=dtrain, max.depth=2, eta=1, nround=2, watchlist=watchlist, eval.metric = "error", eval.metric = "logloss", objective = "binary:logistic")
+bst <- xgb.train(data=dtrain, max.depth=2, eta=1, nthread = 2, nround=2, watchlist=watchlist, eval.metric = "error", eval.metric = "logloss", objective = "binary:logistic")
 ```

 > `eval.metric` allows us to monitor two new metrics for each round, `logloss` and `error`.
@@ -302,7 +303,7 @@ Linear boosting
 Until now, all the learning we have performed was based on boosting trees. **Xgboost** implements a second algorithm, based on linear boosting. The only difference from the previous command is the `booster = "gblinear"` parameter (and the removal of the `eta` parameter).

 ```{r linearBoosting, message=F, warning=F}
-bst <- xgb.train(data=dtrain, booster = "gblinear", max.depth=2, nround=2, watchlist=watchlist, eval.metric = "error", eval.metric = "logloss", objective = "binary:logistic")
+bst <- xgb.train(data=dtrain, booster = "gblinear", max.depth=2, nthread = 2, nround=2, watchlist=watchlist, eval.metric = "error", eval.metric = "logloss", objective = "binary:logistic")
 ```

 In this specific case, *linear boosting* gets slightly better performance metrics than the decision-tree-based algorithm.
@@ -320,7 +321,7 @@ Like saving models, `xgb.DMatrix` object (which groups both dataset and outcome)
 xgb.DMatrix.save(dtrain, "dtrain.buffer")
 # to load it in, simply call xgb.DMatrix
 dtrain2 <- xgb.DMatrix("dtrain.buffer")
-bst <- xgb.train(data=dtrain2, max.depth=2, eta=1, nround=2, watchlist=watchlist, objective = "binary:logistic")
+bst <- xgb.train(data=dtrain2, max.depth=2, eta=1, nthread = 2, nround=2, watchlist=watchlist, objective = "binary:logistic")
 ```

 ```{r DMatrixDel, include=FALSE}