small changes in RMarkdown

This commit is contained in:
El Potaeto 2015-05-05 23:45:43 +02:00
parent 937a75bcb1
commit 5eeec6a33f


@@ -71,7 +71,7 @@ train[1:6, ncol(train), with = F]
 nameLastCol <- names(train)[ncol(train)]
 ```
-The classes are provided as character strings in the `ncol(train)`th column, called `nameLastCol`. As you may know, **XGBoost** doesn't support anything but numbers, so we will convert the classes to integers. Moreover, according to the documentation, they should start at 0.
+The classes are provided as character strings in the **`r ncol(train)`**th column, called **`r nameLastCol`**. As you may know, **XGBoost** doesn't support anything but numbers, so we will convert the classes to integers. Moreover, according to the documentation, they should start at 0.
 For that purpose, we will:
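The conversion this hunk refers to — mapping character class labels to 0-based integers, as XGBoost requires — can be sketched in base R. The `train` table below is a toy stand-in, not the actual OTTO data:

```r
# Toy stand-in for the tutorial's `train` table (illustrative values only)
train <- data.frame(feat_1 = c(1, 0, 2, 5),
                    target = c("Class_2", "Class_1", "Class_9", "Class_2"),
                    stringsAsFactors = FALSE)

nameLastCol <- names(train)[ncol(train)]   # name of the label column

# factor() assigns each distinct label an integer starting at 1;
# subtract 1L because XGBoost expects class indices starting at 0
y <- as.integer(as.factor(train[[nameLastCol]])) - 1L
```

After this, `y` holds `1 0 2 1`: the labels in alphabetical order get codes 0, 1, 2, so the smallest code is always 0 as the documentation requires.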
@@ -138,7 +138,7 @@ Model understanding
 Feature importance
 ------------------
-So far, we have built a model made of `nround` trees.
+So far, we have built a model made of **`r nround`** trees.
 To build a tree, the dataset is divided recursively several times. At the end of the process, you get groups of observations (here, these observations are properties regarding **OTTO** products).
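The recursive division mentioned in the context line above boils down to repeatedly picking, for some feature, the threshold that splits the observations into the purest groups. A minimal single-split sketch in base R, using toy data and squared error as the impurity measure (XGBoost itself uses gradient statistics, not plain SSE):

```r
# Toy data: one feature and a numeric label (illustrative values)
x <- c(1, 2, 3, 10, 11, 12)
y <- c(0, 0, 0, 1, 1, 1)

sse <- function(v) sum((v - mean(v))^2)  # impurity of one group

# Candidate thresholds: midpoints between consecutive sorted feature values
xs <- sort(x)
thresholds <- (head(xs, -1) + tail(xs, -1)) / 2

# Total impurity of the two groups produced by each candidate threshold
score <- sapply(thresholds, function(t) sse(y[x < t]) + sse(y[x >= t]))

best <- thresholds[which.min(score)]  # the split a tree would pick here
```

Here `best` is 6.5, which separates the labels perfectly; a tree then applies the same search recursively inside each resulting group.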
@@ -212,3 +212,11 @@ We are just displaying the first two trees here.
 On simple models the first two trees may be enough. Here, it might not be the case. We can see from the size of the trees that the interaction between features is complicated.
 Besides, XGBoost generates `k` trees at each round for a `k`-classification problem. Therefore the two trees illustrated here are trying to classify data into different classes.
+
+Going deeper
+============
+
+There are two documents you may want to check to go deeper:
+
+* [discoverYourData.Rmd](https://github.com/dmlc/xgboost/blob/master/R-package/vignettes/discoverYourData.Rmd)
+* [xgboostPresentation.Rmd](https://github.com/dmlc/xgboost/blob/master/R-package/vignettes/xgboostPresentation.Rmd)
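A follow-up on the note that XGBoost grows `k` trees per round for a `k`-class problem: the fitted model therefore contains `k * nround` trees in total. A quick arithmetic sketch (the `nround` value below is illustrative, not the one used in the tutorial):

```r
num_class <- 9   # the OTTO challenge has 9 product categories
nround    <- 5   # illustrative number of boosting rounds
total_trees <- num_class * nround  # one tree per class per round
```

This is why dumping only the first two trees, as the hunk above does, shows trees that target different classes rather than two refinements of the same class.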