Fix for CRAN Submission (#1826)
* fix cran check
* change required R version because of utils::globalVariables
* temporary commit, monotone not working
* fix test
* fix doc
* fix doc
* fix cran note and warning
* improve checks
* fix urls
@@ -14,7 +14,7 @@ xgb.importance(feature_names = NULL, model = NULL, data = NULL,
 \item{data}{the dataset used for the training step. Will be used with \code{label} parameter for co-occurence computation. More information in \code{Detail} part. This parameter is optional.}
-\item{label}{the label vetor used for the training step. Will be used with \code{data} parameter for co-occurence computation. More information in \code{Detail} part. This parameter is optional.}
+\item{label}{the label vector used for the training step. Will be used with \code{data} parameter for co-occurence computation. More information in \code{Detail} part. This parameter is optional.}
 \item{target}{a function which returns \code{TRUE} or \code{1} when an observation should be count as a co-occurence and \code{FALSE} or \code{0} otherwise. Default function is provided for computing co-occurences in a binary classification. The \code{target} function should have only one parameter. This parameter will be used to provide each important feature vector after having applied the split condition, therefore these vector will be only made of 0 and 1 only, whatever was the information before. More information in \code{Detail} part. This parameter is optional.}
 }
@@ -28,7 +28,7 @@ Create a \code{data.table} of the most important features of a model.
 This function is for both linear and tree models.
 \code{data.table} is returned by the function.
-The columns are :
+The columns are:
 \itemize{
 \item \code{Features} name of the features as provided in \code{feature_names} or already present in the model dump;
 \item \code{Gain} contribution of each feature to the model. For boosted tree model, each gain of each feature of each tree is taken into account, then average per feature to give a vision of the entire model. Highest percentage means important feature to predict the \code{label} used for the training (only available for tree models);
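The \code{Gain} description in the hunk above (per-split gains averaged per feature, then reported as a fraction of the total) can be sketched in plain R. This is an illustrative sketch only, not xgboost's implementation; the feature names and gain values below are made up:

```r
# Sketch of the Gain aggregation idea: average each feature's per-split
# gains across the trees, then normalise so the fractions sum to 1.
gains <- data.frame(
  Feature = c("odor", "odor", "cap_color"),  # hypothetical feature names
  Gain    = c(12.0, 8.0, 5.0)                # hypothetical per-split gains
)

# Average gain per feature across all splits/trees
per_feature <- tapply(gains$Gain, gains$Feature, mean)

# Fraction of total gain: higher fraction = more important feature
per_feature / sum(per_feature)
```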
@@ -47,7 +47,7 @@ The gain gives you indication about the information of how a feature is importan
 Co-occurence computation is here to help in understanding this relation between a predictor and a specific class. It will count how many observations are returned as \code{TRUE} by the \code{target} function (see parameters). When you execute the example below, there are 92 times only over the 3140 observations of the train dataset where a mushroom have no odor and can be eaten safely.
-If you need to remember one thing only: until you want to leave us early, don't eat a mushroom which has no odor :-)
+If you need to remember only one thing: unless you want to leave us early, don't eat a mushroom which has no odor :-)
 }
 \examples{
 data(agaricus.train, package='xgboost')
@@ -58,7 +58,8 @@ bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label, max_dep
 xgb.importance(colnames(agaricus.train$data), model = bst)
 # Same thing with co-occurence computation this time
-xgb.importance(colnames(agaricus.train$data), model = bst, data = agaricus.train$data, label = agaricus.train$label)
+xgb.importance(colnames(agaricus.train$data), model = bst,
+               data = agaricus.train$data, label = agaricus.train$label)
 }
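The co-occurence counting described in the documentation above can be sketched in plain R. The vectors and the target function below are illustrative assumptions (a default-style binary-classification target), not the package's exact code:

```r
# Sketch of co-occurence counting: given a feature vector binarised by a
# split condition and a binary label vector, count the observations where
# both are 1 (feature present AND positive label).
feature_after_split <- c(1, 0, 1, 1, 0)  # hypothetical 0/1 vector after the split condition
label <- c(1, 0, 0, 1, 1)                # hypothetical binary labels

# Default-style target: TRUE exactly when feature and label co-occur
target <- function(x) (x + label) == 2

sum(target(feature_after_split))  # number of co-occurences, here 2
```

In the mushroom example quoted above, the same counting over the 3140 training observations yields 92 co-occurences of "no odor" and "edible".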