[R] maintenance Nov 2017; SHAP plots (#2888)
* [R] fix predict contributions for data with no colnames
* [R] add a render parameter for xgb.plot.multi.trees; fixes #2628
* [R] update Rd's
* [R] remove unnecessary dep-package from R cmake install
* silence type warnings; readability
* [R] silence complaint about incomplete line at the end
* [R] initial version of xgb.plot.shap()
* [R] more work on xgb.plot.shap
* [R] enforce black font in xgb.plot.tree; fixes #2640
* [R] if feature names are available, check in predict that they are the same; fixes #2857
* [R] cran check and lint fixes
* remove tabs
* [R] add references; a test for plot.shap
Committed by: Tong He
Parent: 1b77903eeb
Commit: e8a6597957
@@ -7,7 +7,7 @@
|
||||
\usage{
|
||||
\method{predict}{xgb.Booster}(object, newdata, missing = NA,
|
||||
outputmargin = FALSE, ntreelimit = NULL, predleaf = FALSE,
|
||||
predcontrib = FALSE, reshape = FALSE, ...)
|
||||
predcontrib = FALSE, approxcontrib = FALSE, reshape = FALSE, ...)
|
||||
|
||||
\method{predict}{xgb.Booster.handle}(object, ...)
|
||||
}
|
||||
@@ -19,8 +19,8 @@
|
||||
\item{missing}{Missing is only used when input is dense matrix. Pick a float value that represents
|
||||
missing values in data (e.g., sometimes 0 or some other extreme value is used).}
|
||||
|
||||
\item{outputmargin}{whether the prediction should be returned in the for of original untransformed
|
||||
sum of predictions from boosting iterations' results. E.g., setting \code{outputmargin=TRUE} for
|
||||
\item{outputmargin}{whether the prediction should be returned in the for of original untransformed
|
||||
sum of predictions from boosting iterations' results. E.g., setting \code{outputmargin=TRUE} for
|
||||
logistic regression would result in predictions for log-odds instead of probabilities.}
|
||||
|
||||
\item{ntreelimit}{limit the number of model's trees or boosting iterations used in prediction (see Details).
|
||||
@@ -30,24 +30,26 @@ It will use all the trees by default (\code{NULL} value).}
|
||||
|
||||
\item{predcontrib}{whether to return feature contributions to individual predictions instead (see Details).}
|
||||
|
||||
\item{reshape}{whether to reshape the vector of predictions to a matrix form when there are several
|
||||
\item{approxcontrib}{whether to use a fast approximation for feature contributions (see Details).}
|
||||
|
||||
\item{reshape}{whether to reshape the vector of predictions to a matrix form when there are several
|
||||
prediction outputs per case. This option has no effect when \code{predleaf = TRUE}.}
|
||||
|
||||
\item{...}{Parameters passed to \code{predict.xgb.Booster}}
|
||||
}
|
||||
\value{
|
||||
For regression or binary classification, it returns a vector of length \code{nrows(newdata)}.
|
||||
For multiclass classification, either a \code{num_class * nrows(newdata)} vector or
|
||||
a \code{(nrows(newdata), num_class)} dimension matrix is returned, depending on
|
||||
For multiclass classification, either a \code{num_class * nrows(newdata)} vector or
|
||||
a \code{(nrows(newdata), num_class)} dimension matrix is returned, depending on
|
||||
the \code{reshape} value.
|
||||
|
||||
When \code{predleaf = TRUE}, the output is a matrix object with the
|
||||
When \code{predleaf = TRUE}, the output is a matrix object with the
|
||||
number of columns corresponding to the number of trees.
|
||||
|
||||
When \code{predcontrib = TRUE} and it is not a multiclass setting, the output is a matrix object with
|
||||
\code{num_features + 1} columns. The last "+ 1" column in a matrix corresponds to bias.
|
||||
For a multiclass case, a list of \code{num_class} elements is returned, where each element is
|
||||
such a matrix. The contribution values are on the scale of untransformed margin
|
||||
such a matrix. The contribution values are on the scale of untransformed margin
|
||||
(e.g., for binary classification would mean that the contributions are log-odds deviations from bias).
|
||||
}
|
||||
\description{
|
||||
@@ -57,22 +59,23 @@ Predicted values based on either xgboost model or model handle object.
|
||||
Note that \code{ntreelimit} is not necessarily equal to the number of boosting iterations
|
||||
and it is not necessarily equal to the number of trees in a model.
|
||||
E.g., in a random forest-like model, \code{ntreelimit} would limit the number of trees.
|
||||
But for multiclass classification, while there are multiple trees per iteration,
|
||||
But for multiclass classification, while there are multiple trees per iteration,
|
||||
\code{ntreelimit} limits the number of boosting iterations.
|
||||
|
||||
Also note that \code{ntreelimit} would currently do nothing for predictions from gblinear,
|
||||
Also note that \code{ntreelimit} would currently do nothing for predictions from gblinear,
|
||||
since gblinear doesn't keep its boosting history.
|
||||
|
||||
One possible practical applications of the \code{predleaf} option is to use the model
|
||||
as a generator of new features which capture non-linearity and interactions,
|
||||
One possible practical applications of the \code{predleaf} option is to use the model
|
||||
as a generator of new features which capture non-linearity and interactions,
|
||||
e.g., as implemented in \code{\link{xgb.create.features}}.
|
||||
|
||||
Setting \code{predcontrib = TRUE} allows to calculate contributions of each feature to
|
||||
individual predictions. For "gblinear" booster, feature contributions are simply linear terms
|
||||
(feature_beta * feature_value). For "gbtree" booster, feature contribution is calculated
|
||||
as a sum of average contribution of that feature's split nodes across all trees to an
|
||||
individual prediction, following the idea explained in
|
||||
\url{http://blog.datadive.net/interpreting-random-forests/}.
|
||||
(feature_beta * feature_value). For "gbtree" booster, feature contributions are SHAP
|
||||
values (Lundberg 2017) that sum to the difference between the expected output
|
||||
of the model and the current prediction (where the hessian weights are used to compute the expectations).
|
||||
Setting \code{approxcontrib = TRUE} approximates these values following the idea explained
|
||||
in \url{http://blog.datadive.net/interpreting-random-forests/}.
|
||||
}
|
||||
\examples{
|
||||
## binary classification:
|
||||
@@ -82,7 +85,7 @@ data(agaricus.test, package='xgboost')
|
||||
train <- agaricus.train
|
||||
test <- agaricus.test
|
||||
|
||||
bst <- xgboost(data = train$data, label = train$label, max_depth = 2,
|
||||
bst <- xgboost(data = train$data, label = train$label, max_depth = 2,
|
||||
eta = 0.5, nthread = 2, nrounds = 5, objective = "binary:logistic")
|
||||
# use all trees by default
|
||||
pred <- predict(bst, test$data)
|
||||
@@ -98,7 +101,7 @@ str(pred_leaf)
|
||||
# the result is an nsamples X (nfeatures + 1) matrix
|
||||
pred_contr <- predict(bst, test$data, predcontrib = TRUE)
|
||||
str(pred_contr)
|
||||
# verify that contributions' sums are equal to log-odds of predictions (up to foat precision):
|
||||
# verify that contributions' sums are equal to log-odds of predictions (up to float precision):
|
||||
summary(rowSums(pred_contr) - qlogis(pred))
|
||||
# for the 1st record, let's inspect its features that had non-zero contribution to prediction:
|
||||
contr1 <- pred_contr[1,]
|
||||
@@ -137,7 +140,7 @@ bst <- xgboost(data = as.matrix(iris[, -5]), label = lb,
|
||||
pred <- predict(bst, as.matrix(iris[, -5]))
|
||||
str(pred)
|
||||
all.equal(pred, pred_labels)
|
||||
# prediction from using only 5 iterations should result
|
||||
# prediction from using only 5 iterations should result
|
||||
# in the same error as seen in iteration 5:
|
||||
pred5 <- predict(bst, as.matrix(iris[, -5]), ntreelimit=5)
|
||||
sum(pred5 != lb)/length(lb)
|
||||
@@ -158,6 +161,11 @@ err <- sapply(1:25, function(n) {
|
||||
})
|
||||
plot(err, type='l', ylim=c(0,0.1), xlab='#trees')
|
||||
|
||||
}
|
||||
\references{
|
||||
Scott M. Lundberg, Su-In Lee, "A Unified Approach to Interpreting Model Predictions", NIPS Proceedings 2017, \url{https://arxiv.org/abs/1705.07874}
|
||||
|
||||
Scott M. Lundberg, Su-In Lee, "Consistent feature attribution for tree ensembles", \url{https://arxiv.org/abs/1706.06060}
|
||||
}
|
||||
\seealso{
|
||||
\code{\link{xgb.train}}.
|
||||
|
||||
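The `predcontrib` and `approxcontrib` options documented above can be compared side by side. The following is a minimal sketch, assuming a current xgboost R package in which `xgb.DMatrix`, `xgb.train`, and `predict` accept the arguments shown; the variable names (`shap_exact`, `shap_approx`, `max_dev_exact`) are illustrative and not part of the commit.

```r
library(xgboost)
data(agaricus.train, package = "xgboost")
data(agaricus.test, package = "xgboost")

# Train a small binary classifier (parameters mirror the Rd example).
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
bst <- xgb.train(
  params = list(objective = "binary:logistic", max_depth = 2,
                eta = 0.5, nthread = 2),
  data = dtrain, nrounds = 5
)

# Untransformed margin (log-odds) for each test row.
margin <- predict(bst, agaricus.test$data, outputmargin = TRUE)

# SHAP contributions, and the faster approximate variant.
shap_exact <- predict(bst, agaricus.test$data, predcontrib = TRUE)
shap_approx <- predict(bst, agaricus.test$data, predcontrib = TRUE,
                       approxcontrib = TRUE)

# Each row of contributions (features + bias column) should add up
# to the margin, up to float precision.
max_dev_exact <- max(abs(rowSums(shap_exact) - margin))

# How far the approximation is from the SHAP values, per feature.
mean_abs_diff <- colMeans(abs(shap_exact - shap_approx))
```

Both matrices have one column per feature plus a final bias column, so they can be compared element-wise.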
@@ -9,32 +9,32 @@ xgb.Booster.complete(object, saveraw = TRUE)
|
||||
\arguments{
|
||||
\item{object}{object of class \code{xgb.Booster}}
|
||||
|
||||
\item{saveraw}{a flag indicating whether to append \code{raw} Booster memory dump data
|
||||
\item{saveraw}{a flag indicating whether to append \code{raw} Booster memory dump data
|
||||
when it doesn't already exist.}
|
||||
}
|
||||
\value{
|
||||
An object of \code{xgb.Booster} class.
|
||||
}
|
||||
\description{
|
||||
It attempts to complete an \code{xgb.Booster} object by restoring either its missing
|
||||
It attempts to complete an \code{xgb.Booster} object by restoring either its missing
|
||||
raw model memory dump (when it has no \code{raw} data but its \code{xgb.Booster.handle} is valid)
|
||||
or its missing internal handle (when its \code{xgb.Booster.handle} is not valid
|
||||
or its missing internal handle (when its \code{xgb.Booster.handle} is not valid
|
||||
but it has a raw Booster memory dump).
|
||||
}
|
||||
\details{
|
||||
While this method is primarily for internal use, it might be useful in some practical situations.
|
||||
|
||||
E.g., when an \code{xgb.Booster} model is saved as an R object and then is loaded as an R object,
|
||||
its handle (pointer) to an internal xgboost model would be invalid. The majority of xgboost methods
|
||||
should still work for such a model object since those methods would be using
|
||||
\code{xgb.Booster.complete} internally. However, one might find it to be more efficient to call the
|
||||
its handle (pointer) to an internal xgboost model would be invalid. The majority of xgboost methods
|
||||
should still work for such a model object since those methods would be using
|
||||
\code{xgb.Booster.complete} internally. However, one might find it to be more efficient to call the
|
||||
\code{xgb.Booster.complete} function explicitly once after loading a model as an R-object.
|
||||
That would prevent further repeated implicit reconstruction of an internal booster model.
|
||||
}
|
||||
\examples{
|
||||
|
||||
data(agaricus.train, package='xgboost')
|
||||
bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label, max_depth = 2,
|
||||
bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label, max_depth = 2,
|
||||
eta = 1, nthread = 2, nrounds = 2, objective = "binary:logistic")
|
||||
saveRDS(bst, "xgb.model.rds")
|
||||
|
||||
|
||||
@@ -20,18 +20,18 @@ xgb.attributes(object) <- value
|
||||
|
||||
\item{name}{a non-empty character string specifying which attribute is to be accessed.}
|
||||
|
||||
\item{value}{a value of an attribute for \code{xgb.attr<-}; for \code{xgb.attributes<-}
|
||||
it's a list (or an object coercible to a list) with the names of attributes to set
|
||||
and the elements corresponding to attribute values.
|
||||
\item{value}{a value of an attribute for \code{xgb.attr<-}; for \code{xgb.attributes<-}
|
||||
it's a list (or an object coercible to a list) with the names of attributes to set
|
||||
and the elements corresponding to attribute values.
|
||||
Non-character values are converted to character.
|
||||
When attribute value is not a scalar, only the first index is used.
|
||||
Use \code{NULL} to remove an attribute.}
|
||||
}
|
||||
\value{
|
||||
\code{xgb.attr} returns either a string value of an attribute
|
||||
\code{xgb.attr} returns either a string value of an attribute
|
||||
or \code{NULL} if an attribute wasn't stored in a model.
|
||||
|
||||
\code{xgb.attributes} returns a list of all attribute stored in a model
|
||||
\code{xgb.attributes} returns a list of all attribute stored in a model
|
||||
or \code{NULL} if a model has no stored attributes.
|
||||
}
|
||||
\description{
|
||||
@@ -41,23 +41,23 @@ These methods allow to manipulate the key-value attribute strings of an xgboost
|
||||
The primary purpose of xgboost model attributes is to store some meta-data about the model.
|
||||
Note that they are a separate concept from the object attributes in R.
|
||||
Specifically, they refer to key-value strings that can be attached to an xgboost model,
|
||||
stored together with the model's binary representation, and accessed later
|
||||
stored together with the model's binary representation, and accessed later
|
||||
(from R or any other interface).
|
||||
In contrast, any R-attribute assigned to an R-object of \code{xgb.Booster} class
|
||||
would not be saved by \code{xgb.save} because an xgboost model is an external memory object
|
||||
and its serialization is handled externally.
|
||||
Also, setting an attribute that has the same name as one of xgboost's parameters wouldn't
|
||||
change the value of that parameter for a model.
|
||||
Also, setting an attribute that has the same name as one of xgboost's parameters wouldn't
|
||||
change the value of that parameter for a model.
|
||||
Use \code{\link{xgb.parameters<-}} to set or change model parameters.
|
||||
|
||||
The attribute setters would usually work more efficiently for \code{xgb.Booster.handle}
|
||||
than for \code{xgb.Booster}, since only just a handle (pointer) would need to be copied.
|
||||
That would only matter if attributes need to be set many times.
|
||||
Note, however, that when feeding a handle of an \code{xgb.Booster} object to the attribute setters,
|
||||
the raw model cache of an \code{xgb.Booster} object would not be automatically updated,
|
||||
the raw model cache of an \code{xgb.Booster} object would not be automatically updated,
|
||||
and it would be user's responsibility to call \code{xgb.save.raw} to update it.
|
||||
|
||||
The \code{xgb.attributes<-} setter either updates the existing or adds one or several attributes,
|
||||
The \code{xgb.attributes<-} setter either updates the existing or adds one or several attributes,
|
||||
but it doesn't delete the other existing attributes.
|
||||
}
|
||||
\examples{
|
||||
|
||||
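The attribute behavior described in this hunk (non-character values converted to character, `NULL` removing an attribute, and `xgb.attributes<-` keeping existing attributes) can be sketched as follows. This assumes a current xgboost R package; the attribute names `my_note`, `a`, and `b` are made up for illustration.

```r
library(xgboost)
data(agaricus.train, package = "xgboost")

dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
bst <- xgb.train(
  params = list(objective = "binary:logistic", max_depth = 2,
                eta = 1, nthread = 2),
  data = dtrain, nrounds = 2
)

# Set one attribute; the numeric value is stored as a string.
xgb.attr(bst, "my_note") <- 123
note <- xgb.attr(bst, "my_note")

# Set several attributes at once; previously stored ones are kept.
xgb.attributes(bst) <- list(a = "first", b = "second")
all_attrs <- xgb.attributes(bst)

# Assigning NULL removes an attribute.
xgb.attr(bst, "my_note") <- NULL
removed <- is.null(xgb.attr(bst, "my_note"))
```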
@@ -5,7 +5,7 @@
|
||||
\title{Project all trees on one tree and plot it}
|
||||
\usage{
|
||||
xgb.plot.multi.trees(model, feature_names = NULL, features_keep = 5,
|
||||
plot_width = NULL, plot_height = NULL, ...)
|
||||
plot_width = NULL, plot_height = NULL, render = TRUE, ...)
|
||||
}
|
||||
\arguments{
|
||||
\item{model}{produced by the \code{xgb.train} function.}
|
||||
@@ -18,41 +18,58 @@ xgb.plot.multi.trees(model, feature_names = NULL, features_keep = 5,
|
||||
|
||||
\item{plot_height}{height in pixels of the graph to produce}
|
||||
|
||||
\item{render}{a logical flag for whether the graph should be rendered (see Value).}
|
||||
|
||||
\item{...}{currently not used}
|
||||
}
|
||||
\value{
|
||||
Two graphs showing the distribution of the model deepness.
|
||||
When \code{render = TRUE}:
|
||||
returns a rendered graph object which is an \code{htmlwidget} of class \code{grViz}.
|
||||
Similar to ggplot objects, it needs to be printed to see it when not running from command line.
|
||||
|
||||
When \code{render = FALSE}:
|
||||
silently returns a graph object which is of DiagrammeR's class \code{dgr_graph}.
|
||||
This could be useful if one wants to modify some of the graph attributes
|
||||
before rendering the graph with \code{\link[DiagrammeR]{render_graph}}.
|
||||
}
|
||||
\description{
|
||||
Visualization of the ensemble of trees as a single collective unit.
|
||||
}
|
||||
\details{
|
||||
This function tries to capture the complexity of a gradient boosted tree model
|
||||
This function tries to capture the complexity of a gradient boosted tree model
|
||||
in a cohesive way by compressing an ensemble of trees into a single tree-graph representation.
|
||||
The goal is to improve the interpretability of a model generally seen as black box.
|
||||
|
||||
Note: this function is applicable to tree booster-based models only.
|
||||
|
||||
It takes advantage of the fact that the shape of a binary tree is only defined by
|
||||
its depth (therefore, in a boosting model, all trees have similar shape).
|
||||
It takes advantage of the fact that the shape of a binary tree is only defined by
|
||||
its depth (therefore, in a boosting model, all trees have similar shape).
|
||||
|
||||
Moreover, the trees tend to reuse the same features.
|
||||
|
||||
The function projects each tree onto one, and keeps for each position the
|
||||
The function projects each tree onto one, and keeps for each position the
|
||||
\code{features_keep} first features (based on the Gain per feature measure).
|
||||
|
||||
This function is inspired by this blog post:
|
||||
\url{https://wellecks.wordpress.com/2015/02/21/peering-into-the-black-box-visualizing-lambdamart/}
|
||||
}
|
||||
\examples{
|
||||
|
||||
data(agaricus.train, package='xgboost')
|
||||
|
||||
bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label, max_depth = 15,
|
||||
eta = 1, nthread = 2, nrounds = 30, objective = "binary:logistic",
|
||||
min_child_weight = 50)
|
||||
eta = 1, nthread = 2, nrounds = 30, objective = "binary:logistic",
|
||||
min_child_weight = 50, verbose = 0)
|
||||
|
||||
p <- xgb.plot.multi.trees(model = bst, feature_names = colnames(agaricus.train$data),
|
||||
features_keep = 3)
|
||||
p <- xgb.plot.multi.trees(model = bst, features_keep = 3)
|
||||
print(p)
|
||||
|
||||
\dontrun{
|
||||
# Below is an example of how to save this plot to a file.
|
||||
# Note that for `export_graph` to work, the DiagrammeRsvg and rsvg packages must also be installed.
|
||||
library(DiagrammeR)
|
||||
gr <- xgb.plot.multi.trees(model=bst, features_keep = 3, render=FALSE)
|
||||
export_graph(gr, 'tree.pdf', width=1500, height=600)
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
135
R-package/man/xgb.plot.shap.Rd
Normal file
@@ -0,0 +1,135 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/xgb.plot.shap.R
\name{xgb.plot.shap}
\alias{xgb.plot.shap}
\title{SHAP contribution dependency plots}
\usage{
xgb.plot.shap(data, shap_contrib = NULL, features = NULL, top_n = 1,
  model = NULL, trees = NULL, target_class = NULL,
  approxcontrib = FALSE, subsample = NULL, n_col = 1, col = rgb(0, 0, 1,
  0.2), pch = ".", discrete_n_uniq = 5, discrete_jitter = 0.01,
  ylab = "SHAP", plot_NA = TRUE, col_NA = rgb(0.7, 0, 1, 0.6),
  pch_NA = ".", pos_NA = 1.07, plot_loess = TRUE, col_loess = 2,
  span_loess = 0.5, which = c("1d", "2d"), plot = TRUE, ...)
}
\arguments{
\item{data}{data as a \code{matrix} or \code{dgCMatrix}.}

\item{shap_contrib}{a matrix of SHAP contributions that was computed earlier for the above
\code{data}. When it is NULL, it is computed internally using \code{model} and \code{data}.}

\item{features}{a vector of either column indices or of feature names to plot. When it is NULL,
feature importance is calculated, and the \code{top_n} highest-ranked features are taken.}

\item{top_n}{when \code{features} is NULL, the \code{top_n} (in the range [1, 100]) most important
features in a model are taken.}

\item{model}{an \code{xgb.Booster} model. It has to be provided when either \code{shap_contrib}
or \code{features} is missing.}

\item{trees}{passed to \code{\link{xgb.importance}} when \code{features = NULL}.}

\item{target_class}{is only relevant for multiclass models. When it is set to a 0-based class index,
only SHAP contributions for that specific class are used.
If it is not set, SHAP importances are averaged over all classes.}

\item{approxcontrib}{passed to \code{\link{predict.xgb.Booster}} when \code{shap_contrib = NULL}.}

\item{subsample}{a random fraction of data points to use for plotting. When it is NULL,
it is set so that up to 100K data points are used.}

\item{n_col}{a number of columns in a grid of plots.}

\item{col}{color of the scatterplot markers.}

\item{pch}{scatterplot marker.}

\item{discrete_n_uniq}{a maximal number of unique values in a feature to consider it as discrete.}

\item{discrete_jitter}{an \code{amount} parameter of jitter added to discrete features' positions.}

\item{ylab}{a y-axis label in 1D plots.}

\item{plot_NA}{whether the contributions of cases with missing values should also be plotted.}

\item{col_NA}{a color of marker for missing value contributions.}

\item{pch_NA}{a marker type for NA values.}

\item{pos_NA}{a relative position of the x-location where NA values are shown:
\code{min(x) + (max(x) - min(x)) * pos_NA}.}

\item{plot_loess}{whether to plot loess-smoothed curves. The smoothing is only done for features with
more than 5 distinct values.}

\item{col_loess}{a color to use for the loess curves.}

\item{span_loess}{the \code{span} parameter in \code{\link[stats]{loess}}'s call.}

\item{which}{whether to do univariate or bivariate plotting. NOTE: only 1D is implemented so far.}

\item{plot}{whether a plot should be drawn. If FALSE, only a list of matrices is returned.}

\item{...}{other parameters passed to \code{plot}.}
}
\value{
In addition to producing plots (when \code{plot=TRUE}), it silently returns a list of two matrices:
\itemize{
 \item \code{data} the values of selected features;
 \item \code{shap_contrib} the contributions of selected features.
}
}
\description{
Visualizes how SHAP feature contributions to predictions depend on feature values.
}
\details{
These scatterplots represent how SHAP feature contributions depend on feature values.
Like partial dependency plots, they give an idea of how feature values
affect predictions. However, in partial dependency plots, we usually see marginal dependencies
of model prediction on feature value, while SHAP contribution dependency plots display the estimated
contributions of a feature to model prediction for each individual case.

When \code{plot_loess = TRUE} is set, feature values are rounded to 3 significant digits and
weighted LOESS is computed and plotted, where weights are the numbers of data points
at each rounded value.

Note: SHAP contributions are shown on the scale of model margin. E.g., for a logistic binomial objective,
the margin is prediction before a sigmoidal transform into probability-like values.
Also, since SHAP stands for "SHapley Additive exPlanation" (model prediction = sum of SHAP
contributions for all features + bias), depending on the objective used, transforming SHAP
contributions for a feature from the marginal to the prediction space is not necessarily
a meaningful thing to do.
}
\examples{

data(agaricus.train, package='xgboost')
data(agaricus.test, package='xgboost')

bst <- xgboost(agaricus.train$data, agaricus.train$label, nrounds = 50,
               eta = 0.1, max_depth = 3, subsample = .5,
               method = "hist", objective = "binary:logistic", nthread = 2, verbose = 0)

xgb.plot.shap(agaricus.test$data, model = bst, features = "odor=none")
contr <- predict(bst, agaricus.test$data, predcontrib = TRUE)
xgb.plot.shap(agaricus.test$data, contr, model = bst, top_n = 12, n_col = 3)

# multiclass example - plots for each class separately:
nclass <- 3
nrounds <- 20
x <- as.matrix(iris[, -5])
set.seed(123)
is.na(x[sample(nrow(x) * 4, 30)]) <- TRUE # introduce some missing values
mbst <- xgboost(data = x, label = as.numeric(iris$Species) - 1, nrounds = nrounds,
                max_depth = 2, eta = 0.3, subsample = .5, nthread = 2,
                objective = "multi:softprob", num_class = nclass, verbose = 0)
trees0 <- seq(from=0, by=nclass, length.out=nrounds)
col <- rgb(0, 0, 1, 0.5)
xgb.plot.shap(x, model = mbst, trees = trees0, target_class = 0, top_n = 4, n_col = 2, col = col, pch = 16, pch_NA = 17)
xgb.plot.shap(x, model = mbst, trees = trees0 + 1, target_class = 1, top_n = 4, n_col = 2, col = col, pch = 16, pch_NA = 17)
xgb.plot.shap(x, model = mbst, trees = trees0 + 2, target_class = 2, top_n = 4, n_col = 2, col = col, pch = 16, pch_NA = 17)

}
\references{
Scott M. Lundberg, Su-In Lee, "A Unified Approach to Interpreting Model Predictions", NIPS Proceedings 2017, \url{https://arxiv.org/abs/1705.07874}

Scott M. Lundberg, Su-In Lee, "Consistent feature attribution for tree ensembles", \url{https://arxiv.org/abs/1706.06060}
}
|
||||
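The weighted LOESS step described in the details of xgb.plot.shap.Rd (round feature values to 3 significant digits, then smooth with weights equal to the number of points at each rounded value) might look roughly like the base-R sketch below. This is an illustration of the described idea under stated assumptions, not the package's actual implementation; the data are simulated and the names `x`, `shap`, `grid`, and `smooth` are hypothetical.

```r
set.seed(1)

# Simulated stand-ins for one feature and its SHAP contributions.
x <- runif(2000, min = 0, max = 10)
shap <- sin(x) + rnorm(2000, sd = 0.2)

# Round feature values to 3 significant digits.
x_round <- signif(x, 3)

# For each rounded value: mean contribution and number of data points.
mean_shap <- tapply(shap, x_round, mean)
n_points <- tapply(shap, x_round, length)
grid <- data.frame(
  x = as.numeric(names(mean_shap)),
  shap = as.numeric(mean_shap),
  n = as.numeric(n_points)
)

# Weighted LOESS: rounded values backed by more points get more weight.
fit <- loess(shap ~ x, data = grid, weights = n, span = 0.5)
smooth <- predict(fit, grid)

# Scatter of raw points with the smoothed curve on top.
plot(x, shap, pch = ".", col = rgb(0, 0, 1, 0.2), ylab = "SHAP")
lines(grid$x, smooth, col = 2)
```

Aggregating before smoothing keeps the LOESS fit cheap on large data, since the number of distinct rounded values is bounded regardless of how many rows are plotted.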
@@ -258,6 +258,10 @@ bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label,
|
||||
objective = "binary:logistic")
|
||||
pred <- predict(bst, agaricus.test$data)
|
||||
|
||||
}
|
||||
\references{
|
||||
Tianqi Chen and Carlos Guestrin, "XGBoost: A Scalable Tree Boosting System",
|
||||
22nd SIGKDD Conference on Knowledge Discovery and Data Mining, 2016, \url{https://arxiv.org/abs/1603.02754}
|
||||
}
|
||||
\seealso{
|
||||
\code{\link{callbacks}},
|
||||
|
||||