% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/xgb.ggplot.R, R/xgb.plot.deepness.R
\name{xgb.ggplot.deepness}
\alias{xgb.ggplot.deepness}
\alias{xgb.plot.deepness}
\title{Plot model tree depth}
\usage{
xgb.ggplot.deepness(
model = NULL,
which = c("2x1", "max.depth", "med.depth", "med.weight")
)
xgb.plot.deepness(
model = NULL,
which = c("2x1", "max.depth", "med.depth", "med.weight"),
plot = TRUE,
...
)
}
\arguments{
\item{model}{Either an \code{xgb.Booster} model, or the "data.table" returned
by \code{\link[=xgb.model.dt.tree]{xgb.model.dt.tree()}}.}
\item{which}{Which distribution to plot (see details).}
\item{plot}{Should the plot be shown? Default is \code{TRUE}.}
\item{...}{Other parameters passed to \code{\link[graphics:barplot]{graphics::barplot()}} or \code{\link[graphics:plot.default]{graphics::plot()}}.}
}
\value{
The return value of the two functions is as follows:
\itemize{
\item \code{xgb.plot.deepness()}: A "data.table" (invisibly).
Each row corresponds to a terminal leaf in the model and contains the leaf's
depth, cover, and weight (used in calculating predictions).
If \code{plot = TRUE}, a plot is also shown.
\item \code{xgb.ggplot.deepness()}: When \code{which = "2x1"}, a list of two "ggplot" objects,
and a single "ggplot" object otherwise.
}
}
\description{
Visualizes distributions related to the depth of tree leaves.
\itemize{
\item \code{xgb.plot.deepness()} uses base R graphics.
\item \code{xgb.ggplot.deepness()} uses "ggplot2".
}
}
\details{
When \code{which = "2x1"}, two distributions with respect to the leaf depth
are plotted one above the other:
\enumerate{
\item The distribution of the number of leaves in a tree model at a certain depth.
\item The distribution of the average weighted number of observations ("cover")
ending up in leaves at a certain depth.
}
These distributions can help determine sensible ranges for the \code{max_depth}
and \code{min_child_weight} parameters.

When \code{which = "max.depth"} or \code{which = "med.depth"}, the maximum or
median leaf depth of each tree is plotted against the tree number.

Finally, \code{which = "med.weight"} shows how
a tree's median absolute leaf weight changes through the iterations.

These functions were inspired by the blog post
\url{https://github.com/aysent/random-forest-leaf-visualization}.
}
\examples{
data(agaricus.train, package = "xgboost")
## Limit the number of threads to 2 for examples
nthread <- 2
data.table::setDTthreads(nthread)
## Increase max_depth to get a more pronounced result
bst <- xgb.train(
data = xgb.DMatrix(agaricus.train$data, label = agaricus.train$label),
max_depth = 6,
nthread = nthread,
nrounds = 50,
objective = "binary:logistic",
subsample = 0.5,
min_child_weight = 2
)
xgb.plot.deepness(bst)
xgb.ggplot.deepness(bst)
xgb.plot.deepness(
bst, which = "max.depth", pch = 16, col = rgb(0, 0, 1, 0.3), cex = 2
)
xgb.plot.deepness(
bst, which = "med.weight", pch = 16, col = rgb(0, 0, 1, 0.3), cex = 2
)
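
## A minimal sketch of two further usages described in the arguments and
## value sections above. The names `leaf_info` and `dt_tree` are illustrative
## only and are not part of the package.

## With plot = FALSE no plot is drawn. The per-leaf "data.table" is returned
## invisibly, so assign it in order to inspect it.
leaf_info <- xgb.plot.deepness(bst, plot = FALSE)
head(leaf_info)

## `model` may also be the "data.table" returned by xgb.model.dt.tree(),
## which can then be reused across several calls.
dt_tree <- xgb.model.dt.tree(model = bst)
xgb.plot.deepness(dt_tree, which = "med.depth")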
}
\seealso{
\code{\link[=xgb.train]{xgb.train()}} and \code{\link[=xgb.model.dt.tree]{xgb.model.dt.tree()}}.
}