% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/xgb.ggplot.R, R/xgb.plot.deepness.R
\name{xgb.ggplot.deepness}
\alias{xgb.ggplot.deepness}
\alias{xgb.plot.deepness}
\title{Plot model tree depth}
\usage{
xgb.ggplot.deepness(
  model = NULL,
  which = c("2x1", "max.depth", "med.depth", "med.weight")
)

xgb.plot.deepness(
  model = NULL,
  which = c("2x1", "max.depth", "med.depth", "med.weight"),
  plot = TRUE,
  ...
)
}
\arguments{
\item{model}{Either an \code{xgb.Booster} model, or the "data.table" returned
by \code{\link[=xgb.model.dt.tree]{xgb.model.dt.tree()}}.}

\item{which}{Which distribution to plot (see details).}

\item{plot}{Should the plot be shown? Default is \code{TRUE}.}

\item{...}{Other parameters passed to \code{\link[graphics:barplot]{graphics::barplot()}} or
\code{\link[graphics:plot.default]{graphics::plot()}}.}
}
\value{
The return value of the two functions is as follows:
\itemize{
\item \code{xgb.plot.deepness()}: A "data.table" (invisibly).
Each row corresponds to a terminal leaf in the model and records that leaf's
depth, cover, and weight (the weight is used in calculating predictions).
If \code{plot = TRUE}, a plot is also shown.
\item \code{xgb.ggplot.deepness()}: When \code{which = "2x1"}, a list of two "ggplot" objects,
and a single "ggplot" object otherwise.
}
}
\description{
Visualizes distributions related to the depth of tree leaves.
\itemize{
\item \code{xgb.plot.deepness()} uses base R graphics, while
\item \code{xgb.ggplot.deepness()} uses "ggplot2".
}
}
\details{
When \code{which = "2x1"}, two distributions with respect to the leaf depth
are plotted one above the other:
\enumerate{
\item The distribution of the number of leaves in a tree model at a certain depth.
\item The distribution of the average weighted number of observations ("cover")
ending up in leaves at a certain depth.
}

These can help in determining sensible ranges for the \code{max_depth}
and \code{min_child_weight} parameters.

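As a rough illustration of what the two "2x1" distributions summarize, the
following base R sketch tabulates a small made-up table of leaves. The
\code{leaves} data frame, its column names and values, and the use of a plain
mean for the cover are assumptions made for illustration only. They are not
output from these functions and do not claim to reproduce their internals.

\preformatted{
# Hypothetical leaf table: one row per terminal leaf (values are made up)
leaves <- data.frame(
  Depth = c(1, 2, 2, 3, 3, 3),
  Cover = c(40, 30, 20, 10, 12, 8)
)

# Distribution 1: number of leaves at each depth
n_leaves <- table(leaves$Depth)
# depth 1: 1 leaf, depth 2: 2 leaves, depth 3: 3 leaves

# Distribution 2: average cover of the leaves at each depth
mean_cover <- tapply(leaves$Cover, leaves$Depth, mean)
# depth 1: 40, depth 2: 25, depth 3: 10
}

In a toy case like this, most leaves sit at depth 3 but each covers few
observations, which is the kind of pattern the two panels are meant to expose.
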

When \code{which = "max.depth"} or \code{which = "med.depth"}, the maximum or
median leaf depth of each tree is plotted against the tree number.
Finally, \code{which = "med.weight"} shows how a tree's median absolute
leaf weight changes through the iterations.

These functions were inspired by the blog post
\url{https://github.com/aysent/random-forest-leaf-visualization}.
}
\examples{

data(agaricus.train, package = "xgboost")
## Keep the number of threads to 2 for examples
nthread <- 2
data.table::setDTthreads(nthread)

## Change max_depth to a higher number to get a more significant result
bst <- xgboost(
  data = agaricus.train$data,
  label = agaricus.train$label,
  max_depth = 6,
  nthread = nthread,
  nrounds = 50,
  objective = "binary:logistic",
  subsample = 0.5,
  min_child_weight = 2
)

xgb.plot.deepness(bst)
xgb.ggplot.deepness(bst)

xgb.plot.deepness(
  bst, which = "max.depth", pch = 16, col = rgb(0, 0, 1, 0.3), cex = 2
)

xgb.plot.deepness(
  bst, which = "med.weight", pch = 16, col = rgb(0, 0, 1, 0.3), cex = 2
)

}
\seealso{
\code{\link[=xgb.train]{xgb.train()}} and \code{\link[=xgb.model.dt.tree]{xgb.model.dt.tree()}}.
}