[R] Move all DMatrix fields to function arguments (#9862)

david-cortes
2023-12-09 19:45:28 +01:00
committed by GitHub
parent 1094d6015d
commit 562352101d
10 changed files with 236 additions and 68 deletions

@@ -23,14 +23,20 @@ Get information of an xgb.DMatrix object
The \code{name} field can be one of the following:
\itemize{
\item \code{label}: label XGBoost learn from ;
\item \code{weight}: to do a weight rescale ;
\item \code{base_margin}: base margin is the base prediction XGBoost will boost from ;
\item \code{nrow}: number of rows of the \code{xgb.DMatrix}.
\item \code{label}
\item \code{weight}
\item \code{base_margin}
\item \code{label_lower_bound}
\item \code{label_upper_bound}
\item \code{group}
\item \code{feature_type}
\item \code{feature_name}
\item \code{nrow}
}
See the documentation for \link{xgb.DMatrix} for more information about these fields.
\code{group} can be setup by \code{setinfo} but can't be retrieved by \code{getinfo}.
Note that, while 'qid' cannot be retrieved, it's possible to get the equivalent 'group'
for a DMatrix that had 'qid' assigned.
}
\examples{
data(agaricus.train, package='xgboost')

@@ -22,13 +22,15 @@ setinfo(object, ...)
Set information of an xgb.DMatrix object
}
\details{
The \code{name} field can be one of the following:
See the documentation for \link{xgb.DMatrix} for possible fields that can be set
(which correspond to arguments in that function).
\itemize{
\item \code{label}: label XGBoost learn from ;
\item \code{weight}: to do a weight rescale ;
\item \code{base_margin}: base margin is the base prediction XGBoost will boost from ;
\item \code{group}: number of rows in each group (to use with \code{rank:pairwise} objective).
Note that the following fields are allowed in the construction of an \code{xgb.DMatrix}
but \bold{aren't} allowed here:\itemize{
\item data
\item missing
\item silent
\item nthread
}
}
\examples{

@@ -6,11 +6,18 @@
\usage{
xgb.DMatrix(
data,
info = list(),
label = NULL,
weight = NULL,
base_margin = NULL,
missing = NA,
silent = FALSE,
feature_names = colnames(data),
nthread = NULL,
...
group = NULL,
qid = NULL,
label_lower_bound = NULL,
label_upper_bound = NULL,
feature_weights = NULL
)
}
\arguments{
@@ -19,17 +26,35 @@ a \code{dgRMatrix} object,
a \code{dsparseVector} object (only when making predictions from a fitted model, will be
interpreted as a row vector), or a character string representing a filename.}
\item{info}{a named list of additional information to store in the \code{xgb.DMatrix} object.
See \code{\link{setinfo}} for the specific allowed kinds of}
\item{label}{Label of the training data.}
\item{weight}{Weight for each instance.
Note that, for ranking task, weights are per-group. In ranking task, one weight
is assigned to each group (not each data point). This is because we
only care about the relative ordering of data points within each group,
so it doesn't make sense to assign weights to individual data points.}
\item{base_margin}{Base margin used for boosting from existing model.}
\item{missing}{a float value to represents missing values in data (used only when input is a dense matrix).
It is useful when a 0 or some other extreme value represents missing values in data.}
\item{silent}{whether to suppress printing an informational message after loading from a file.}
\item{feature_names}{Set names for features.}
\item{nthread}{Number of threads used for creating DMatrix.}
\item{...}{the \code{info} data could be passed directly as parameters, without creating an \code{info} list.}
\item{group}{Group size for all ranking group.}
\item{qid}{Query ID for data samples, used for ranking.}
\item{label_lower_bound}{Lower bound for survival training.}
\item{label_upper_bound}{Upper bound for survival training.}
\item{feature_weights}{Set feature weights for column sampling.}
}
\description{
Construct xgb.DMatrix object from either a dense matrix, a sparse matrix, or a local file.