% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/xgb.DMatrix.R
\name{xgb.DataBatch}
\alias{xgb.DataBatch}
\title{Structure for Data Batches}
\usage{
xgb.DataBatch(
  data,
  label = NULL,
  weight = NULL,
  base_margin = NULL,
  feature_names = colnames(data),
  feature_types = NULL,
  group = NULL,
  qid = NULL,
  label_lower_bound = NULL,
  label_upper_bound = NULL,
  feature_weights = NULL
)
}
\arguments{
\item{data}{Data belonging to this batch.

Note that not all of the input types supported by \link{xgb.DMatrix} are possible
to pass here. Supported types are:\itemize{
\item \code{matrix}, with types \code{numeric}, \code{integer}, and \code{logical}. Note that for types
\code{integer} and \code{logical}, missing values might not be automatically recognized as
such - see the documentation for parameter \code{missing} in \link{xgb.ExternalDMatrix} for
details on this.
\item \code{data.frame}, with the same types as supported by \link{xgb.DMatrix} and the same
conversions applied to it. See the documentation for parameter \code{data} in
\link{xgb.DMatrix} for details on it.
\item CSR matrices, as class \code{dgRMatrix} from package \code{Matrix}.
}}

\item{label}{Label of the training data. For classification problems, should be passed encoded as
integers with numeration starting at zero.}

\item{weight}{Weight for each instance.

Note that, for ranking tasks, weights are per-group: one weight is assigned to each
group (not to each data point). This is because only the relative ordering of data
points within each group matters, so it doesn't make sense to assign weights to
individual data points.}

\item{base_margin}{Base margin used for boosting from an existing model.

\if{html}{\out{
}}\preformatted{ In the case of multi-output models, one can also pass multi-dimensional base_margin. }\if{html}{\out{
}}} \item{feature_names}{Set names for features. Overrides column names in data frame and matrix. \if{html}{\out{
}}\preformatted{ Note: columns are not referenced by name when calling `predict`, so the column order there must be the same as in the DMatrix construction, regardless of the column names. }\if{html}{\out{
}}}

\item{feature_types}{Set types for features.

If \code{data} is a \code{data.frame} and \code{feature_types} is not supplied, feature types will be
deduced automatically from the column types.

Otherwise, one can pass a character vector with the same length as the number of columns in \code{data},
with the following possible values:\itemize{
\item "c", which represents categorical columns.
\item "q", which represents numeric columns.
\item "int", which represents integer columns.
\item "i", which represents logical (boolean) columns.
}

Note that, while categorical types are treated differently from the rest for model fitting
purposes, the other types do not influence the generated model, but do affect other
functionalities such as feature importances.

\bold{Important}: categorical features, if specified manually through \code{feature_types}, must
be encoded as integers with numeration starting at zero, and the same encoding needs to be
applied when passing data to \code{predict}. Even if passing \code{factor} types, the encoding will
not be saved, so make sure that \code{factor} columns passed to \code{predict} have the same \code{levels}.}

\item{group}{Group sizes for all ranking groups.}

\item{qid}{Query ID for data samples, used for ranking.}

\item{label_lower_bound}{Lower bound for survival training.}

\item{label_upper_bound}{Upper bound for survival training.}

\item{feature_weights}{Set feature weights for column sampling.}
}
\value{
An object of class \code{xgb.DataBatch}, which is just a list containing the
data and parameters passed here. It does \bold{not} inherit from \code{xgb.DMatrix}.
}
\description{
Helper function to supply data in batches of a data iterator when
constructing a DMatrix from external memory through \link{xgb.ExternalDMatrix}
or through \link{xgb.QuantileDMatrix.from_iterator}.

This function is \bold{only} meant to be called inside of a callback function (which
is passed as argument to function \link{xgb.DataIter} to construct a data iterator)
when constructing a DMatrix through external memory - otherwise, one should call
\link{xgb.DMatrix} or \link{xgb.QuantileDMatrix}.

The object that results from calling this function directly is \bold{not} like
an \code{xgb.DMatrix} - i.e. it cannot be used to train a model, nor to get predictions. Its
only possible use is to supply data to an iterator, from which a DMatrix is then constructed.

For more information and for example usage, see the documentation for \link{xgb.ExternalDMatrix}.
}
\seealso{
\link{xgb.DataIter}, \link{xgb.ExternalDMatrix}.
}
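\examples{
# Illustrative sketch only, not an authoritative recipe: it shows how
# xgb.DataBatch() is meant to be returned from the 'f_next' callback of
# xgb.DataIter(). The three-way split of 'mtcars', the helper names
# ('iterator_env', 'iterator_next', 'iterator_reset') and the use of a
# temporary directory as cache prefix are assumptions made for illustration.
library(xgboost)

data(mtcars)
y <- mtcars$mpg
x <- as.matrix(mtcars[, -1])

# Split the rows into three batches, as if each came from a separate file.
batch_id <- rep_len(1:3, nrow(x))
x_batches <- lapply(1:3, function(i) x[batch_id == i, , drop = FALSE])
y_batches <- lapply(1:3, function(i) y[batch_id == i])

# The environment keeps the iterator state between calls.
iterator_env <- as.environment(
  list(iter = 0, x = x_batches, y = y_batches)
)

# Returns the next batch, or NULL once all batches have been supplied.
iterator_next <- function(iterator_env) {
  curr_iter <- iterator_env[["iter"]]
  if (curr_iter >= length(iterator_env[["x"]])) {
    return(NULL)
  }
  iterator_env[["iter"]] <- curr_iter + 1
  xgb.DataBatch(
    data = iterator_env[["x"]][[curr_iter + 1]],
    label = iterator_env[["y"]][[curr_iter + 1]]
  )
}

# Rewinds the iterator so that the data can be traversed again.
iterator_reset <- function(iterator_env) {
  iterator_env[["iter"]] <- 0
}

data_iterator <- xgb.DataIter(
  env = iterator_env,
  f_next = iterator_next,
  f_reset = iterator_reset
)

# The DMatrix is built from the iterator, not from the batches directly.
cache_prefix <- tempdir()
dm <- xgb.ExternalDMatrix(data_iterator, cache_prefix, nthread = 1)
}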