% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/xgb.DMatrix.R
\name{xgb.QuantileDMatrix.from_iterator}
\alias{xgb.QuantileDMatrix.from_iterator}
\title{QuantileDMatrix from External Data}
\usage{
xgb.QuantileDMatrix.from_iterator(
  data_iterator,
  missing = NA,
  nthread = NULL,
  ref = NULL,
  max_bin = NULL
)
}
\arguments{
\item{data_iterator}{A data iterator structure as returned by \link{xgb.DataIter},
which includes an environment shared between function calls, and functions to access
the data in batches on demand.}

\item{missing}{A float value that represents missing values in the data.

Note that, while functions like \link{xgb.DMatrix} can take a generic \code{NA} and interpret it
correctly for different types like \code{numeric} and \code{integer}, an \code{NA} value passed
here will not be adapted for different input types.

For example, in R \code{integer} types, missing values are represented by the integer number
\code{-2147483648} (since machine 'integer' types do not have an inherent 'NA' value). Hence, if one
passes \code{NA}, which is interpreted as a floating-point NaN by 'xgb.ExternalDMatrix' and by
'xgb.QuantileDMatrix.from_iterator', these integer missing values will not be treated as missing.
This should not pose any problem for \code{numeric} types, since they do have an inherent NaN value.}

\item{nthread}{Number of threads used for creating the DMatrix.}

\item{ref}{The training dataset that provides quantile information, needed when creating
a validation/test dataset with \code{xgb.QuantileDMatrix}. Supplying the training DMatrix
as a reference means that the same quantisation applied to the training data is
applied to the validation/test data.}

\item{max_bin}{The number of histogram bins; should be consistent with the training parameter
\code{max_bin}.

This is only supported when constructing a QuantileDMatrix.}
}
\value{
An 'xgb.DMatrix' object, with subclass 'xgb.QuantileDMatrix'.
}
\description{
Create an \code{xgb.QuantileDMatrix} object (exact same class as would be returned by
calling function \link{xgb.QuantileDMatrix}, with the same advantages and limitations) from
external data supplied by an \link{xgb.DataIter} object, potentially passed in batches from
a bigger set that might not fit entirely in memory, in the same way as \link{xgb.ExternalDMatrix}.

Note that, while external data will only be loaded through the iterator (thus the full data
might not be held entirely in memory), the quantized representation of the data will be
created in memory, being concatenated from multiple calls to the data iterator. The quantized
version is typically lighter than the original data, so this representation might fit in
memory even when the full data does not.

For more information, see the guide 'Using XGBoost External Memory Version':
\url{https://xgboost.readthedocs.io/en/stable/tutorials/external_memory.html}
}
\seealso{
\link{xgb.DataIter}, \link{xgb.DataBatch}, \link{xgb.ExternalDMatrix},
\link{xgb.QuantileDMatrix}
}
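\examples{
# A minimal sketch of the batch-wise construction described above, not a
# definitive recipe. The 'mtcars' data already fits in memory and is split
# into two batches only to illustrate how the iterator is consumed.
# The names 'iterator_env', 'iterator_next' and 'iterator_reset' are
# illustrative assumptions, as is the choice of 'max_bin = 16'.
data(mtcars)

# Environment shared between calls; the user keeps track of the batch
# counter in it.
iterator_env <- as.environment(
  list(
    iter = 0,
    x = mtcars[, -1],
    y = mtcars[, 1]
  )
)

# Returns the next batch, or NULL once both batches have been served.
iterator_next <- function(iterator_env) {
  curr_iter <- iterator_env[["iter"]]
  if (curr_iter >= 2) {
    return(NULL)
  }
  rows <- if (curr_iter == 0) 1:16 else 17:32
  iterator_env[["iter"]] <- curr_iter + 1
  xgb.DataBatch(
    data = iterator_env[["x"]][rows, ],
    label = iterator_env[["y"]][rows]
  )
}

# Moves the iterator back to the first batch.
iterator_reset <- function(iterator_env) {
  iterator_env[["iter"]] <- 0
}

data_iterator <- xgb.DataIter(
  env = iterator_env,
  f_next = iterator_next,
  f_reset = iterator_reset
)

# Only the quantized representation is assembled in memory. If a model is
# trained on this object, the training parameter 'max_bin' should match.
dm <- xgb.QuantileDMatrix.from_iterator(
  data_iterator,
  max_bin = 16,
  nthread = 1
)
dim(dm)
}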