[R] Rename ExternalDMatrix -> ExtMemDMatrix. (#10849)

Jiaming Yuan
2024-09-29 05:45:53 +08:00
committed by GitHub
parent 9ee4008654
commit c9f89c4241
10 changed files with 46 additions and 43 deletions

View File

@@ -26,7 +26,7 @@ to pass here. Supported types are:
\itemize{
\item \code{matrix}, with types \code{numeric}, \code{integer}, and \code{logical}. Note that for types
\code{integer} and \code{logical}, missing values might not be automatically recognized
as such - see the documentation for parameter \code{missing} in \code{\link[=xgb.ExternalDMatrix]{xgb.ExternalDMatrix()}}
as such - see the documentation for parameter \code{missing} in \code{\link[=xgb.ExtMemDMatrix]{xgb.ExtMemDMatrix()}}
for details on this.
\item \code{data.frame}, with the same types as supported by 'xgb.DMatrix' and same
conversions applied to it. See the documentation for parameter \code{data} in
@@ -92,7 +92,7 @@ data and parameters passed here. It does \strong{not} inherit from \code{xgb.DMa
}
\description{
Helper function to supply data in batches of a data iterator when
constructing a DMatrix from external memory through \code{\link[=xgb.ExternalDMatrix]{xgb.ExternalDMatrix()}}
constructing a DMatrix from external memory through \code{\link[=xgb.ExtMemDMatrix]{xgb.ExtMemDMatrix()}}
or through \code{\link[=xgb.QuantileDMatrix.from_iterator]{xgb.QuantileDMatrix.from_iterator()}}.
This function is \strong{only} meant to be called inside of a callback function (which
@@ -104,8 +104,8 @@ The object that results from calling this function directly is \strong{not} like
an \code{xgb.DMatrix} - i.e. cannot be used to train a model, nor to get predictions - only
possible usage is to supply data to an iterator, from which a DMatrix is then constructed.
For more information and for example usage, see the documentation for \code{\link[=xgb.ExternalDMatrix]{xgb.ExternalDMatrix()}}.
For more information and for example usage, see the documentation for \code{\link[=xgb.ExtMemDMatrix]{xgb.ExtMemDMatrix()}}.
}
\seealso{
\code{\link[=xgb.DataIter]{xgb.DataIter()}}, \code{\link[=xgb.ExternalDMatrix]{xgb.ExternalDMatrix()}}.
\code{\link[=xgb.DataIter]{xgb.DataIter()}}, \code{\link[=xgb.ExtMemDMatrix]{xgb.ExtMemDMatrix()}}.
}

View File

@@ -33,7 +33,7 @@ Note that, after resetting the iterator, the batches will be accessed again, so
}
\value{
An \code{xgb.DataIter} object, containing the same inputs supplied here, which can then
be passed to \code{\link[=xgb.ExternalDMatrix]{xgb.ExternalDMatrix()}}.
be passed to \code{\link[=xgb.ExtMemDMatrix]{xgb.ExtMemDMatrix()}}.
}
\description{
Interface to create a custom data iterator in order to construct a DMatrix
@@ -42,11 +42,11 @@ from external memory.
This function is responsible for generating an R object structure containing callback
functions and an environment shared with them.
The output structure from this function is then meant to be passed to \code{\link[=xgb.ExternalDMatrix]{xgb.ExternalDMatrix()}},
The output structure from this function is then meant to be passed to \code{\link[=xgb.ExtMemDMatrix]{xgb.ExtMemDMatrix()}},
which will consume the data and create a DMatrix from it by executing the callback functions.
For more information, and for a usage example, see the documentation for \code{\link[=xgb.ExternalDMatrix]{xgb.ExternalDMatrix()}}.
For more information, and for a usage example, see the documentation for \code{\link[=xgb.ExtMemDMatrix]{xgb.ExtMemDMatrix()}}.
}
\seealso{
\code{\link[=xgb.ExternalDMatrix]{xgb.ExternalDMatrix()}}, \code{\link[=xgb.DataBatch]{xgb.DataBatch()}}.
\code{\link[=xgb.ExtMemDMatrix]{xgb.ExtMemDMatrix()}}, \code{\link[=xgb.DataBatch]{xgb.DataBatch()}}.
}

View File

@@ -1,10 +1,10 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/xgb.DMatrix.R
\name{xgb.ExternalDMatrix}
\alias{xgb.ExternalDMatrix}
\name{xgb.ExtMemDMatrix}
\alias{xgb.ExtMemDMatrix}
\title{DMatrix from External Data}
\usage{
xgb.ExternalDMatrix(
xgb.ExtMemDMatrix(
data_iterator,
cache_prefix = tempdir(),
missing = NA,
@@ -26,14 +26,14 @@ it will not be adapted for different input types.
For example, in R \code{integer} types, missing values are represented by integer number \code{-2147483648}
(since machine 'integer' types do not have an inherent 'NA' value) - hence, if one passes \code{NA},
which is interpreted as a floating-point NaN by \code{\link[=xgb.ExternalDMatrix]{xgb.ExternalDMatrix()}} and by
which is interpreted as a floating-point NaN by \code{\link[=xgb.ExtMemDMatrix]{xgb.ExtMemDMatrix()}} and by
\code{\link[=xgb.QuantileDMatrix.from_iterator]{xgb.QuantileDMatrix.from_iterator()}}, these integer missing values will not be treated as missing.
This should not pose any problem for \code{numeric} types, since they do have an inherent NaN value.}
\item{nthread}{Number of threads used for creating DMatrix.}
}
\value{
An 'xgb.DMatrix' object, with subclass 'xgb.ExternalDMatrix', in which the data is not
An 'xgb.DMatrix' object, with subclass 'xgb.ExtMemDMatrix', in which the data is not
held internally but accessed through the iterator when needed.
}
\description{
@@ -105,7 +105,7 @@ data_iterator <- xgb.DataIter(
cache_prefix <- tempdir()
# DMatrix will be constructed from the iterator's batches
dm <- xgb.ExternalDMatrix(data_iterator, cache_prefix, nthread = 1)
dm <- xgb.ExtMemDMatrix(data_iterator, cache_prefix, nthread = 1)
# After construction, can be used as a regular DMatrix
params <- list(nthread = 1, objective = "reg:squarederror")

View File

@@ -25,7 +25,7 @@ it will not be adapted for different input types.
For example, in R \code{integer} types, missing values are represented by integer number \code{-2147483648}
(since machine 'integer' types do not have an inherent 'NA' value) - hence, if one passes \code{NA},
which is interpreted as a floating-point NaN by \code{\link[=xgb.ExternalDMatrix]{xgb.ExternalDMatrix()}} and by
which is interpreted as a floating-point NaN by \code{\link[=xgb.ExtMemDMatrix]{xgb.ExtMemDMatrix()}} and by
\code{\link[=xgb.QuantileDMatrix.from_iterator]{xgb.QuantileDMatrix.from_iterator()}}, these integer missing values will not be treated as missing.
This should not pose any problem for \code{numeric} types, since they do have an inherent NaN value.}
@@ -48,7 +48,7 @@ An 'xgb.DMatrix' object, with subclass 'xgb.QuantileDMatrix'.
Create an \code{xgb.QuantileDMatrix} object (exact same class as would be returned by
calling function \code{\link[=xgb.QuantileDMatrix]{xgb.QuantileDMatrix()}}, with the same advantages and limitations) from
external data supplied by \code{\link[=xgb.DataIter]{xgb.DataIter()}}, potentially passed in batches from
a bigger set that might not fit entirely in memory, same way as \code{\link[=xgb.ExternalDMatrix]{xgb.ExternalDMatrix()}}.
a bigger set that might not fit entirely in memory, same way as \code{\link[=xgb.ExtMemDMatrix]{xgb.ExtMemDMatrix()}}.
Note that, while external data will only be loaded through the iterator (thus the full data
might not be held entirely in-memory), the quantized representation of the data will get
@@ -60,6 +60,6 @@ For more information, see the guide 'Using XGBoost External Memory Version':
\url{https://xgboost.readthedocs.io/en/stable/tutorials/external_memory.html}
}
\seealso{
\code{\link[=xgb.DataIter]{xgb.DataIter()}}, \code{\link[=xgb.DataBatch]{xgb.DataBatch()}}, \code{\link[=xgb.ExternalDMatrix]{xgb.ExternalDMatrix()}},
\code{\link[=xgb.DataIter]{xgb.DataIter()}}, \code{\link[=xgb.DataBatch]{xgb.DataBatch()}}, \code{\link[=xgb.ExtMemDMatrix]{xgb.ExtMemDMatrix()}},
\code{\link[=xgb.QuantileDMatrix]{xgb.QuantileDMatrix()}}
}

View File

@@ -53,7 +53,7 @@ system - thus, for reproducible results, one needs to call the \code{\link[=set.
for model training by the objective.
Note that only the basic \code{xgb.DMatrix} class is supported - variants such as \code{xgb.QuantileDMatrix}
or \code{xgb.ExternalDMatrix} are not supported here.}
or \code{xgb.ExtMemDMatrix} are not supported here.}
\item{nrounds}{The max number of iterations.}