[R] Document handling of indexes (#10019)

---------

Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
david-cortes
2024-02-01 22:39:09 +01:00
committed by GitHub
parent 4dfbe2a893
commit 662854c7d7
5 changed files with 61 additions and 5 deletions

@@ -33,7 +33,8 @@
#' \item Binary files generated by \link{xgb.DMatrix.save}, passed as a path to the file. These are
#' \bold{not} supported for xgb.QuantileDMatrix'.
#' }
-#' @param label Label of the training data.
+#' @param label Label of the training data. For classification problems, should be passed encoded as
+#' integers with numeration starting at zero.
#' @param weight Weight for each instance.
#'
#' Note that, for ranking task, weights are per-group. In ranking task, one weight
@@ -69,6 +70,11 @@
#' Note that, while categorical types are treated differently from the rest for model fitting
#' purposes, the other types do not influence the generated model, but have effects in other
#' functionalities such as feature importances.
+#'
+#' \bold{Important}: categorical features, if specified manually through `feature_types`, must
+#' be encoded as integers with numeration starting at zero, and the same encoding needs to be
+#' applied when passing data to `predict`. Even if passing `factor` types, the encoding will
+#' not be saved, so make sure that `factor` columns passed to `predict` have the same `levels`.
#' @param nthread Number of threads used for creating DMatrix.
#' @param group Group size for all ranking group.
#' @param qid Query ID for data samples, used for ranking.
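
Note on the added paragraph about categorical features: the following is a minimal base-R sketch of what "the same encoding" can look like in practice. It makes no xgboost calls, and the column name `color` and the data are hypothetical. It assumes the training levels are stored by the user and reapplied before prediction, since the documentation above says the encoding is not saved.

```r
# Hypothetical training data with one categorical column.
train <- data.frame(color = factor(c("red", "green", "blue", "green")))

# Record the levels seen at training time; these must be kept by the user.
train_levels <- levels(train$color)

# Hypothetical new data arriving as plain character values.
new_values <- c("green", "red")

# Naive conversion: levels are inferred from the new data alone,
# so the integer codes no longer match the training encoding.
naive_codes <- as.integer(factor(new_values)) - 1L

# Consistent conversion: reuse the training levels explicitly.
new_factor <- factor(new_values, levels = train_levels)
new_codes <- as.integer(new_factor) - 1L

# Zero-based codes for the training column, for comparison.
train_codes <- as.integer(train$color) - 1L
```

Here "green" maps to code 1 in both `train_codes` and `new_codes`, whereas the naive conversion assigns it code 0 because "blue" is absent from the new data.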

@@ -66,7 +66,8 @@ supported for xgb.QuantileDMatrix'.
\bold{not} supported for xgb.QuantileDMatrix'.
}}
-\item{label}{Label of the training data.}
+\item{label}{Label of the training data. For classification problems, should be passed encoded as
+integers with numeration starting at zero.}
\item{weight}{Weight for each instance.
@@ -109,7 +110,12 @@ with the following possible values:\itemize{
Note that, while categorical types are treated differently from the rest for model fitting
purposes, the other types do not influence the generated model, but have effects in other
-functionalities such as feature importances.}
+functionalities such as feature importances.
+\bold{Important}: categorical features, if specified manually through \code{feature_types}, must
+be encoded as integers with numeration starting at zero, and the same encoding needs to be
+applied when passing data to \code{predict}. Even if passing \code{factor} types, the encoding will
+not be saved, so make sure that \code{factor} columns passed to \code{predict} have the same \code{levels}.}
\item{nthread}{Number of threads used for creating DMatrix.}

@@ -33,7 +33,8 @@ conversions applied to it. See the documentation for parameter \code{data} in
\item CSR matrices, as class \code{dgRMatrix} from package \code{Matrix}.
}}
-\item{label}{Label of the training data.}
+\item{label}{Label of the training data. For classification problems, should be passed encoded as
+integers with numeration starting at zero.}
\item{weight}{Weight for each instance.
@@ -69,7 +70,12 @@ with the following possible values:\itemize{
Note that, while categorical types are treated differently from the rest for model fitting
purposes, the other types do not influence the generated model, but have effects in other
-functionalities such as feature importances.}
+functionalities such as feature importances.
+\bold{Important}: categorical features, if specified manually through \code{feature_types}, must
+be encoded as integers with numeration starting at zero, and the same encoding needs to be
+applied when passing data to \code{predict}. Even if passing \code{factor} types, the encoding will
+not be saved, so make sure that \code{factor} columns passed to \code{predict} have the same \code{levels}.}
\item{group}{Group size for all ranking group.}
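
Note on the `label` wording added in this commit: the following base-R sketch shows one way to turn class names into integers starting at zero, and to map them back afterwards. The class names are hypothetical and no xgboost function is called. The resulting `label` vector is the kind of value the documentation above asks for in classification problems.

```r
# Hypothetical class labels given as character strings.
y <- c("setosa", "virginica", "versicolor", "setosa")

# Convert to a factor; levels are sorted alphabetically by default.
y_factor <- factor(y)
class_names <- levels(y_factor)

# R factors are one-based, so subtract one to start numbering at zero.
label <- as.integer(y_factor) - 1L

# Mapping zero-based codes back to class names needs the +1 offset.
recovered <- class_names[label + 1L]
```

Keeping `class_names` alongside the model is a simple way to translate predicted class indices back to readable names.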