[Doc] fix typos in documentation (#9458)
@@ -55,7 +55,7 @@ To ensure that CMake can locate the XGBoost library, supply ``-DCMAKE_PREFIX_PAT

 .. code-block:: bash

-# Nagivate to the build directory for your application
+# Navigate to the build directory for your application
 cd build
 # Activate the Conda environment where we previously installed XGBoost
 conda activate [env_name]
@@ -65,7 +65,7 @@ To ensure that CMake can locate the XGBoost library, supply ``-DCMAKE_PREFIX_PAT
 make

 ************************
-Usefull Tips To Remember
+Useful Tips To Remember
 ************************

 Below are some useful tips while using C API:
@@ -151,7 +151,7 @@ c. Assertion technique: It works both in C/ C++. If expression evaluates to 0 (f
 Example if we our training data is in ``dense matrix`` format then your prediction dataset should also be a ``dense matrix`` or if training in ``libsvm`` format then dataset for prediction should also be in ``libsvm`` format.


-4. Always use strings for setting values to the parameters in booster handle object. The paramter value can be of any data type (e.g. int, char, float, double, etc), but they should always be encoded as strings.
+4. Always use strings for setting values to the parameters in booster handle object. The parameter value can be of any data type (e.g. int, char, float, double, etc), but they should always be encoded as strings.

 .. code-block:: c

@@ -168,7 +168,7 @@ Sample examples along with Code snippet to use C API functions
 .. code-block:: c

 DMatrixHandle data; // handle to DMatrix
-// Load the dat from file & store it in data variable of DMatrixHandle datatype
+// Load the data from file & store it in data variable of DMatrixHandle datatype
 safe_xgboost(XGDMatrixCreateFromFile("/path/to/file/filename", silent, &data));


@@ -278,7 +278,7 @@ Sample examples along with Code snippet to use C API functions
 uint64_t const* out_shape;
 /* Dimension of output prediction */
 uint64_t out_dim;
-/* Pointer to a thread local contigious array, assigned in prediction function. */
+/* Pointer to a thread local contiguous array, assigned in prediction function. */
 float const* out_result = NULL;
 safe_xgboost(
 XGBoosterPredictFromDMatrix(booster, dmatrix, config, &out_shape, &out_dim, &out_result));
@@ -38,7 +38,7 @@ Although XGBoost has native support for said functions, using it for demonstrati
 provides us the opportunity of comparing the result from our own implementation and the
 one from XGBoost internal for learning purposes. After finishing this tutorial, we should
 be able to provide our own functions for rapid experiments. And at the end, we will
-provide some notes on non-identy link function along with examples of using custom metric
+provide some notes on non-identity link function along with examples of using custom metric
 and objective with the `scikit-learn` interface.

 If we compute the gradient of said objective function:
@@ -165,7 +165,7 @@ Reverse Link Function
 When using builtin objective, the raw prediction is transformed according to the objective
 function. When a custom objective is provided XGBoost doesn't know its link function so the
 user is responsible for making the transformation for both objective and custom evaluation
-metric. For objective with identiy link like ``squared error`` this is trivial, but for
+metric. For objective with identity link like ``squared error`` this is trivial, but for
 other link functions like log link or inverse link the difference is significant.

 For the Python package, the behaviour of prediction can be controlled by the
@@ -173,7 +173,7 @@ For the Python package, the behaviour of prediction can be controlled by the
 parameter without a custom objective, the metric function will receive transformed
 prediction since the objective is defined by XGBoost. However, when the custom objective is
 also provided along with that metric, then both the objective and custom metric will
-recieve raw prediction. The following example provides a comparison between two different
+receive raw prediction. The following example provides a comparison between two different
 behavior with a multi-class classification model. Firstly we define 2 different Python
 metric functions implementing the same underlying metric for comparison,
 `merror_with_transform` is used when custom objective is also used, otherwise the simpler
@@ -256,7 +256,7 @@ In the example below, a ``KubeCluster`` is used for `deploying Dask on Kubernete
 m = 1000
 n = 10
 kWorkers = 2 # assuming you have 2 GPU nodes on that cluster.
-# You need to work out the worker-spec youself. See document in dask_kubernetes for
+# You need to work out the worker-spec yourself. See document in dask_kubernetes for
 # its usage. Here we just want to show that XGBoost works on various clusters.
 cluster = KubeCluster.from_yaml('worker-spec.yaml', deploy_mode='remote')
 cluster.scale(kWorkers) # scale to use all GPUs
@@ -648,7 +648,7 @@ environment than training the model using a single node due to aforementioned cr
 Memory Usage
 ************

-Here are some pratices on reducing memory usage with dask and xgboost.
+Here are some practices on reducing memory usage with dask and xgboost.

 - In a distributed work flow, data is best loaded by dask collections directly instead of
 loaded by client process. When loading with client process is unavoidable, use
@@ -7,7 +7,7 @@ dataset needs to be loaded into memory. This can be costly and sometimes
 infeasible. Staring from 1.5, users can define a custom iterator to load data in chunks
 for running XGBoost algorithms. External memory can be used for both training and
 prediction, but training is the primary use case and it will be our focus in this
-tutorial. For prediction and evaluation, users can iterate through the data themseleves
+tutorial. For prediction and evaluation, users can iterate through the data themselves
 while training requires the full dataset to be loaded into the memory.

 During training, there are two different modes for external memory support available in
@@ -142,7 +142,7 @@ see `this paper <https://arxiv.org/abs/2005.09148>`_.
 .. warning::

 When GPU is running out of memory during iteration on external memory, user might
-recieve a segfault instead of an OOM exception.
+receive a segfault instead of an OOM exception.

 .. _ext_remarks:

@@ -150,7 +150,7 @@ see `this paper <https://arxiv.org/abs/2005.09148>`_.
 Remarks
 *******

-When using external memory with XBGoost, data is divided into smaller chunks so that only
+When using external memory with XGBoost, data is divided into smaller chunks so that only
 a fraction of it needs to be stored in memory at any given time. It's important to note
 that this method only applies to the predictor data (``X``), while other data, like labels
 and internal runtime structures are concatenated. This means that memory reduction is most
@@ -211,7 +211,7 @@ construction of `QuantileDmatrix` with data chunks. On the other hand, if it's p
 doesn't fetch data during training. On the other hand, the external memory `DMatrix`
 fetches data batches from external memory on-demand. Use the `QuantileDMatrix` (with
 iterator if necessary) when you can fit most of your data in memory. The training would be
-an order of magnitute faster than using external memory.
+an order of magnitude faster than using external memory.

 ****************
 Text File Inputs
@@ -233,7 +233,7 @@ This has lead to some interesting implications of feature interaction constraint
 ``[[0, 1], [0, 1, 2], [1, 2]]`` as another example. Assuming we have only 3 available
 features in our training datasets for presentation purpose, careful readers might have
 found out that the above constraint is the same as simply ``[[0, 1, 2]]``. Since no matter which
-feature is chosen for split in the root node, all its descendants are allowd to include every
+feature is chosen for split in the root node, all its descendants are allowed to include every
 feature as legitimate split candidates without violating interaction constraints.

 For one last example, we use ``[[0, 1], [1, 3, 4]]`` and choose feature ``0`` as split for
@@ -11,12 +11,12 @@ Learning to Rank
 ********
 Overview
 ********
-Often in the context of information retrieval, learning-to-rank aims to train a model that arranges a set of query results into an ordered list `[1] <#references>`__. For surprivised learning-to-rank, the predictors are sample documents encoded as feature matrix, and the labels are relevance degree for each sample. Relevance degree can be multi-level (graded) or binary (relevant or not). The training samples are often grouped by their query index with each query group containing multiple query results.
+Often in the context of information retrieval, learning-to-rank aims to train a model that arranges a set of query results into an ordered list `[1] <#references>`__. For supervised learning-to-rank, the predictors are sample documents encoded as feature matrix, and the labels are relevance degree for each sample. Relevance degree can be multi-level (graded) or binary (relevant or not). The training samples are often grouped by their query index with each query group containing multiple query results.

 XGBoost implements learning to rank through a set of objective functions and performance metrics. The default objective is ``rank:ndcg`` based on the ``LambdaMART`` `[2] <#references>`__ algorithm, which in turn is an adaptation of the ``LambdaRank`` `[3] <#references>`__ framework to gradient boosting trees. For a history and a summary of the algorithm, see `[5] <#references>`__. The implementation in XGBoost features deterministic GPU computation, distributed training, position debiasing and two different pair construction strategies.

 ************************************
-Training with the Pariwise Objective
+Training with the Pairwise Objective
 ************************************
 ``LambdaMART`` is a pairwise ranking model, meaning that it compares the relevance degree for every pair of samples in a query group and calculate a proxy gradient for each pair. The default objective ``rank:ndcg`` is using the surrogate gradient derived from the ``ndcg`` metric. To train a XGBoost model, we need an additional sorted array called ``qid`` for specifying the query group of input samples. An example input would look like this:

@@ -59,7 +59,7 @@ Notice that the samples are sorted based on their query index in a non-decreasin
 X = X[sorted_idx, :]
 y = y[sorted_idx]

-The simpliest way to train a ranking model is by using the scikit-learn estimator interface. Continuing the previous snippet, we can train a simple ranking model without tuning:
+The simplest way to train a ranking model is by using the scikit-learn estimator interface. Continuing the previous snippet, we can train a simple ranking model without tuning:

 .. code-block:: python

@@ -138,7 +138,7 @@ This will train on four GPUs in parallel.

 Note that it usually does not make sense to allocate more than one GPU per actor,
 as XGBoost relies on distributed libraries such as Dask or Ray to utilize multi
-GPU taining.
+GPU training.

 Setting the number of CPUs per actor
 ====================================