[Doc] fix typos in documentation (#9458)
parent 4359356d46
commit 9dbb71490c

Changed files: .gitignore (vendored), 1 addition
@@ -48,6 +48,7 @@ Debug
 *.Rproj
 ./xgboost.mpi
 ./xgboost.mock
 *.bak
 #.Rbuildignore
 R-package.Rproj
+*.cache*

@@ -119,7 +119,7 @@ An up-to-date version of the CUDA toolkit is required.

 .. note:: Checking your compiler version

-  CUDA is really picky about supported compilers, a table for the compatible compilers for the latests CUDA version on Linux can be seen `here <https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html>`_.
+  CUDA is really picky about supported compilers, a table for the compatible compilers for the latest CUDA version on Linux can be seen `here <https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html>`_.

 Some distros package a compatible ``gcc`` version with CUDA. If you run into compiler errors with ``nvcc``, try specifying the correct compiler with ``-DCMAKE_CXX_COMPILER=/path/to/correct/g++ -DCMAKE_C_COMPILER=/path/to/correct/gcc``. On Arch Linux, for example, both binaries can be found under ``/opt/cuda/bin/``.

@@ -32,7 +32,7 @@ GitHub Actions is also used to build Python wheels targeting MacOS Intel and App
 ``python_wheels`` pipeline sets up environment variables prefixed ``CIBW_*`` to indicate the target
 OS and processor. The pipeline then invokes the script ``build_python_wheels.sh``, which in turns
 calls ``cibuildwheel`` to build the wheel. The ``cibuildwheel`` is a library that sets up a
-suitable Python environment for each OS and processor target. Since we don't have Apple Silion
+suitable Python environment for each OS and processor target. Since we don't have Apple Silicon
 machine in GitHub Actions, cross-compilation is needed; ``cibuildwheel`` takes care of the complex
 task of cross-compiling a Python wheel. (Note that ``cibuildwheel`` will call
 ``pip wheel``. Since XGBoost has a native library component, we created a customized build

@@ -131,7 +131,7 @@ set up a credential pair in order to provision resources on AWS. See
 Worker Image Pipeline
 =====================
 Building images for worker machines used to be a chore: you'd provision an EC2 machine, SSH into it, and
-manually install the necessary packages. This process is not only laborous but also error-prone. You may
+manually install the necessary packages. This process is not only laborious but also error-prone. You may
 forget to install a package or change a system configuration.

 No more. Now we have an automated pipeline for building images for worker machines.

@@ -100,7 +100,7 @@ two automatic checks to enforce coding style conventions. To expedite the code r

 Linter
 ======
-We use `pylint <https://github.com/PyCQA/pylint>`_ and `cpplint <https://github.com/cpplint/cpplint>`_ to enforce style convention and find potential errors. Linting is especially useful for Python, as we can catch many errors that would have otherwise occured at run-time.
+We use `pylint <https://github.com/PyCQA/pylint>`_ and `cpplint <https://github.com/cpplint/cpplint>`_ to enforce style convention and find potential errors. Linting is especially useful for Python, as we can catch many errors that would have otherwise occurred at run-time.

 To run this check locally, run the following command from the top level source tree:

@@ -29,7 +29,7 @@ The Project Management Committee (PMC) of the XGBoost project appointed `Open So

 All expenses incurred for hosting CI will be submitted to the fiscal host with receipts. Only the expenses in the following categories will be approved for reimbursement:

-* Cloud exprenses for the cloud test farm (https://buildkite.com/xgboost)
+* Cloud expenses for the cloud test farm (https://buildkite.com/xgboost)
 * Cost of domain https://xgboost-ci.net
 * Monthly cost of using BuildKite
 * Hosting cost of the User Forum (https://discuss.xgboost.ai)

@@ -169,7 +169,7 @@ supply a specified SANITIZER_PATH.

 How to use sanitizers with CUDA support
 =======================================
-Runing XGBoost on CUDA with address sanitizer (asan) will raise memory error.
+Running XGBoost on CUDA with address sanitizer (asan) will raise memory error.
 To use asan with CUDA correctly, you need to configure asan via ASAN_OPTIONS
 environment variable:

@@ -63,7 +63,7 @@ XGBoost supports missing values by default.
 In tree algorithms, branch directions for missing values are learned during training.
 Note that the gblinear booster treats missing values as zeros.

-When the ``missing`` parameter is specifed, values in the input predictor that is equal to
+When the ``missing`` parameter is specified, values in the input predictor that is equal to
 ``missing`` will be treated as missing and removed. By default it's set to ``NaN``.

 **************************************

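For context, a minimal sketch of what the corrected passage describes, using the Python package; the dataset and the ``-1.0`` marker below are illustrative, not taken from the patched page:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    # Use -1.0 as the missing-value marker in this made-up dataset.
    X = np.array([[1.0, -1.0], [2.0, 3.0], [-1.0, 4.0]])
    y = np.array([0.0, 1.0, 1.0])

    # Entries equal to `missing` are treated as missing and removed.
    dtrain = xgb.DMatrix(X, label=y, missing=-1.0)
    booster = xgb.train({"objective": "reg:squarederror"}, dtrain, num_boost_round=2)
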
@@ -129,7 +129,7 @@ With parameters and data, you are able to train a booster model.

   booster.saveModel("model.bin");

-* Generaing model dump with feature map
+* Generating model dump with feature map

 .. code-block:: java

@@ -54,7 +54,7 @@ After 1.4 release, we added a new parameter called ``strict_shape``, one can set
 Output is a 4-dim array with ``(n_samples, n_iterations, n_classes, n_trees_in_forest)``
 as shape. ``n_trees_in_forest`` is specified by the ``numb_parallel_tree`` during
 training. When strict shape is set to False, output is a 2-dim array with last 3 dims
-concatenated into 1. Also the last dimension is dropped if it eqauls to 1. When using
+concatenated into 1. Also the last dimension is dropped if it equals to 1. When using
 ``apply`` method in scikit learn interface, this is set to False by default.

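To make the shape description concrete, here is a minimal sketch against the Python ``Booster.predict`` API; the data is synthetic, and the printed shape assumes the parameters shown:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    X, y = np.random.rand(100, 10), np.random.randint(0, 3, size=100)
    dtrain = xgb.DMatrix(X, label=y)
    params = {"objective": "multi:softprob", "num_class": 3, "num_parallel_tree": 4}
    booster = xgb.train(params, dtrain, num_boost_round=2)

    # (n_samples, n_iterations, n_classes, n_trees_in_forest)
    leaf = booster.predict(dtrain, pred_leaf=True, strict_shape=True)
    print(leaf.shape)  # (100, 2, 3, 4)
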
@@ -68,7 +68,7 @@ n_classes, n_trees_in_forest)``, while R with ``strict_shape=TRUE`` outputs
 Other than these prediction types, there's also a parameter called ``iteration_range``,
 which is similar to model slicing. But instead of actually splitting up the model into
 multiple stacks, it simply returns the prediction formed by the trees within range.
-Number of trees created in each iteration eqauls to :math:`trees_i = num\_class \times
+Number of trees created in each iteration equals to :math:`trees_i = num\_class \times
 num\_parallel\_tree`. So if you are training a boosted random forest with size of 4, on
 the 3-class classification dataset, and want to use the first 2 iterations of trees for
 prediction, you need to provide ``iteration_range=(0, 2)``. Then the first :math:`2

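A short sketch of the ``iteration_range`` behaviour described above, on synthetic data; with ``num_class=3`` and ``num_parallel_tree=4``, each iteration adds 12 trees:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    X, y = np.random.rand(100, 10), np.random.randint(0, 3, size=100)
    dtrain = xgb.DMatrix(X, label=y)
    params = {"objective": "multi:softprob", "num_class": 3, "num_parallel_tree": 4}
    booster = xgb.train(params, dtrain, num_boost_round=5)

    # Predict using only the trees from the first 2 iterations.
    preds = booster.predict(dtrain, iteration_range=(0, 2))
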
@@ -20,7 +20,7 @@ sklearn estimator interface is still working in progress.

 You can find some some quick start examples at
 :ref:`sphx_glr_python_examples_sklearn_examples.py`. The main advantage of using sklearn
-interface is that it works with most of the utilites provided by sklearn like
+interface is that it works with most of the utilities provided by sklearn like
 :py:func:`sklearn.model_selection.cross_validate`. Also, many other libraries recognize
 the sklearn estimator interface thanks to its popularity.

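For illustration, a minimal example of the ``cross_validate`` integration the fixed sentence refers to, on synthetic data:

.. code-block:: python

    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_validate
    import xgboost as xgb

    X, y = make_classification(n_samples=200, random_state=0)
    clf = xgb.XGBClassifier(n_estimators=10)
    results = cross_validate(clf, X, y, cv=3)  # 3-fold cross validation
    print(results["test_score"])
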
@@ -68,7 +68,7 @@ Other Updaters
 1. ``Prune``: It prunes the existing trees. ``prune`` is usually used as part of other
    tree methods. To use pruner independently, one needs to set the process type to update
    by: ``{"process_type": "update", "updater": "prune"}``. With this set of parameters,
-   during trianing, XGBOost will prune the existing trees according to 2 parameters
+   during training, XGBoost will prune the existing trees according to 2 parameters
    ``min_split_loss (gamma)`` and ``max_depth``.

 2. ``Refresh``: Refresh the statistic of built trees on a new training dataset. Like the

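A minimal sketch of running the pruner independently, as the corrected paragraph describes; the data is synthetic and the parameter values are only illustrative:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    X, y = np.random.rand(100, 10), np.random.rand(100)
    dtrain = xgb.DMatrix(X, label=y)
    booster = xgb.train({"max_depth": 6}, dtrain, num_boost_round=4)

    # Re-run over the same number of rounds, pruning the existing trees
    # instead of growing new ones.
    params = {"process_type": "update", "updater": "prune", "max_depth": 2}
    pruned = xgb.train(params, dtrain, num_boost_round=4, xgb_model=booster)
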
@@ -55,7 +55,7 @@ To ensure that CMake can locate the XGBoost library, supply ``-DCMAKE_PREFIX_PAT

 .. code-block:: bash

-  # Nagivate to the build directory for your application
+  # Navigate to the build directory for your application
   cd build
   # Activate the Conda environment where we previously installed XGBoost
   conda activate [env_name]

@@ -65,7 +65,7 @@ To ensure that CMake can locate the XGBoost library, supply ``-DCMAKE_PREFIX_PAT
   make

 ************************
-Usefull Tips To Remember
+Useful Tips To Remember
 ************************

 Below are some useful tips while using C API:

@@ -151,7 +151,7 @@ c. Assertion technique: It works both in C/ C++. If expression evaluates to 0 (f
   Example if we our training data is in ``dense matrix`` format then your prediction dataset should also be a ``dense matrix`` or if training in ``libsvm`` format then dataset for prediction should also be in ``libsvm`` format.

-4. Always use strings for setting values to the parameters in booster handle object. The paramter value can be of any data type (e.g. int, char, float, double, etc), but they should always be encoded as strings.
+4. Always use strings for setting values to the parameters in booster handle object. The parameter value can be of any data type (e.g. int, char, float, double, etc), but they should always be encoded as strings.

 .. code-block:: c

@@ -168,7 +168,7 @@ Sample examples along with Code snippet to use C API functions
 .. code-block:: c

   DMatrixHandle data; // handle to DMatrix
-  // Load the dat from file & store it in data variable of DMatrixHandle datatype
+  // Load the data from file & store it in data variable of DMatrixHandle datatype
   safe_xgboost(XGDMatrixCreateFromFile("/path/to/file/filename", silent, &data));

@@ -278,7 +278,7 @@ Sample examples along with Code snippet to use C API functions
   uint64_t const* out_shape;
   /* Dimension of output prediction */
   uint64_t out_dim;
-  /* Pointer to a thread local contigious array, assigned in prediction function. */
+  /* Pointer to a thread local contiguous array, assigned in prediction function. */
   float const* out_result = NULL;
   safe_xgboost(
       XGBoosterPredictFromDMatrix(booster, dmatrix, config, &out_shape, &out_dim, &out_result));

@@ -38,7 +38,7 @@ Although XGBoost has native support for said functions, using it for demonstrati
 provides us the opportunity of comparing the result from our own implementation and the
 one from XGBoost internal for learning purposes. After finishing this tutorial, we should
 be able to provide our own functions for rapid experiments. And at the end, we will
-provide some notes on non-identy link function along with examples of using custom metric
+provide some notes on non-identity link function along with examples of using custom metric
 and objective with the `scikit-learn` interface.

 If we compute the gradient of said objective function:

@@ -165,7 +165,7 @@ Reverse Link Function
 When using builtin objective, the raw prediction is transformed according to the objective
 function. When a custom objective is provided XGBoost doesn't know its link function so the
 user is responsible for making the transformation for both objective and custom evaluation
-metric. For objective with identiy link like ``squared error`` this is trivial, but for
+metric. For objective with identity link like ``squared error`` this is trivial, but for
 other link functions like log link or inverse link the difference is significant.

 For the Python package, the behaviour of prediction can be controlled by the

@@ -173,7 +173,7 @@ For the Python package, the behaviour of prediction can be controlled by the
 parameter without a custom objective, the metric function will receive transformed
 prediction since the objective is defined by XGBoost. However, when the custom objective is
 also provided along with that metric, then both the objective and custom metric will
-recieve raw prediction. The following example provides a comparison between two different
+receive raw prediction. The following example provides a comparison between two different
 behavior with a multi-class classification model. Firstly we define 2 different Python
 metric functions implementing the same underlying metric for comparison,
 `merror_with_transform` is used when custom objective is also used, otherwise the simpler

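The full example referenced by this page is not part of the patch, but a sketch of the idea reads roughly as follows; it assumes raw multi-class margins arrive when a custom objective is also supplied, so the metric applies softmax itself:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    def merror_with_transform(predt: np.ndarray, dtrain: xgb.DMatrix):
        """Multi-class error computed from raw (untransformed) margins."""
        y = dtrain.get_label()
        if predt.ndim == 1:  # margins may arrive flattened
            predt = predt.reshape(y.shape[0], -1)
        # Softmax, since with a custom objective the prediction is raw.
        e = np.exp(predt - predt.max(axis=1, keepdims=True))
        prob = e / e.sum(axis=1, keepdims=True)
        return "merror", float(np.mean(prob.argmax(axis=1) != y))
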
@@ -256,7 +256,7 @@ In the example below, a ``KubeCluster`` is used for `deploying Dask on Kubernete
   m = 1000
   n = 10
   kWorkers = 2  # assuming you have 2 GPU nodes on that cluster.
-  # You need to work out the worker-spec youself. See document in dask_kubernetes for
+  # You need to work out the worker-spec yourself. See document in dask_kubernetes for
   # its usage. Here we just want to show that XGBoost works on various clusters.
   cluster = KubeCluster.from_yaml('worker-spec.yaml', deploy_mode='remote')
   cluster.scale(kWorkers)  # scale to use all GPUs

@@ -648,7 +648,7 @@ environment than training the model using a single node due to aforementioned cr
 Memory Usage
 ************

-Here are some pratices on reducing memory usage with dask and xgboost.
+Here are some practices on reducing memory usage with dask and xgboost.

 - In a distributed work flow, data is best loaded by dask collections directly instead of
   loaded by client process. When loading with client process is unavoidable, use

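As a sketch of that first recommendation: let workers read their own partitions directly instead of routing the data through the client process. The scheduler address and file path below are placeholders:

.. code-block:: python

    import dask.dataframe as dd
    from dask.distributed import Client
    from xgboost import dask as dxgb

    client = Client("scheduler-address:8786")  # placeholder address

    # Workers load their own partitions; nothing funnels through the client.
    df = dd.read_parquet("s3://bucket/train/*.parquet")  # placeholder path
    dtrain = dxgb.DaskDMatrix(client, df.drop(columns=["label"]), df["label"])
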
@@ -7,7 +7,7 @@ dataset needs to be loaded into memory. This can be costly and sometimes
 infeasible. Staring from 1.5, users can define a custom iterator to load data in chunks
 for running XGBoost algorithms. External memory can be used for both training and
 prediction, but training is the primary use case and it will be our focus in this
-tutorial. For prediction and evaluation, users can iterate through the data themseleves
+tutorial. For prediction and evaluation, users can iterate through the data themselves
 while training requires the full dataset to be loaded into the memory.

 During training, there are two different modes for external memory support available in

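A rough sketch of the custom iterator mentioned here, based on the public ``xgboost.DataIter`` interface; ``load_chunk`` and the chunk file names are hypothetical:

.. code-block:: python

    import os
    import xgboost as xgb

    class ChunkIter(xgb.DataIter):
        """Yield pre-split chunk files to XGBoost one at a time."""

        def __init__(self, paths):
            self._paths = paths
            self._it = 0
            super().__init__(cache_prefix=os.path.join(".", "cache"))

        def next(self, input_data):
            if self._it == len(self._paths):
                return 0  # no more chunks
            X, y = load_chunk(self._paths[self._it])  # hypothetical loader
            input_data(data=X, label=y)
            self._it += 1
            return 1

        def reset(self):
            self._it = 0

    # DMatrix built from an iterator caches batches in external memory.
    dtrain = xgb.DMatrix(ChunkIter(["chunk-0.npz", "chunk-1.npz"]))
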
@@ -142,7 +142,7 @@ see `this paper <https://arxiv.org/abs/2005.09148>`_.
 .. warning::

   When GPU is running out of memory during iteration on external memory, user might
-  recieve a segfault instead of an OOM exception.
+  receive a segfault instead of an OOM exception.

 .. _ext_remarks:

@@ -150,7 +150,7 @@ see `this paper <https://arxiv.org/abs/2005.09148>`_.
 Remarks
 *******

-When using external memory with XBGoost, data is divided into smaller chunks so that only
+When using external memory with XGBoost, data is divided into smaller chunks so that only
 a fraction of it needs to be stored in memory at any given time. It's important to note
 that this method only applies to the predictor data (``X``), while other data, like labels
 and internal runtime structures are concatenated. This means that memory reduction is most

@@ -211,7 +211,7 @@ construction of `QuantileDmatrix` with data chunks. On the other hand, if it's p
 doesn't fetch data during training. On the other hand, the external memory `DMatrix`
 fetches data batches from external memory on-demand. Use the `QuantileDMatrix` (with
 iterator if necessary) when you can fit most of your data in memory. The training would be
-an order of magnitute faster than using external memory.
+an order of magnitude faster than using external memory.

 ****************
 Text File Inputs

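To illustrate the recommendation, a minimal in-memory ``QuantileDMatrix`` construction on synthetic data:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    X, y = np.random.rand(10_000, 20), np.random.rand(10_000)

    # Quantized representation for the hist tree method; preferred when
    # most of the data fits in memory.
    dtrain = xgb.QuantileDMatrix(X, label=y)
    booster = xgb.train({"tree_method": "hist"}, dtrain, num_boost_round=10)
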
@@ -233,7 +233,7 @@ This has lead to some interesting implications of feature interaction constraint
 ``[[0, 1], [0, 1, 2], [1, 2]]`` as another example. Assuming we have only 3 available
 features in our training datasets for presentation purpose, careful readers might have
 found out that the above constraint is the same as simply ``[[0, 1, 2]]``. Since no matter which
-feature is chosen for split in the root node, all its descendants are allowd to include every
+feature is chosen for split in the root node, all its descendants are allowed to include every
 feature as legitimate split candidates without violating interaction constraints.

 For one last example, we use ``[[0, 1], [1, 3, 4]]`` and choose feature ``0`` as split for

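For reference, a sketch of passing the constraint discussed above through the Python package; with only 3 features it behaves like ``[[0, 1, 2]]``, as the paragraph notes. Data is synthetic:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    X, y = np.random.rand(100, 3), np.random.rand(100)
    dtrain = xgb.DMatrix(X, label=y)

    # Constraints are passed as a string of nested lists of feature indices.
    params = {
        "tree_method": "hist",
        "interaction_constraints": "[[0, 1], [0, 1, 2], [1, 2]]",
    }
    booster = xgb.train(params, dtrain, num_boost_round=10)
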
@@ -11,12 +11,12 @@ Learning to Rank
 ********
 Overview
 ********
-Often in the context of information retrieval, learning-to-rank aims to train a model that arranges a set of query results into an ordered list `[1] <#references>`__. For surprivised learning-to-rank, the predictors are sample documents encoded as feature matrix, and the labels are relevance degree for each sample. Relevance degree can be multi-level (graded) or binary (relevant or not). The training samples are often grouped by their query index with each query group containing multiple query results.
+Often in the context of information retrieval, learning-to-rank aims to train a model that arranges a set of query results into an ordered list `[1] <#references>`__. For supervised learning-to-rank, the predictors are sample documents encoded as feature matrix, and the labels are relevance degree for each sample. Relevance degree can be multi-level (graded) or binary (relevant or not). The training samples are often grouped by their query index with each query group containing multiple query results.

 XGBoost implements learning to rank through a set of objective functions and performance metrics. The default objective is ``rank:ndcg`` based on the ``LambdaMART`` `[2] <#references>`__ algorithm, which in turn is an adaptation of the ``LambdaRank`` `[3] <#references>`__ framework to gradient boosting trees. For a history and a summary of the algorithm, see `[5] <#references>`__. The implementation in XGBoost features deterministic GPU computation, distributed training, position debiasing and two different pair construction strategies.

 ************************************
-Training with the Pariwise Objective
+Training with the Pairwise Objective
 ************************************
 ``LambdaMART`` is a pairwise ranking model, meaning that it compares the relevance degree for every pair of samples in a query group and calculate a proxy gradient for each pair. The default objective ``rank:ndcg`` is using the surrogate gradient derived from the ``ndcg`` metric. To train a XGBoost model, we need an additional sorted array called ``qid`` for specifying the query group of input samples. An example input would look like this:

@@ -59,7 +59,7 @@ Notice that the samples are sorted based on their query index in a non-decreasin
   X = X[sorted_idx, :]
   y = y[sorted_idx]

-The simpliest way to train a ranking model is by using the scikit-learn estimator interface. Continuing the previous snippet, we can train a simple ranking model without tuning:
+The simplest way to train a ranking model is by using the scikit-learn estimator interface. Continuing the previous snippet, we can train a simple ranking model without tuning:

 .. code-block:: python

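A compact sketch of the scikit-learn ranking interface the fixed sentence points to; labels and query ids here are randomly generated for illustration:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    X = np.random.rand(12, 5)
    y = np.random.randint(0, 4, size=12)             # graded relevance labels
    qid = np.sort(np.random.randint(0, 3, size=12))  # non-decreasing query ids

    ranker = xgb.XGBRanker(objective="rank:ndcg", n_estimators=10)
    ranker.fit(X, y, qid=qid)
    scores = ranker.predict(X)
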
@@ -138,7 +138,7 @@ This will train on four GPUs in parallel.

 Note that it usually does not make sense to allocate more than one GPU per actor,
 as XGBoost relies on distributed libraries such as Dask or Ray to utilize multi
-GPU taining.
+GPU training.

 Setting the number of CPUs per actor
 ====================================