Fix spelling in documents (#6948)

* Update roxygen2 doc.

Co-authored-by: fis <jm.yuan@outlook.com>
This commit is contained in:
Andrew Ziem
2021-05-11 06:44:36 -06:00
committed by GitHub
parent 2a9979e256
commit 3e7e426b36
100 changed files with 284 additions and 284 deletions

View File

@@ -158,7 +158,7 @@ The parameter ``aft_loss_distribution`` corresponds to the distribution of the :
Currently, you can choose from three probability distributions for ``aft_loss_distribution``:
========================= ===========================================
-``aft_loss_distribution`` Probabilty Density Function (PDF)
+``aft_loss_distribution`` Probability Density Function (PDF)
========================= ===========================================
``normal`` :math:`\dfrac{\exp{(-z^2/2)}}{\sqrt{2\pi}}`
``logistic`` :math:`\dfrac{e^z}{(1+e^z)^2}`
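To make the table concrete, here is a small Python sketch (not part of the commit) that evaluates the two densities above and shows a typical parameter setup for AFT training; the parameter values are illustrative:

```python
import math

# Probability density functions from the table above,
# where z is the standardized residual.
def normal_pdf(z):
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def logistic_pdf(z):
    return math.exp(z) / (1 + math.exp(z)) ** 2

# A hypothetical parameter setup selecting one of the distributions:
params = {
    "objective": "survival:aft",
    "aft_loss_distribution": "normal",   # or "logistic"
    "aft_loss_distribution_scale": 1.0,
}
```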

View File

@@ -2,7 +2,7 @@
C API Tutorial
##############################
-In this tutorial, we are going to install XGBoost library & configure the CMakeLists.txt file of our C/C++ application to link XGBoost library with our application. Later on, we will see some usefull tips for using C API and code snippets as examples to use various functions available in C API to perform basic task like loading, training model & predicting on test dataset.
+In this tutorial, we are going to install the XGBoost library & configure the CMakeLists.txt file of our C/C++ application to link XGBoost with our application. Later on, we will see some useful tips for using the C API, along with code snippets showing how to use its functions for basic tasks like loading data, training a model & predicting on a test dataset.
.. contents::
:backlinks: none
@@ -68,11 +68,11 @@ To ensure that CMake can locate the XGBoost library, supply ``-DCMAKE_PREFIX_PAT
Useful Tips To Remember
************************
-Below are some usefull tips while using C API:
+Below are some useful tips while using C API:
1. Error handling: Always check the return value of the C API functions.
-a. In a C application: Use the following macro to guard all calls to XGBoost's C API functions. The macro prints all the error/ exception occured:
+a. In a C application: Use the following macro to guard all calls to XGBoost's C API functions. The macro prints any error/exception that occurred:
.. highlight:: c
:linenothreshold: 5

View File

@@ -143,6 +143,6 @@ For fully reproducible source code and comparison plots, see `custom_rmsle.py <h
Multi-class objective function
******************************
-A similiar demo for multi-class objective funtion is also available, see
+A similar demo for multi-class objective function is also available, see
`demo/guide-python/custom_softmax.py <https://github.com/dmlc/xgboost/tree/master/demo/guide-python/custom_softmax.py>`_
for details.
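As a rough sketch of what such a custom multi-class objective computes (assuming the softprob formulation used in the linked demo; function names here are illustrative, not XGBoost's API), the gradient is the predicted probability minus the one-hot label, and the Hessian uses the usual diagonal approximation:

```python
import numpy as np

def softmax(margins):
    # Row-wise softmax with max-subtraction for numerical stability.
    e = np.exp(margins - margins.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def softprob_obj(labels, margins):
    """Gradient and Hessian of softmax cross-entropy.

    labels: (n,) integer class ids; margins: (n, n_classes) raw scores.
    """
    p = softmax(margins)
    grad = p.copy()
    grad[np.arange(labels.size), labels] -= 1.0   # p - y_onehot
    hess = np.maximum(2.0 * p * (1.0 - p), 1e-6)  # diagonal approximation
    return grad, hess
```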

View File

@@ -127,7 +127,7 @@ In previous example we used ``DaskDMatrix`` as input to ``predict`` function. I
practice, it's also possible to call the ``predict`` function directly on dask collections
like ``Array`` and ``DataFrame``, which might have better prediction performance. When a
``DataFrame`` is used as the prediction input, the result is a dask ``Series`` instead of an
-array. Also, there's inplace predict support on dask interface, which can help reducing
+array. Also, there's in-place predict support on the dask interface, which can help reduce
both memory usage and prediction time.
.. code-block:: python
@@ -479,7 +479,7 @@ Here are some pratices on reducing memory usage with dask and xgboost.
``xgboost.dask.DaskDeviceQuantileDMatrix`` as a drop-in replacement for ``DaskDMatrix``
to reduce overall memory usage. See ``demo/dask/gpu_training.py`` for an example.
-- Use inplace prediction when possible.
+- Use in-place prediction when possible.
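A minimal sketch of what in-place prediction on the dask interface looks like (this assumes ``xgboost.dask.inplace_predict``, a running dask client, and the output of ``xgboost.dask.train``; nothing here comes from the commit itself):

```python
def predict_in_place(client, trained, X):
    # Deferred import so the sketch stays self-contained; assumes the
    # dask interface described above is available.
    import xgboost as xgb
    # Predicts directly on dask collections without building a
    # DaskDMatrix, reducing memory usage and prediction time; a dask
    # DataFrame input yields a dask Series, an array input an array.
    return xgb.dask.inplace_predict(client, trained, X)
```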
References:

View File

@@ -10,7 +10,7 @@ The external memory version takes in the following `URI <https://en.wikipedia.or
filename#cacheprefix
-The ``filename`` is the normal path to libsvm format file you want to load in, and
+The ``filename`` is the normal path to the LIBSVM format file you want to load in, and
``cacheprefix`` is a path to a cache file that XGBoost will use for caching preprocessed
data in binary form.
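For example, such a URI could be composed as below (pure string handling; the file names are hypothetical, and the commented ``DMatrix`` call assumes an xgboost installation):

```python
# Hypothetical file names; '#' separates the data path from the cache prefix.
data_path = "agaricus.txt.train"
cache_prefix = "dtrain.cache"
uri = f"{data_path}#{cache_prefix}"

# With xgboost installed, the external-memory DMatrix would be created as:
# import xgboost as xgb
# dtrain = xgb.DMatrix(uri)
```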
@@ -24,7 +24,7 @@ where ``label_column`` should point to the csv column acting as the label.
To provide a simple example for illustration, we extract the code from
`demo/guide-python/external_memory.py <https://github.com/dmlc/xgboost/blob/master/demo/guide-python/external_memory.py>`_. If
-you have a dataset stored in a file similar to ``agaricus.txt.train`` with libSVM format, the external memory support can be enabled by:
+you have a dataset stored in a file similar to ``agaricus.txt.train`` in LIBSVM format, external memory support can be enabled by:
.. code-block:: python

View File

@@ -5,7 +5,7 @@ Text Input Format of DMatrix
******************
Basic Input Format
******************
-XGBoost currently supports two text formats for ingesting data: LibSVM and CSV. The rest of this document will describe the LibSVM format. (See `this Wikipedia article <https://en.wikipedia.org/wiki/Comma-separated_values>`_ for a description of the CSV format.). Please be careful that, XGBoost does **not** understand file extensions, nor try to guess the file format, as there is no universal agreement upon file extension of LibSVM or CSV. Instead it employs `URI <https://en.wikipedia.org/wiki/Uniform_Resource_Identifier>`_ format for specifying the precise input file type. For example if you provide a `csv` file ``./data.train.csv`` as input, XGBoost will blindly use the default libsvm parser to digest it and generate a parser error. Instead, users need to provide an uri in the form of ``train.csv?format=csv``. For external memory input, the uri should of a form similar to ``train.csv?format=csv#dtrain.cache``. See :ref:`python_data_interface` and :doc:`/tutorials/external_memory` also.
+XGBoost currently supports two text formats for ingesting data: LIBSVM and CSV. The rest of this document will describe the LIBSVM format. (See `this Wikipedia article <https://en.wikipedia.org/wiki/Comma-separated_values>`_ for a description of the CSV format.) Be careful: XGBoost does **not** understand file extensions, nor does it try to guess the file format, as there is no universal agreement upon the file extension of LIBSVM or CSV. Instead it employs the `URI <https://en.wikipedia.org/wiki/Uniform_Resource_Identifier>`_ format for specifying the precise input file type. For example, if you provide a `csv` file ``./data.train.csv`` as input, XGBoost will blindly use the default LIBSVM parser to digest it and generate a parser error. Instead, users need to provide a URI in the form of ``train.csv?format=csv``. For external memory input, the URI should be of a form similar to ``train.csv?format=csv#dtrain.cache``. See :ref:`python_data_interface` and :doc:`/tutorials/external_memory` also.
For training or predicting, XGBoost takes an instance file with the format as below:
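A small illustrative parser (not part of XGBoost; written here only to show how such a URI decomposes) splits it into its path, format, and cache-prefix parts:

```python
def parse_data_uri(uri):
    """Split a data URI like 'train.csv?format=csv#dtrain.cache' into
    (path, format, cache_prefix); format and cache default to None."""
    path, _, cache = uri.partition("#")
    path, _, query = path.partition("?")
    fmt = query[len("format="):] if query.startswith("format=") else None
    return path, fmt, cache or None
```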
@@ -23,7 +23,7 @@ Each line represent a single instance, and in the first line '1' is the instance
******************************************
Auxiliary Files for Additional Information
******************************************
-**Note: all information below is applicable only to single-node version of the package.** If you'd like to perform distributed training with multiple nodes, skip to the section `Embedding additional information inside LibSVM file`_.
+**Note: all information below is applicable only to the single-node version of the package.** If you'd like to perform distributed training with multiple nodes, skip to the section `Embedding additional information inside LIBSVM file`_.
Group Input Format
==================
@@ -72,13 +72,13 @@ XGBoost supports providing each instance an initial margin prediction. For examp
XGBoost will take these values as initial margin prediction and boost from them. An important note about base_margin is that it should be the margin prediction before transformation, so if you are using logistic loss, you will need to put in the value before the logistic transformation. If you are using the XGBoost predictor, use ``pred_margin=1`` to output margin values.
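For logistic loss, "value before logistic transformation" means the raw log-odds. A small sketch (the helper name is ours, not XGBoost's) converting a prior probability into the margin expected by ``base_margin``:

```python
import math

def prob_to_margin(p):
    # Inverse of the logistic transformation: margin = log(p / (1 - p)).
    return math.log(p / (1.0 - p))

margin = prob_to_margin(0.9)  # a prior probability of 0.9 -> its log-odds

# With xgboost installed, the margins would be attached to the data (sketch):
# dtrain.set_base_margin([margin] * num_rows)
```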
***************************************************
-Embedding additional information inside LibSVM file
+Embedding additional information inside LIBSVM file
***************************************************
**This section is applicable to both single- and multiple-node settings.**
Query ID Columns
================
-This is most useful for `ranking task <https://github.com/dmlc/xgboost/tree/master/demo/rank>`_, where the instances are grouped into query groups. You may embed query group ID for each instance in the LibSVM file by adding a token of form ``qid:xx`` in each row:
+This is most useful for the `ranking task <https://github.com/dmlc/xgboost/tree/master/demo/rank>`_, where the instances are grouped into query groups. You may embed the query group ID for each instance in the LIBSVM file by adding a token of the form ``qid:xx`` in each row:
.. code-block:: none
:caption: ``train.txt``
@@ -98,7 +98,7 @@ Keep in mind the following restrictions:
Instance weights
================
-You may specify instance weights in the LibSVM file by appending each instance label with the corresponding weight in the form of ``[label]:[weight]``, as shown by the following example:
+You may specify instance weights in the LIBSVM file by appending each instance label with the corresponding weight in the form of ``[label]:[weight]``, as shown by the following example:
.. code-block:: none
:caption: ``train.txt``

View File

@@ -1,9 +1,9 @@
#########################
-Random Forests in XGBoost
+Random Forests(TM) in XGBoost
#########################
XGBoost is normally used to train gradient-boosted decision trees and other gradient
-boosted models. Random forests use the same model representation and inference, as
+boosted models. Random Forests use the same model representation and inference as
gradient-boosted decision trees, but a different training algorithm. One can use XGBoost
to train a standalone random forest, or use a random forest as a base model for gradient
boosting. Here we focus on training a standalone random forest.
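As a sketch of what training a standalone random forest with XGBoost amounts to (the hyper-parameter values here are illustrative, not prescribed by the commit): disable shrinkage, grow many parallel trees in a single boosting round, and subsample rows and columns:

```python
rf_params = {
    "objective": "binary:logistic",
    "learning_rate": 1.0,       # no shrinkage -- a forest, not boosting
    "num_parallel_tree": 100,   # number of trees grown per round
    "subsample": 0.8,           # row subsampling per tree
    "colsample_bynode": 0.8,    # column subsampling at each split
}

# With xgboost installed, a standalone forest is one boosting round (sketch):
# import xgboost as xgb
# bst = xgb.train(rf_params, dtrain, num_boost_round=1)
```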

View File

@@ -31,10 +31,10 @@ part of model, that's because objective controls transformation of global bias (
evaluation or continue the training with a different set of hyper-parameters etc.
However, this is not the end of the story. There are cases where we need to save something
-more than just the model itself. For example, in distrbuted training, XGBoost performs
+more than just the model itself. For example, in distributed training, XGBoost performs
checkpointing operations. Or, for some reason, your favorite distributed computing
framework decides to copy the model from one worker to another and continue the training
-there. In such cases, the serialisation output is required to contain enougth information
+there. In such cases, the serialisation output is required to contain enough information
to continue the previous training without the user providing any parameters again. We consider
such a scenario a **memory snapshot** (or memory-based serialisation method) and distinguish it
from the normal model IO operation. Currently, memory snapshot is used in the following places:
@@ -145,7 +145,7 @@ or in R:
config <- xgb.config(bst)
print(config)
-Will print out something similiar to (not actual output as it's too long for demonstration):
+This will print out something similar to (not the actual output, as it's too long for demonstration):
.. code-block:: javascript
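The output is plain JSON, so it can be inspected programmatically. The sketch below uses a heavily abridged, made-up stand-in for the real configuration text (the actual output contains far more fields):

```python
import json

# Hypothetical, abridged stand-in for the JSON text returned by
# Booster.save_config() in Python, or xgb.config(bst) in R:
config_text = json.dumps({
    "learner": {
        "objective": {"name": "reg:squarederror"},
        "gradient_booster": {"name": "gbtree"},
    }
})

config = json.loads(config_text)
booster_name = config["learner"]["gradient_booster"]["name"]
```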