Fix spelling in documents (#6948)
* Update roxygen2 doc. Co-authored-by: fis <jm.yuan@outlook.com>
@@ -158,7 +158,7 @@ The parameter ``aft_loss_distribution`` corresponds to the distribution of the :
 Currently, you can choose from three probability distributions for ``aft_loss_distribution``:
 
 ========================= ===========================================
-``aft_loss_distribution`` Probabilty Density Function (PDF)
+``aft_loss_distribution`` Probability Density Function (PDF)
 ========================= ===========================================
 ``normal``                :math:`\dfrac{\exp{(-z^2/2)}}{\sqrt{2\pi}}`
 ``logistic``              :math:`\dfrac{e^z}{(1+e^z)^2}`
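The two densities in the table above can be checked numerically. A minimal sketch (standard library only; not part of the original commit), evaluating each PDF as written:

```python
import math

def normal_pdf(z: float) -> float:
    # PDF of the standard normal: exp(-z^2/2) / sqrt(2*pi)
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def logistic_pdf(z: float) -> float:
    # PDF of the standard logistic: e^z / (1 + e^z)^2
    ez = math.exp(z)
    return ez / (1 + ez) ** 2

print(round(normal_pdf(0.0), 4))    # 0.3989
print(round(logistic_pdf(0.0), 4))  # 0.25
```

Both densities are symmetric about zero, which is easy to confirm with these helpers.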
@@ -2,7 +2,7 @@
 C API Tutorial
 ##############################
 
-In this tutorial, we are going to install XGBoost library & configure the CMakeLists.txt file of our C/C++ application to link XGBoost library with our application. Later on, we will see some usefull tips for using C API and code snippets as examples to use various functions available in C API to perform basic task like loading, training model & predicting on test dataset.
+In this tutorial, we are going to install XGBoost library & configure the CMakeLists.txt file of our C/C++ application to link XGBoost library with our application. Later on, we will see some useful tips for using C API and code snippets as examples to use various functions available in C API to perform basic task like loading, training model & predicting on test dataset.
 
 .. contents::
   :backlinks: none
@@ -68,11 +68,11 @@ To ensure that CMake can locate the XGBoost library, supply ``-DCMAKE_PREFIX_PAT
 Usefull Tips To Remember
 ************************
 
-Below are some usefull tips while using C API:
+Below are some useful tips while using C API:
 
 1. Error handling: Always check the return value of the C API functions.
 
-  a. In a C application: Use the following macro to guard all calls to XGBoost's C API functions. The macro prints all the error/ exception occured:
+  a. In a C application: Use the following macro to guard all calls to XGBoost's C API functions. The macro prints all the error/ exception occurred:
 
 .. highlight:: c
   :linenothreshold: 5
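The macro referenced in this hunk wraps every C API call and fails fast on a nonzero return code. As a language-neutral illustration of that same guard pattern (a hypothetical Python analogue, not the macro from the tutorial itself):

```python
def check_call(ret: int) -> None:
    # Guard pattern: XGBoost C API functions return 0 on success and a
    # nonzero code on failure; raise instead of silently ignoring it.
    if ret != 0:
        raise RuntimeError(f"XGBoost C API call failed with return code {ret}")

check_call(0)  # success: no exception raised
```

The point in either language is the same: never discard the return value of a C API call.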
@@ -143,6 +143,6 @@ For fully reproducible source code and comparison plots, see `custom_rmsle.py <h
 Multi-class objective function
 ******************************
 
-A similiar demo for multi-class objective funtion is also available, see
+A similar demo for multi-class objective function is also available, see
 `demo/guide-python/custom_softmax.py <https://github.com/dmlc/xgboost/tree/master/demo/guide-python/custom_softmax.py>`_
 for details.
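A custom multi-class objective follows the same gradient/Hessian contract as the RMSLE example mentioned in the hunk header. A minimal sketch of the softmax pieces (standard library only; the exact demo code lives in `custom_softmax.py`, which may additionally scale or clip the Hessian):

```python
import math

def softmax(scores):
    # Numerically stable softmax over the raw scores of one instance.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def softprob_grad_hess(scores, label):
    # Gradient and Hessian of multi-class cross-entropy w.r.t. raw scores:
    #   grad_k = p_k - 1{k == label},   hess_k = p_k * (1 - p_k)
    p = softmax(scores)
    grad = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
    hess = [pk * (1.0 - pk) for pk in p]
    return grad, hess
```

Note that the per-instance gradient sums to zero, since the probabilities sum to one.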
@@ -127,7 +127,7 @@ In previous example we used ``DaskDMatrix`` as input to ``predict`` function. I
 practice, it's also possible to call ``predict`` function directly on dask collections
 like ``Array`` and ``DataFrame`` and might have better prediction performance. When
 ``DataFrame`` is used as prediction input, the result is a dask ``Series`` instead of
-array. Also, there's inplace predict support on dask interface, which can help reducing
+array. Also, there's in-place predict support on dask interface, which can help reducing
 both memory usage and prediction time.
 
 .. code-block:: python
@@ -479,7 +479,7 @@ Here are some pratices on reducing memory usage with dask and xgboost.
 ``xgboost.dask.DaskDeviceQuantileDMatrix`` as a drop in replacement for ``DaskDMatrix``
 to reduce overall memory usage. See ``demo/dask/gpu_training.py`` for an example.
 
-- Use inplace prediction when possible.
+- Use in-place prediction when possible.
 
 References:
@@ -10,7 +10,7 @@ The external memory version takes in the following `URI <https://en.wikipedia.or
 
   filename#cacheprefix
 
-The ``filename`` is the normal path to libsvm format file you want to load in, and
+The ``filename`` is the normal path to LIBSVM format file you want to load in, and
 ``cacheprefix`` is a path to a cache file that XGBoost will use for caching preprocessed
 data in binary form.
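The ``#`` separator in the URI above is plain string syntax; the path and the cache prefix are just the two halves of the string. A trivial illustrative sketch:

```python
# Split an external-memory URI of the form filename#cacheprefix.
uri = "agaricus.txt.train#dtrain.cache"
filename, cacheprefix = uri.split("#", 1)
print(filename)     # agaricus.txt.train
print(cacheprefix)  # dtrain.cache
```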
@@ -24,7 +24,7 @@ where ``label_column`` should point to the csv column acting as the label.
 
 To provide a simple example for illustration, extracting the code from
 `demo/guide-python/external_memory.py <https://github.com/dmlc/xgboost/blob/master/demo/guide-python/external_memory.py>`_. If
-you have a dataset stored in a file similar to ``agaricus.txt.train`` with libSVM format, the external memory support can be enabled by:
+you have a dataset stored in a file similar to ``agaricus.txt.train`` with LIBSVM format, the external memory support can be enabled by:
 
 .. code-block:: python
@@ -5,7 +5,7 @@ Text Input Format of DMatrix
 ******************
 Basic Input Format
 ******************
-XGBoost currently supports two text formats for ingesting data: LibSVM and CSV. The rest of this document will describe the LibSVM format. (See `this Wikipedia article <https://en.wikipedia.org/wiki/Comma-separated_values>`_ for a description of the CSV format.). Please be careful that, XGBoost does **not** understand file extensions, nor try to guess the file format, as there is no universal agreement upon file extension of LibSVM or CSV. Instead it employs `URI <https://en.wikipedia.org/wiki/Uniform_Resource_Identifier>`_ format for specifying the precise input file type. For example if you provide a `csv` file ``./data.train.csv`` as input, XGBoost will blindly use the default libsvm parser to digest it and generate a parser error. Instead, users need to provide an uri in the form of ``train.csv?format=csv``. For external memory input, the uri should of a form similar to ``train.csv?format=csv#dtrain.cache``. See :ref:`python_data_interface` and :doc:`/tutorials/external_memory` also.
+XGBoost currently supports two text formats for ingesting data: LIBSVM and CSV. The rest of this document will describe the LIBSVM format. (See `this Wikipedia article <https://en.wikipedia.org/wiki/Comma-separated_values>`_ for a description of the CSV format.). Please be careful that, XGBoost does **not** understand file extensions, nor try to guess the file format, as there is no universal agreement upon file extension of LIBSVM or CSV. Instead it employs `URI <https://en.wikipedia.org/wiki/Uniform_Resource_Identifier>`_ format for specifying the precise input file type. For example if you provide a `csv` file ``./data.train.csv`` as input, XGBoost will blindly use the default LIBSVM parser to digest it and generate a parser error. Instead, users need to provide an URI in the form of ``train.csv?format=csv``. For external memory input, the URI should of a form similar to ``train.csv?format=csv#dtrain.cache``. See :ref:`python_data_interface` and :doc:`/tutorials/external_memory` also.
 
 For training or predicting, XGBoost takes an instance file with the format as below:
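The URI convention described in this hunk (``train.csv?format=csv#dtrain.cache``) decomposes with the standard library's URL tools. A sketch for illustration only, not XGBoost's actual parser:

```python
from urllib.parse import urlsplit, parse_qs

# path = file to read, query = format hint, fragment = external-memory cache prefix
uri = "train.csv?format=csv#dtrain.cache"
parts = urlsplit(uri)
print(parts.path)                       # train.csv
print(parse_qs(parts.query)["format"])  # ['csv']
print(parts.fragment)                   # dtrain.cache
```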
@@ -23,7 +23,7 @@ Each line represent a single instance, and in the first line '1' is the instance
 ******************************************
 Auxiliary Files for Additional Information
 ******************************************
-**Note: all information below is applicable only to single-node version of the package.** If you'd like to perform distributed training with multiple nodes, skip to the section `Embedding additional information inside LibSVM file`_.
+**Note: all information below is applicable only to single-node version of the package.** If you'd like to perform distributed training with multiple nodes, skip to the section `Embedding additional information inside LIBSVM file`_.
 
 Group Input Format
 ==================
@@ -72,13 +72,13 @@ XGBoost supports providing each instance an initial margin prediction. For examp
 XGBoost will take these values as initial margin prediction and boost from that. An important note about base_margin is that it should be margin prediction before transformation, so if you are doing logistic loss, you will need to put in value before logistic transformation. If you are using XGBoost predictor, use ``pred_margin=1`` to output margin values.
 
 ***************************************************
-Embedding additional information inside LibSVM file
+Embedding additional information inside LIBSVM file
 ***************************************************
 **This section is applicable to both single- and multiple-node settings.**
 
 Query ID Columns
 ================
-This is most useful for `ranking task <https://github.com/dmlc/xgboost/tree/master/demo/rank>`_, where the instances are grouped into query groups. You may embed query group ID for each instance in the LibSVM file by adding a token of form ``qid:xx`` in each row:
+This is most useful for `ranking task <https://github.com/dmlc/xgboost/tree/master/demo/rank>`_, where the instances are grouped into query groups. You may embed query group ID for each instance in the LIBSVM file by adding a token of form ``qid:xx`` in each row:
 
 .. code-block:: none
   :caption: ``train.txt``
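A row carrying a ``qid:xx`` token splits with ordinary string handling. A hypothetical parser sketch (for illustration; XGBoost's own LIBSVM parser is written in C++):

```python
def parse_libsvm_row(row: str):
    # e.g. "3 qid:1 1:0.5 3:0.2" -> (label, qid, {feature_index: value})
    tokens = row.split()
    label = float(tokens[0])
    qid = None
    features = {}
    for tok in tokens[1:]:
        key, value = tok.split(":", 1)
        if key == "qid":
            qid = int(value)       # query-group ID token
        else:
            features[int(key)] = float(value)  # sparse feature entry
    return label, qid, features

print(parse_libsvm_row("3 qid:1 1:0.5 3:0.2"))
# (3.0, 1, {1: 0.5, 3: 0.2})
```

Rows sharing the same ``qid`` form one query group for the ranking objective.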
@@ -98,7 +98,7 @@ Keep in mind the following restrictions:
 
 Instance weights
 ================
-You may specify instance weights in the LibSVM file by appending each instance label with the corresponding weight in the form of ``[label]:[weight]``, as shown by the following example:
+You may specify instance weights in the LIBSVM file by appending each instance label with the corresponding weight in the form of ``[label]:[weight]``, as shown by the following example:
 
 .. code-block:: none
   :caption: ``train.txt``
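The ``[label]:[weight]`` convention splits the same way, under the same hypothetical-parser caveat as above:

```python
def parse_weighted_label(token: str):
    # "1:0.5" -> label 1.0 with weight 0.5; a bare "1" -> default weight 1.0
    if ":" in token:
        label, weight = token.split(":", 1)
        return float(label), float(weight)
    return float(token), 1.0

print(parse_weighted_label("1:0.5"))  # (1.0, 0.5)
print(parse_weighted_label("0"))      # (0.0, 1.0)
```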
@@ -1,9 +1,9 @@
 #########################
-Random Forests in XGBoost
+Random Forests(TM) in XGBoost
 #########################
 
 XGBoost is normally used to train gradient-boosted decision trees and other gradient
-boosted models. Random forests use the same model representation and inference, as
+boosted models. Random Forests use the same model representation and inference, as
 gradient-boosted decision trees, but a different training algorithm. One can use XGBoost
 to train a standalone random forest or use random forest as a base model for gradient
 boosting. Here we focus on training standalone random forest.
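Training a standalone random forest with XGBoost, as described in the hunk above, comes down to a handful of parameters: grow many trees per boosting round, subsample rows and columns, and keep shrinkage at 1. A sketch of such a parameter set, with illustrative values rather than the tutorial's exact example:

```python
# Typical parameters for a standalone random forest with XGBoost
# (illustrative values; tune the subsampling rates for your data).
params = {
    "num_parallel_tree": 100,  # trees grown per boosting round
    "subsample": 0.8,          # row subsampling per tree
    "colsample_bynode": 0.8,   # column subsampling per split
    "learning_rate": 1.0,      # no shrinkage between rounds
}
# With the native API one would pass num_boost_round=1 to xgb.train(...)
# so that a single round grows the whole forest.
```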
@@ -31,10 +31,10 @@ part of model, that's because objective controls transformation of global bias (
 evaluation or continue the training with a different set of hyper-parameters etc.
 
 However, this is not the end of story. There are cases where we need to save something
-more than just the model itself. For example, in distrbuted training, XGBoost performs
+more than just the model itself. For example, in distributed training, XGBoost performs
 checkpointing operation. Or for some reasons, your favorite distributed computing
 framework decide to copy the model from one worker to another and continue the training in
-there. In such cases, the serialisation output is required to contain enougth information
+there. In such cases, the serialisation output is required to contain enough information
 to continue previous training without user providing any parameters again. We consider
 such scenario as **memory snapshot** (or memory based serialisation method) and distinguish it
 with normal model IO operation. Currently, memory snapshot is used in the following places:
@@ -145,7 +145,7 @@ or in R:
   config <- xgb.config(bst)
   print(config)
 
-Will print out something similiar to (not actual output as it's too long for demonstration):
+Will print out something similar to (not actual output as it's too long for demonstration):
 
 .. code-block:: javascript