Update document for model dump. (#5818)
* Clarify the relationship between dump and save.
* Mention the schema.
parent 26143ad0b1
commit 8104f10328
@@ -112,7 +112,7 @@ configuration directly as a JSON string. In Python package:
 print(config)
 
 
-or
+or in R:
 
 .. code-block:: R
 
@@ -158,22 +158,9 @@ Will print out something similiar to (not actual output as it's too long for dem
 "colsample_bynode": "1",
 "colsample_bytree": "1",
 "default_direction": "learn",
-"enable_feature_grouping": "0",
-"eta": "0.300000012",
-"gamma": "0",
-"grow_policy": "depthwise",
-"interaction_constraints": "",
-"lambda": "1",
-"learning_rate": "0.300000012",
-"max_bin": "256",
-"max_conflict_rate": "0",
-"max_delta_step": "0",
-"max_depth": "6",
-"max_leaves": "0",
-"max_search_group": "100",
-"refresh_leaf": "1",
-"sketch_eps": "0.0299999993",
-"sketch_ratio": "2",
+
+...
+
 "subsample": "1"
 }
 }
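The sample configuration in the hunk above stores every parameter value as a JSON string (``"6"`` rather than ``6``). The following is a minimal sketch of reading such output with the standard library only. The fragment is hand-written from keys that appear in the diff, the flat layout is an assumption (the real configuration is nested and much longer), and ``to_number`` is a hypothetical helper, not part of XGBoost.

```python
import json

# Hand-written fragment that reuses keys and values shown in the hunk above.
# Assumption: a flat object; the real configuration is nested and longer.
config_text = """
{
  "colsample_bynode": "1",
  "eta": "0.300000012",
  "grow_policy": "depthwise",
  "max_depth": "6",
  "subsample": "1"
}
"""


def to_number(value):
    """Hypothetical helper: convert a string to int or float when possible."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value  # non-numeric strings such as "depthwise" stay unchanged


raw = json.loads(config_text)
typed = {key: to_number(value) for key, value in raw.items()}

# Numeric comparisons and arithmetic only make sense after the conversion.
print(typed["max_depth"] + 1)
print(typed["grow_policy"])
```

Trying ``int`` before ``float`` keeps whole-number parameters such as ``max_depth`` as integers, while values like ``eta`` fall through to ``float``.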
@@ -207,13 +194,16 @@ This way users can study the internal representation more closely. Please note
 JSON generators make use of locale dependent floating point serialization methods, which
 is not supported by XGBoost.
 
-************
-Future Plans
-************
+*************************************************
+Difference between saving model and dumping model
+*************************************************
 
-Right now using the JSON format incurs longer serialisation time, we have been working on
-optimizing the JSON implementation to close the gap between binary format and JSON format.
-You can track the progress in `#5046 <https://github.com/dmlc/xgboost/pull/5046>`_.
+XGBoost has a function called ``dump_model`` in Booster object, which lets you to export
+the model in a readable format like ``text``, ``json`` or ``dot`` (graphviz). The primary
+use case for it is for model interpretation or visualization, and is not supposed to be
+loaded back to XGBoost. The JSON version has a `schema
+<https://github.com/dmlc/xgboost/blob/master/doc/dump.schema>`_. See next section for
+more info.
 
 ***********
 JSON Schema
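The paragraph added in the hunk above describes the dump as an output for interpretation, not something to load back. The following is a minimal sketch of that use with the standard library only. The tree is a hand-written stand-in shaped like one entry of a JSON dump (fields such as ``nodeid``, ``split``, ``children`` and ``leaf``); the numbers are invented, both helper functions are hypothetical, and the schema linked in the diff remains the authoritative description of the format.

```python
import json

# Hand-written stand-in for one tree of a JSON model dump.
# Field names follow the usual dump layout; all values are invented.
tree_text = """
{
  "nodeid": 0, "depth": 0, "split": "f0", "split_condition": 0.5,
  "yes": 1, "no": 2, "missing": 1,
  "children": [
    {"nodeid": 1, "leaf": -0.2},
    {"nodeid": 2, "depth": 1, "split": "f1", "split_condition": 1.5,
     "yes": 3, "no": 4, "missing": 3,
     "children": [
       {"nodeid": 3, "leaf": 0.1},
       {"nodeid": 4, "leaf": 0.4}
     ]}
  ]
}
"""


def collect_leaves(node):
    """Hypothetical helper: return (nodeid, leaf value) pairs, depth first."""
    if "leaf" in node:
        return [(node["nodeid"], node["leaf"])]
    leaves = []
    for child in node["children"]:
        leaves.extend(collect_leaves(child))
    return leaves


def split_features(node):
    """Hypothetical helper: return the set of feature names used in splits."""
    if "leaf" in node:
        return set()
    features = {node["split"]}
    for child in node["children"]:
        features |= split_features(child)
    return features


tree = json.loads(tree_text)
leaves = collect_leaves(tree)
features = split_features(tree)

print(leaves)
print(sorted(features))
```

With the Python package, strings like ``tree_text`` would come from ``get_dump(dump_format="json")``, which returns one string per tree. A model that has to be loaded again should go through ``save_model`` instead.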
@@ -229,3 +219,10 @@ leaf directly, instead it saves the weights as a separated array.
 
 .. include:: ../model.schema
 :code: json
+
+************
+Future Plans
+************
+
+Right now using the JSON format incurs longer serialisation time, we have been working on
+optimizing the JSON implementation to close the gap between binary format and JSON format.
@@ -1444,8 +1444,11 @@ class Booster(object):
 
 The model is saved in an XGBoost internal format which is universal
 among the various XGBoost interfaces. Auxiliary attributes of the
-Python Booster object (such as feature_names) will not be saved. To
-preserve all attributes, pickle the Booster object.
+Python Booster object (such as feature_names) will not be saved. See:
+
+https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html
+
+for more info.
 
 Parameters
 ----------
@@ -1460,7 +1463,7 @@ class Booster(object):
 raise TypeError("fname must be a string or os_PathLike")
 
 def save_raw(self):
-"""Save the model to a in memory buffer representation
+"""Save the model to a in memory buffer representation instead of file.
 
 Returns
 -------
@@ -1479,8 +1482,11 @@ class Booster(object):
 
 The model is loaded from an XGBoost format which is universal among the
 various XGBoost interfaces. Auxiliary attributes of the Python Booster
-object (such as feature_names) will not be loaded. To preserve all
-attributes, pickle the Booster object.
+object (such as feature_names) will not be loaded. See:
+
+https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html
+
+for more info.
 
 Parameters
 ----------
@@ -1503,7 +1509,9 @@ class Booster(object):
 raise TypeError('Unknown file type: ', fname)
 
 def dump_model(self, fout, fmap='', with_stats=False, dump_format="text"):
-"""Dump model into a text or JSON file.
+"""Dump model into a text or JSON file. Unlike `save_model`, the
+output format is primarily used for visualization or interpretation,
+hence it's more human readable but cannot be loaded back to XGBoost.
 
 Parameters
 ----------
@@ -1537,7 +1545,9 @@ class Booster(object):
 fout.close()
 
 def get_dump(self, fmap='', with_stats=False, dump_format="text"):
-"""Returns the model dump as a list of strings.
+"""Returns the model dump as a list of strings. Unlike `save_model`, the
+output format is primarily used for visualization or interpretation,
+hence it's more human readable but cannot be loaded back to XGBoost.
 
 Parameters
 ----------
@@ -1547,6 +1557,7 @@ class Booster(object):
 Controls whether the split statistics are output.
 dump_format : string, optional
 Format of model dump. Can be 'text', 'json' or 'dot'.
+
 """
 fmap = os_fspath(fmap)
 length = c_bst_ulong()