[Breaking] Require format to be specified in input URI. (#9077)
Previously, we used `libsvm` as the default when the format was not specified. However, the dmlc data parser is not particularly robust against errors, and the most common type of error is an undefined format. In addition, we will recommend that users use other data loaders instead. We will continue to maintain the parsers, as they are currently used for many internal tests, including federated learning.
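Since the format must now be given in the URI, callers may want to check a URI before passing it on. The following is a minimal sketch of such a caller-side check, not code from this commit: the helpers `QueryOf`, `HasFormatParam`, and `WithFormat` are hypothetical names, and the sketch assumes the text-input URI layout `path?key=value&...#cache_prefix`, with `format` passed as a query parameter.

```cpp
#include <string>

// Hypothetical helper (not part of XGBoost): returns the query part of `uri`,
// i.e. the text between '?' and an optional '#' fragment.
std::string QueryOf(const std::string& uri) {
  const auto hash = uri.find('#');
  const std::string head = uri.substr(0, hash);  // npos keeps the whole string
  const auto q = head.find('?');
  return q == std::string::npos ? std::string{} : head.substr(q + 1);
}

// Hypothetical helper: true if the query contains a non-empty `format=` value.
bool HasFormatParam(const std::string& uri) {
  const std::string query = QueryOf(uri);
  const std::string key = "format=";
  std::string::size_type begin = 0;
  while (begin <= query.size()) {
    auto end = query.find('&', begin);
    if (end == std::string::npos) end = query.size();
    const std::string pair = query.substr(begin, end - begin);
    // The pair must start with "format=" and carry a value after it.
    if (pair.compare(0, key.size(), key) == 0 && pair.size() > key.size()) {
      return true;
    }
    begin = end + 1;
  }
  return false;
}

// Hypothetical helper: adds `format=<format>` to the query if it is missing,
// keeping any '#' fragment at the end of the URI.
std::string WithFormat(const std::string& uri, const std::string& format) {
  if (HasFormatParam(uri)) return uri;
  const auto hash = uri.find('#');
  const std::string head = uri.substr(0, hash);
  const std::string fragment =
      hash == std::string::npos ? std::string{} : uri.substr(hash);
  const char sep = head.find('?') == std::string::npos ? '?' : '&';
  return head + sep + "format=" + format + fragment;
}
```

For instance, `WithFormat("train.txt", "libsvm")` yields `train.txt?format=libsvm`, which states explicitly what used to be the implicit default.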
@@ -138,7 +138,11 @@ XGB_DLL int XGDMatrixCreateFromFile(const char *fname, int silent, DMatrixHandle
/*!
* \brief load a data matrix
* \param config JSON encoded parameters for DMatrix construction. Accepted fields are:
* - uri: The URI of the input file.

* - uri: The URI of the input file. The URI parameter `format` is required when loading text data.
* \verbatim embed:rst:leading-asterisk
* See :doc:`/tutorials/input_format` for more info.
* \endverbatim
* - silent (optional): Whether to print message during loading. Default to true.
* - data_split_mode (optional): Whether to split by row or column. In distributed mode, the
* file is split accordingly; otherwise this is only an indicator on how the file was split
@@ -566,21 +566,17 @@ class DMatrix {
return Info().num_nonzero_ == Info().num_row_ * Info().num_col_;
}

/*!
/**
* \brief Load DMatrix from URI.
*
* \param uri The URI of input.
* \param silent Whether print information during loading.
* \param data_split_mode In distributed mode, split the input according this mode; otherwise,
* it's just an indicator on how the input was split beforehand.
* \param file_format The format type of the file, used for dmlc::Parser::Create.
* By default "auto" will be able to load in both local binary file.
* \param page_size Page size for external memory.
* \return The created DMatrix.
*/
static DMatrix* Load(const std::string& uri,
bool silent = true,
DataSplitMode data_split_mode = DataSplitMode::kRow,
const std::string& file_format = "auto");
static DMatrix* Load(const std::string& uri, bool silent = true,
DataSplitMode data_split_mode = DataSplitMode::kRow);

/**
* \brief Creates a new DMatrix from an external data adapter.