Add qid like ranklib format (#2749)

* add qid for https://github.com/dmlc/xgboost/issues/2748

* change names

* change spaces

* change qid to bst_uint type

* change qid type to size_t

* change qid first to SIZE_MAX

* change qid type from size_t to uint64_t

* update dmlc-core

* fix qids name error

* fix group_ptr_ error

* Style fix

* Add qid handling logic to SparsePage

* New MetaInfo format + backward compatibility fix

Old MetaInfo format (1.0) doesn't contain qid field. We still want to be able
to read from MetaInfo files saved in old format. Also, define a new format
(2.0) that contains the qid field. This way, we can distinguish files that
contain qid and those that do not.

* Update MetaInfo test

* Simply group assignment logic

* Explicitly set qid=nullptr in NativeDataIter

NativeDataIter's callback does not support qid field. Users of NativeDataIter
will need to call setGroup() function separately to set group information.

* Save qids_ in SaveBinary()

* Upgrade dmlc-core submodule

* Add a test for reading qid

* Add contributor

* Check the size of qids_

* Document qid format
This commit is contained in:
liuliang01
2018-07-01 04:24:03 +08:00
committed by Philip Hyunsu Cho
parent 18813a26ab
commit 0cf88d036f
11 changed files with 182 additions and 15 deletions

View File

@@ -53,6 +53,8 @@ class MetaInfo {
std::vector<bst_uint> group_ptr_;
/*! \brief weights of each instance, optional */
std::vector<bst_float> weights_;
/*! \brief session-id of each instance, optional */
std::vector<uint64_t> qids_;
/*!
* \brief initialized margins,
* if specified, xgboost will start from this init margin
@@ -60,7 +62,9 @@ class MetaInfo {
*/
std::vector<bst_float> base_margin_;
/*! \brief version flag, used to check version of this info */
static const int kVersion = 1;
static const int kVersion = 2;
/*! \brief version that introduced qid field */
static const int kVersionQidAdded = 2;
/*! \brief default constructor */
MetaInfo() = default;
/*!
@@ -136,6 +140,9 @@ struct Entry {
inline static bool CmpValue(const Entry& a, const Entry& b) {
return a.fvalue < b.fvalue;
}
inline bool operator==(const Entry& other) const {
return (this->index == other.index && this->fvalue == other.fvalue);
}
};
/*!