Deterministic data partitioning for external memory (#6317)

* Make external memory data partitioning deterministic.

* Change the meaning of `page_size` from bytes to number of rows.

* Design a data pool.

* Note for external memory.

* Enable unity build on Windows CI.

* Force garbage collect on test.
Jiaming Yuan
2020-11-11 06:11:06 +08:00
committed by GitHub
parent 9564886d9f
commit 43efadea2e
15 changed files with 334 additions and 88 deletions

@@ -549,8 +549,9 @@ class DMatrix {
   int max_bin);
   virtual DMatrix *Slice(common::Span<int32_t const> ridxs) = 0;
-  /*! \brief page size 32 MB */
-  static const size_t kPageSize = 32UL << 20UL;
+  /*! \brief Number of rows per page in external memory. Approximately 100MB per page for
+   *  dataset with 100 features. */
+  static const size_t kPageSize = 32UL << 12UL;
  protected:
   virtual BatchSet<SparsePage> GetRowBatches() = 0;
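
The "approximately 100MB per page" figure in the new `kPageSize` comment can be checked with a little arithmetic. The sketch below is not part of the commit: `EntrySketch`, `kOldPageSizeBytes`, `kRowsPerPage`, and `ApproxPageBytes` are hypothetical names introduced for illustration, and it assumes each stored entry holds a 32-bit feature index plus a 32-bit float value (8 bytes) and that every row is dense across all 100 features.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one stored entry (assumption: a 32-bit feature
// index plus a 32-bit float value, i.e. 8 bytes per entry).
struct EntrySketch {
  std::uint32_t index;
  float fvalue;
};
static_assert(sizeof(EntrySketch) == 8, "sketch assumes 8-byte entries");

// Old meaning of kPageSize: a byte budget of 32 MiB per page.
constexpr std::size_t kOldPageSizeBytes = 32UL << 20UL;

// New meaning of kPageSize: a row count, 32 * 4096 = 131072 rows per page.
constexpr std::size_t kRowsPerPage = 32UL << 12UL;

// Rough size of one page, assuming every row stores all n_features entries.
// Row offsets and other bookkeeping are ignored.
constexpr std::size_t ApproxPageBytes(std::size_t n_rows, std::size_t n_features) {
  return n_rows * n_features * sizeof(EntrySketch);
}
```

Under these assumptions, `ApproxPageBytes(kRowsPerPage, 100)` evaluates to 131072 * 100 * 8 = 104857600 bytes, which is 100 MiB. Sparse data with fewer stored entries per row would produce proportionally smaller pages, since the page boundary now depends on the row count rather than on a byte budget.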