Disallow multiple roots for tree_method=hist (#1979)

As discussed in issue #1978, tree_method=hist ignores the parameter
param.num_roots and assumes that the tree has only one root. In
particular, when the InitData() method initializes row_set_collection_, it
assigns all rows to node 0, a hard-coded value.

For now, the updater will simply fail when num_roots exceeds 1. I will revise
the updater soon to support multiple roots.
Philip Cho 2017-01-21 12:02:29 -08:00 committed by Tianqi Chen
parent 036ee55fe0
commit 5d74578095
2 changed files with 4 additions and 0 deletions

@@ -29,6 +29,7 @@ DMLC_REGISTRY_LINK_TAG(updater_colmaker);
DMLC_REGISTRY_LINK_TAG(updater_skmaker);
DMLC_REGISTRY_LINK_TAG(updater_refresh);
DMLC_REGISTRY_LINK_TAG(updater_prune);
DMLC_REGISTRY_LINK_TAG(updater_fast_hist);
DMLC_REGISTRY_LINK_TAG(updater_histmaker);
DMLC_REGISTRY_LINK_TAG(updater_sync);
} // namespace tree

@@ -139,6 +139,9 @@ class FastHistMaker: public TreeUpdater {
tstart = dmlc::GetTime();
this->InitData(gmat, gpair, *p_fmat, *p_tree);
time_init_data = dmlc::GetTime() - tstart;
// FIXME(hcho3): this code is broken when param.num_roots > 1. Please fix it
CHECK_EQ(p_tree->param.num_roots, 1)
<< "tree_method=hist does not support multiple roots at this moment";
for (int nid = 0; nid < p_tree->param.num_roots; ++nid) {
tstart = dmlc::GetTime();
hist_.AddHistRow(nid);
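
The guard added in this commit can be illustrated in isolation. The following is a minimal sketch, not the updater's actual code: TreeParamSketch, RowSetSketch, and InitRowSet are hypothetical names invented for illustration, and a plain C++ exception stands in for the CHECK_EQ macro. It assumes, as the commit message describes, that initialization places every row under node 0, so any tree with more than one root would be set up incorrectly and is rejected up front.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the tree parameters; only num_roots matters here.
struct TreeParamSketch {
  int num_roots = 1;
};

// Hypothetical stand-in for row_set_collection_: node id -> row indices.
struct RowSetSketch {
  std::vector<std::vector<std::size_t>> rows_of_node;
};

// Assigns all rows to node 0, mirroring the hard-coded behaviour described
// in the commit message, and refuses to run when that assumption is violated.
RowSetSketch InitRowSet(std::size_t num_rows, const TreeParamSketch& param) {
  // Plain-C++ substitute for CHECK_EQ(p_tree->param.num_roots, 1).
  if (param.num_roots != 1) {
    throw std::invalid_argument(
        "tree_method=hist does not support multiple roots at this moment");
  }
  RowSetSketch row_set;
  row_set.rows_of_node.resize(1);  // only node 0 exists at this point
  row_set.rows_of_node[0].reserve(num_rows);
  for (std::size_t row = 0; row < num_rows; ++row) {
    row_set.rows_of_node[0].push_back(row);
  }
  return row_set;
}
```

Failing fast here trades functionality for correctness: a caller who sets num_roots to 2 gets an explicit error instead of a tree whose second root never receives any rows.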