2 Commits

Author SHA1 Message Date
Xu Xiao
ef9af33a00 [HOTFIX] distributed training with hist method (#4716)
* add parallel test for hist.EvalualiteSplit

* update test_openmp.py

* update test_openmp.py

* update test_openmp.py

* update test_openmp.py

* update test_openmp.py

* fix OMP schedule policy

* fix clang-tidy

* add logging: total_num_bins

* fix

* fix

* test

* replace guided OPENMP policy with static in updater_quantile_hist.cc
2019-08-13 11:27:29 -07:00
Philip Cho
2715baef64 Fix bugs in multithreaded ApplySplitSparseData() (#2161)
* Bugfix 1: Fix segfault in multithreaded ApplySplitSparseData()

When there are more threads than rows in rowset, some threads end up
with empty ranges, causing them to crash. (iend - 1 needs to be
accessible as part of algorithm)

Fix: run only those threads with nonempty ranges.

* Add regression test for Bugfix 1

* Moving python_omp_test to existing python test group

Turns out you don't need to set "OMP_NUM_THREADS" to enable
multithreading. Just add nthread parameter.

* Bugfix 2: Fix corner case of ApplySplitSparseData() for categorical feature

When split value is less than all cut points, split_cond is set
incorrectly.

Fix: set split_cond = -1 to indicate this scenario

* Bugfix 3: Initialize data layout indicator before using it

data_layout_ is accessed before being set; this variable determines
whether feature 0 is included in feat_set.

Fix: re-order code in InitData() to initialize data_layout_ first

* Adding regression test for Bugfix 2

Unfortunately, no regression test for Bugfix 3, as there is no
way to deterministically assign value to an uninitialized variable.
2017-04-02 11:37:39 -07:00