Philip Hyunsu Cho
f4e7b707c9
Revert #4529 (#5008)
* Revert "Optimize ‘hist’ for multi-core CPU (#4529)"
This reverts commit 4d6590be3c9a043d44d9e4fe0a456a9f8179ec72.
* Fix build
2019-11-12 09:35:03 -08:00
Jiaming Yuan
f0064c07ab
Refactor configuration [Part II]. (#4577)
* Refactor configuration [Part II].
* General changes:
** Remove `Init` methods to avoid ambiguity.
** Remove `Configure(std::map<>)` to avoid redundant copying and prepare for
parameter validation. (`std::vector` is returned from `InitAllowUnknown`).
** Add name to tree updaters for easier debugging.
* Learner changes:
** Make `LearnerImpl` the only source of configuration.
All configurations are stored and carried out by `LearnerImpl::Configure()`.
** Remove booster in C API.
It was originally kept for "compatibility reasons", but no rationale was
stated, so it is removed here.
** Add a `metric_names_` field in `LearnerImpl`.
** Remove `LazyInit`. Configuration will always be lazy.
** Run `Configure` before every iteration.
* Predictor changes:
** Allocate both CPU and GPU predictors.
** Remove cpu_predictor from gpu_predictor.
`GBTree` is now used to dispatch the predictor.
** Remove some GPU Predictor tests.
* IO
No IO changes. Binary model format stability is tested by comparing the
hash values of saved models between two commits.
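A minimal sketch of the kind of hash comparison described above, assuming the same model has been saved to disk by builds of the two commits. The helper names `file_digest` and `models_identical` and the choice of SHA-256 are illustrative assumptions; the commit does not say which hash or tooling was used.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def models_identical(path_a, path_b):
    """True if two saved model files are byte-for-byte identical (hypothetical helper)."""
    return file_digest(path_a) == file_digest(path_b)
```

Matching digests indicate that the serialized bytes did not change between the two builds; they say nothing about whether either model is correct.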
2019-07-20 08:34:56 -04:00
Jiaming Yuan
d9a47794a5
Fix CPU hist init for sparse dataset. (#4625)
* Fix CPU hist init for sparse dataset.
* Implement sparse histogram cut.
* Allow empty features.
* Fix windows build, don't use sparse in distributed environment.
* Comments.
* Smaller threshold.
* Fix windows omp.
* Fix msvc lambda capture.
* Fix MSVC macro.
* Fix MSVC initialization list.
* Fix MSVC initialization list x2.
* Preserve categorical feature behavior.
* Rename matrix to sparse cuts.
* Reuse UseGroup.
* Check for categorical data when adding cut.
Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
* Sanity check.
* Fix comments.
* Fix comment.
2019-07-04 16:27:03 -07:00
Egor Smirnov
4d6590be3c
Optimize ‘hist’ for multi-core CPU (#4529)
* Initial performance optimizations for xgboost
* remove includes
* revert float->double
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* Check existence of _mm_prefetch and __builtin_prefetch
* Fix lint
* optimizations for CPU
* applying comments in review
* add some comments, code refactoring
* fixing issues in CI
* adding runtime checks
* remove 1 extra check
* remove extra checks in BuildHist
* remove checks
* add debug info
* added debug info
* revert changes
* added comments
* Apply suggestions from code review
Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
* apply review comments
* Remove unused function CreateNewNodes()
* Add descriptive comment on node_idx variable in QuantileHistMaker::Builder::BuildHistsBatch()
2019-06-27 11:33:49 -07:00
Egor Smirnov
711397d645
Optimizations of pre-processing for 'hist' tree method (#4310)
* optimizations for pre-processing
* code cleaning
* code cleaning
* code cleaning after review
* Apply suggestions from code review
Co-Authored-By: SmirnovEgorRu <egor.smirnov@intel.com>
2019-04-16 17:36:19 -07:00
Jiaming Yuan
09bd9e68cf
Use Monitor in quantile hist. (#4273)
2019-03-20 09:26:22 +08:00
Nan Zhu
1dac5e2410
more correct way to build node stats in distributed fast hist (#4140)
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* more changes
* temp
* update
* update rabit
* change the histogram
* update kfactor
* sync per node stats
* temp
* update
* final
* code clean
* update rabit
* more cleanup
* fix errors
* fix failed tests
* enforce c++11
* broadcast subsampled feature correctly
* init col
* temp
* col sampling
* fix histmatrix init
* fix col sampling
* remove cout
* fix out of bound access
* fix core dump
remove core dump file
* update
* add fid
* update
* revert some changes
* temp
* temp
* pass all tests
* bring back some tests
* recover some changes
* fix lint issue
* enable monotone and interaction constraints
* don't specify default for monotone and interactions
* recover column init part
* more recovery
* fix core dumps
* code clean
* revert some changes
* fix test compilation issue
* fix lint issue
* resolve compilation issue
* fix issues of lint caused by rebase
* fix stylistic changes and change variable names
* modularize depth width
* address the comments
* fix failed tests
* wrap perf timers with class
* temp
* pass all lossguide
* pass tests
* add comments
* more changes
* use separate flow for single and tests
* add test for lossguide hist
* remove duplications
* syncing stats for only once
* recover more changes
* recover more changes
* fix root-stats
* simplify code
* remove outdated comments
2019-02-18 13:45:30 -08:00
Jiaming Yuan
2e618af743
Fix cpplint. (#4157)
* Add comment after #endif.
* Add missing headers.
2019-02-18 00:16:29 +08:00
Nan Zhu
c18a3660fa
Separate Depthwidth and Lossguide growing policy in fast histogram (#4102)
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* init
* more changes
* temp
* update
* update rabit
* change the histogram
* update kfactor
* sync per node stats
* temp
* update
* final
* code clean
* update rabit
* more cleanup
* fix errors
* fix failed tests
* enforce c++11
* broadcast subsampled feature correctly
* init col
* temp
* col sampling
* fix histmatrix init
* fix col sampling
* remove cout
* fix out of bound access
* fix core dump
remove core dump file
* disable test temporarily
* update
* add fid
* print perf data
* update
* revert some changes
* temp
* temp
* pass all tests
* bring back some tests
* recover some changes
* fix lint issue
* enable monotone and interaction constraints
* don't specify default for monotone and interactions
* recover column init part
* more recovery
* fix core dumps
* code clean
* revert some changes
* fix test compilation issue
* fix lint issue
* resolve compilation issue
* fix issues of lint caused by rebase
* fix stylistic changes and change variable names
* use regtree internal function
* modularize depth width
* address the comments
* fix failed tests
* wrap perf timers with class
* fix lint
* fix num_leaves count
* fix indentation
* Update src/tree/updater_quantile_hist.cc
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.h
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.cc
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.cc
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.cc
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.h
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* merge
* fix compilation
2019-02-13 12:56:19 -08:00
Jiaming Yuan
017c97b8ce
Clean up training code. (#3825)
* Remove GHistRow, GHistEntry, GHistIndexRow.
* Remove kSimpleStats.
* Remove CheckInfo, SetLeafVec in GradStats and in SKStats.
* Clean up the GradStats.
* Cleanup calcgain.
* Move LossChangeMissing out of common.
* Remove [] operator from GHistIndexBlock.
2019-02-07 14:22:13 +08:00
Nan Zhu
ae3bb9c2d5
Distributed Fast Histogram Algorithm (#4011)
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* init
* allow hist algo
* more changes
* temp
* update
* remove hist sync
* update rabit
* change hist size
* change the histogram
* update kfactor
* sync per node stats
* temp
* update
* final
* code clean
* update rabit
* more cleanup
* fix errors
* fix failed tests
* enforce c++11
* fix lint issue
* broadcast subsampled feature correctly
* revert some changes
* fix lint issue
* enable monotone and interaction constraints
* don't specify default for monotone and interactions
* update docs
2019-02-05 05:12:53 -08:00
Jiaming Yuan
19ee0a3579
Refactor fast-hist, add tests for some updaters. (#3836)
Add unittest for prune.
Add unittest for refresh.
Refactor fast_hist.
* Remove fast_hist_param.
* Rename to quantile_hist.
Add unittests for QuantileHist.
* Refactor QuantileHist into .h and .cc file.
* Remove sync.h.
* Remove MGPU_mock test.
Rename fast hist method to quantile hist.
2018-11-07 21:15:07 +13:00