Yanbo Liang
9fefa2128d
[jvm-packages] Fix early stop with xgboost4j-spark ( #4176 )
...
* Fix early stop with xgboost4j-spark
* Update XGBoost.java
* Update XGBoost.java
* Update XGBoost.java
Use -Float.MAX_VALUE as the lower bound, in case there is a positive metric.
* Only update best score if the current score is better (no update when equal)
* Update xgboost-spark tutorial to fix early stopping docs.
2019-03-01 13:02:57 -08:00
Jiaming Yuan
7ea5675679
Add PushCSC for SparsePage. ( #4193 )
...
* Add PushCSC for SparsePage.
* Move Push* definitions into cc file.
* Add std:: prefix to `size_t` to make clang++ happy.
* Address monitor count == 0.
2019-03-02 01:58:08 +08:00
Patrick Ford
74009afcac
Added trees_to_df() method for Booster class ( #4153 )
...
* add test_parse_tree.py to tests/python
* Fix formatting
* Fix pylint error
* Ignore 'no member' error for Pandas dataframe
2019-02-26 13:28:24 -08:00
Nan Zhu
1b7405f688
[jvm-packages] fix comments in objectiveTrait ( #4174 )
2019-02-22 00:32:13 -08:00
Nan Zhu
dc2add96c5
[jvm-packages] upgrade spark version ( #4170 )
2019-02-21 11:51:36 -08:00
Rong Ou
8e0a08fbcf
Update python benchmarking script ( #4164 )
...
* a few tweaks to speed up data generation
* del variable to save memory
* switch to random numpy arrays
2019-02-21 15:16:09 +13:00
Abhai Kollara Dilip
54793544a2
Update README.rst ( #4167 )
...
Fixes error when copy pasting.
2019-02-20 14:46:56 -08:00
Philip Hyunsu Cho
2aaae2e7bb
Fix #4163 : always copy sliced data ( #4165 )
...
* Revert "Accept numpy array view. (#4147 )"
This reverts commit a985a99cf0dacb26a5d734835473d492d3c2a0df.
* Fix #4163 : always copy sliced data
* Remove print() from the test; check shape equality
* Check if 'base' attribute exists
* Fix lint
* Address reviewer comment
* Fix lint
2019-02-20 14:46:34 -08:00
Jiaming Yuan
cecbe0cf71
Fix test_gpu_coordinate. ( #3974 )
...
* Fix test_gpu_coordinate.
* Use `gpu_coord_descent` in test.
* Reduce number of running rounds.
* Remove nthread.
* Use githubusercontent for r-appveyor.
* Use githubusercontent in travis r tests.
2019-02-19 14:09:10 -08:00
Rory Mitchell
c8c472f39a
Fix incorrect device in multi-GPU algorithm ( #4161 )
2019-02-20 09:23:15 +13:00
Nan Zhu
1dac5e2410
more correct way to build node stats in distributed fast hist ( #4140 )
...
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* more changes
* temp
* update
* update rabit
* change the histogram
* update kfactor
* sync per node stats
* temp
* update
* final
* code clean
* update rabit
* more cleanup
* fix errors
* fix failed tests
* enforce c++11
* broadcast subsampled feature correctly
* init col
* temp
* col sampling
* fix histmatrix init
* fix col sampling
* remove cout
* fix out of bound access
* fix core dump
remove core dump file
* update
* add fid
* update
* revert some changes
* temp
* temp
* pass all tests
* bring back some tests
* recover some changes
* fix lint issue
* enable monotone and interaction constraints
* don't specify default for monotone and interactions
* recover column init part
* more recovery
* fix core dumps
* code clean
* revert some changes
* fix test compilation issue
* fix lint issue
* resolve compilation issue
* fix issues of lint caused by rebase
* fix stylistic changes and change variable names
* modularize depthwise
* address the comments
* fix failed tests
* wrap perf timers with class
* temp
* pass all lossguide
* pass tests
* add comments
* more changes
* use separate flow for single and tests
* add test for lossguide hist
* remove duplications
* syncing stats for only once
* recover more changes
* recover more changes
* fix root-stats
* simplify code
* remove outdated comments
2019-02-18 13:45:30 -08:00
Jiaming Yuan
a985a99cf0
Accept numpy array view. ( #4147 )
...
* Accept array view (slice) in metainfo.
2019-02-18 22:21:34 +08:00
Jiaming Yuan
0ff84d950e
Upgrade rabit. ( #4159 )
2019-02-18 22:16:58 +08:00
Kenichi Nagahara
60f05352c5
Fix typo in demo ( #4156 )
2019-02-18 18:42:41 +08:00
Philip Hyunsu Cho
549c8d6ae9
Prevent empty quantiles in fast hist ( #4155 )
...
* Prevent empty quantiles
* Revise and improve unit tests for quantile hist
* Remove unnecessary comment
* Add #2943 as a test case
* Skip test if no sklearn
* Revise misleading comments
2019-02-17 16:01:07 -08:00
Jiaming Yuan
e1240413c9
Fix gpu_hist apply_split test. ( #4158 )
2019-02-18 02:48:28 +08:00
Jiaming Yuan
2e618af743
Fix cpplint. ( #4157 )
...
* Add comment after #endif.
* Add missing headers.
2019-02-18 00:16:29 +08:00
Rory Mitchell
71a604fae3
Fix for windows compilation ( #4139 )
2019-02-17 19:42:32 +13:00
Jiaming Yuan
1fe874e58a
Fix empty subspan. ( #4151 )
...
* Silence the death tests.
2019-02-17 04:48:03 +08:00
Pasha Stetsenko
ff2d4c99fa
Update datatable usage ( #4123 )
2019-02-17 03:44:09 +08:00
Jiaming Yuan
754fe8142b
Make `HistCutMatrix::Init` aware of groups. ( #4115 )
...
* Add checks for group size.
* Simple docs.
* Search group index during hist cut matrix initialization.
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2019-02-16 04:39:41 +08:00
Philip Hyunsu Cho
37ddfd7d6e
Fix broken R test: Install Homebrew GCC ( #4142 )
...
* Fix broken R test: Install Homebrew GCC
Missing GCC Fortran causes installation failure of a dependency package
(igraph)
* Register gfortran system-wide
* Use correct keg
* Set env vars to change compiler choice
* Do not break other Mac builds
* Nuclear option: symlink gfortran
* Use /usr/local/bin instead of /usr/bin
* Symlink library path too
* Update run_test.sh
2019-02-15 07:23:05 -08:00
Rong Ou
d506a8bc63
[jvm-packages] add verbosity param ( #4138 )
2019-02-13 20:57:17 -08:00
Nan Zhu
c18a3660fa
Separate Depthwise and Lossguide growing policy in fast histogram ( #4102 )
...
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* init
* more changes
* temp
* update
* update rabit
* change the histogram
* update kfactor
* sync per node stats
* temp
* update
* final
* code clean
* update rabit
* more cleanup
* fix errors
* fix failed tests
* enforce c++11
* broadcast subsampled feature correctly
* init col
* temp
* col sampling
* fix histmatrix init
* fix col sampling
* remove cout
* fix out of bound access
* fix core dump
remove core dump file
* disable test temporarily
* update
* add fid
* print perf data
* update
* revert some changes
* temp
* temp
* pass all tests
* bring back some tests
* recover some changes
* fix lint issue
* enable monotone and interaction constraints
* don't specify default for monotone and interactions
* recover column init part
* more recovery
* fix core dumps
* code clean
* revert some changes
* fix test compilation issue
* fix lint issue
* resolve compilation issue
* fix issues of lint caused by rebase
* fix stylistic changes and change variable names
* use regtree internal function
* modularize depthwise
* address the comments
* fix failed tests
* wrap perf timers with class
* fix lint
* fix num_leaves count
* fix indentation
* Update src/tree/updater_quantile_hist.cc
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.h
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.cc
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.cc
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.cc
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.h
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* merge
* fix compilation
2019-02-13 12:56:19 -08:00
Rong Ou
3be1b9ae30
reformat benchmark_tree.py to get rid of lint errors ( #4126 )
2019-02-13 18:54:56 +13:00
Rong Ou
9b917cda4f
[jvm-packages] fix simple logic error :) ( #4128 )
...
@CodingCat
2019-02-11 21:47:30 -08:00
Philip Hyunsu Cho
99a290489c
Update Python docstring for ranking functions ( #4121 )
...
* Update Python docstring for ranking functions
* Fix formatting
2019-02-10 12:22:02 -08:00
Nan Zhu
3320a52192
[jvm-packages] force use per-group weights in spark layer ( #4118 )
2019-02-10 05:38:03 +08:00
Yuan (Terry) Tang
ba584e5e9f
Add link to InfoWorld 2019 award ( #4116 )
2019-02-08 12:43:23 -08:00
Rong Ou
2a9b085bc8
[jvm-packages] minor fix of params ( #4114 )
2019-02-08 00:21:59 -08:00
Jiaming Yuan
f8ca2960fc
Use nccl group calls to prevent deadlock. ( #4113 )
...
* launch all reduce sequentially.
* Fix gpu_exact test memory leak.
2019-02-08 06:12:39 +08:00
Nan Zhu
05243642bb
[jvm-packages] better fix for shutdown applications ( #4108 )
...
* intentionally failed task
* throw exception
* more
* stop sparkcontext directly
* stop from another thread
* new scope
* use a new thread
* daemon threads
* don't join the killer thread
* remove injected errors
* add comments
2019-02-07 09:02:17 -08:00
Jiaming Yuan
017c97b8ce
Clean up training code. ( #3825 )
...
* Remove GHistRow, GHistEntry, GHistIndexRow.
* Remove kSimpleStats.
* Remove CheckInfo, SetLeafVec in GradStats and in SKStats.
* Clean up the GradStats.
* Cleanup calcgain.
* Move LossChangeMissing out of common.
* Remove [] operator from GHistIndexBlock.
2019-02-07 14:22:13 +08:00
Nan Zhu
325b16bccd
[jvm-packages] fix return type of setEvalSets ( #4105 )
2019-02-06 11:00:29 -08:00
Nan Zhu
ae3bb9c2d5
Distributed Fast Histogram Algorithm ( #4011 )
...
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* init
* allow hist algo
* more changes
* temp
* update
* remove hist sync
* update rabit
* change hist size
* change the histogram
* update kfactor
* sync per node stats
* temp
* update
* final
* code clean
* update rabit
* more cleanup
* fix errors
* fix failed tests
* enforce c++11
* fix lint issue
* broadcast subsampled feature correctly
* revert some changes
* fix lint issue
* enable monotone and interaction constraints
* don't specify default for monotone and interactions
* update docs
2019-02-05 05:12:53 -08:00
Jiaming Yuan
8905df4a18
Perform clang-tidy on both cpp and cuda source. ( #4034 )
...
* Basic script for using compilation database.
* Add `GENERATE_COMPILATION_DATABASE` to CMake.
* Rearrange CMakeLists.txt.
* Add basic python clang-tidy script.
* Remove modernize-use-auto.
* Add clang-tidy to Jenkins
* Refine logic for correct path detection
In Jenkins, the project root is of form /home/ubuntu/workspace/xgboost_PR-XXXX
* Run clang-tidy in CUDA 9.2 container
* Use clang_tidy container
2019-02-05 16:07:43 +08:00
Jiaming Yuan
1088dff42c
Prevent training without setting up caches. ( #4066 )
...
* Prevent training without setting up caches.
* Add warning for internal functions.
* Check number of features.
* Address reviewer's comment.
2019-02-03 01:03:29 -08:00
Philip Hyunsu Cho
7a652a8c64
Speed up Jenkins by not compiling CMake ( #4099 )
2019-02-03 00:08:14 -08:00
tmitanitky
59f868bc60
enable xgb_model in sklearn XGBClassifier and test. ( #4092 )
...
* Enable xgb_model parameter in XGBClassifier scikit-learn API
https://github.com/dmlc/xgboost/issues/3049
* add test_XGBClassifier_resume():
test for xgb_model parameter in XGBClassifier API.
* Update test_with_sklearn.py
* Fix lint
2019-01-31 11:29:19 -08:00
Nan Zhu
0d0ce32908
[jvm-packages] adding logs for parameters ( #4091 )
2019-01-30 21:50:55 -08:00
Philip Hyunsu Cho
a60e224484
Add Jenkins status badge ( #4090 )
2019-01-30 14:03:18 -08:00
Nan Zhu
e0094d996e
fix doc about max_depth ( #4078 )
...
* fix doc
* Update doc/parameter.rst
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
2019-01-30 12:53:44 -08:00
Philip Hyunsu Cho
a1c35cadf0
Fix failing Travis CI on Mac ( #4086 )
...
* Fix failing Travis CI on Mac
Use Homebrew Addon + latest Mac image
* Use long command for pytest
* Downgrade OSX image to xcode9.3, to use Java 8
* Install pytest in Python 2 environment
* Remove clang-tidy from Travis
2019-01-30 09:43:57 -08:00
Jiaming Yuan
4fac9874e0
Check booster for dart in feature importance. ( #4073 )
...
* Check booster for dart in feature importance.
2019-01-22 16:03:54 +08:00
Jiaming Yuan
301cef4638
Correct JVM CMake GPU flag. ( #4071 )
2019-01-21 20:36:38 +08:00
Rory Mitchell
1fc37e4749
Require leaf statistics when expanding tree ( #4015 )
...
* Cache left and right gradient sums
* Require leaf statistics when expanding tree
2019-01-17 21:12:20 -08:00
Andy Adinets
0f8af85f64
Fixed single-GPU tests. ( #4053 )
...
- ./testxgboost (without filters) failed if run on a multi-GPU machine because
the memory was allocated on the current device, but device 0
was always passed into LaunchN
2019-01-11 09:33:15 +02:00
Egor Smirnov
5f151c5cf3
Performance optimizations for Intel CPUs ( #3957 )
...
* Initial performance optimizations for xgboost
* remove includes
* revert float->double
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* fix for CI
* Check existence of _mm_prefetch and __builtin_prefetch
* Fix lint
2019-01-08 21:08:13 -08:00
KyleLi1985
dade7c3aff
[jvm-packages] Performance consideration and Alignment input parameter of repartition function ( #4049 )
2019-01-07 08:38:05 -08:00
Nan Zhu
773ddbcfcb
[BLOCKING] fix the issue with infrequent feature ( #4045 )
...
* fix the issue with infrequent feature
* handle exception
* use only 2 workers
* address the comments
2019-01-06 16:01:03 -08:00