xgboost

Author	SHA1	Message	Date
Philip Hyunsu Cho	99a290489c	Update Python docstring for ranking functions (#4121 ) * Update Python docstring for ranking functions * Fix formatting	2019-02-10 12:22:02 -08:00
Nan Zhu	3320a52192	[jvm-packages] force use per-group weights in spark layer (#4118 )	2019-02-10 05:38:03 +08:00
Yuan (Terry) Tang	ba584e5e9f	Add link to InfoWorld 2019 award (#4116 )	2019-02-08 12:43:23 -08:00
Rong Ou	2a9b085bc8	[jvm-packages] minor fix of params (#4114 )	2019-02-08 00:21:59 -08:00
Jiaming Yuan	f8ca2960fc	Use nccl group calls to prevent from dead lock. (#4113 ) * launch all reduce sequentially. * Fix gpu_exact test memory leak.	2019-02-08 06:12:39 +08:00
Nan Zhu	05243642bb	[jvm-packages] better fix for shutdown applications (#4108 ) * intentionally failed task * throw exception * more * stop sparkcontext directly * stop from another thread * new scope * use a new thread * daemon threads * don't join the killer thread * remove injected errors * add comments	2019-02-07 09:02:17 -08:00
Jiaming Yuan	017c97b8ce	Clean up training code. (#3825 ) * Remove GHistRow, GHistEntry, GHistIndexRow. * Remove kSimpleStats. * Remove CheckInfo, SetLeafVec in GradStats and in SKStats. * Clean up the GradStats. * Cleanup calcgain. * Move LossChangeMissing out of common. * Remove [] operator from GHistIndexBlock.	2019-02-07 14:22:13 +08:00
Nan Zhu	325b16bccd	[jvm-packages] fix return type of setEvalSets (#4105 )	2019-02-06 11:00:29 -08:00
Nan Zhu	ae3bb9c2d5	Distributed Fast Histogram Algorithm (#4011 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * init * allow hist algo * more changes * temp * update * remove hist sync * udpate rabit * change hist size * change the histogram * update kfactor * sync per node stats * temp * update * final * code clean * update rabit * more cleanup * fix errors * fix failed tests * enforce c++11 * fix lint issue * broadcast subsampled feature correctly * revert some changes * fix lint issue * enable monotone and interaction constraints * don't specify default for monotone and interactions * update docs	2019-02-05 05:12:53 -08:00
Jiaming Yuan	8905df4a18	Perform clang-tidy on both cpp and cuda source. (#4034 ) * Basic script for using compilation database. * Add `GENERATE_COMPILATION_DATABASE' to CMake. * Rearrange CMakeLists.txt. * Add basic python clang-tidy script. * Remove modernize-use-auto. * Add clang-tidy to Jenkins * Refine logic for correct path detection In Jenkins, the project root is of form /home/ubuntu/workspace/xgboost_PR-XXXX * Run clang-tidy in CUDA 9.2 container * Use clang_tidy container	2019-02-05 16:07:43 +08:00
Jiaming Yuan	1088dff42c	Prevent training without setting up caches. (#4066 ) * Prevent training without setting up caches. * Add warning for internal functions. * Check number of features. * Address reviewer's comment.	2019-02-03 01:03:29 -08:00
Philip Hyunsu Cho	7a652a8c64	Speed up Jenkins by not compiling CMake (#4099 )	2019-02-03 00:08:14 -08:00
tmitanitky	59f868bc60	enable xgb_model in scklearn XGBClassifier and test. (#4092 ) * Enable xgb_model parameter in XGClassifier scikit-learn API https://github.com/dmlc/xgboost/issues/3049 * add test_XGBClassifier_resume(): test for xgb_model parameter in XGBClassifier API. * Update test_with_sklearn.py * Fix lint	2019-01-31 11:29:19 -08:00
Nan Zhu	0d0ce32908	[jvm-packages] adding logs for parameters (#4091 )	2019-01-30 21:50:55 -08:00
Philip Hyunsu Cho	a60e224484	Add Jenkins status badge (#4090 )	2019-01-30 14:03:18 -08:00
Nan Zhu	e0094d996e	fix doc about max_depth (#4078 ) * fix doc * Update doc/parameter.rst Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>	2019-01-30 12:53:44 -08:00
Philip Hyunsu Cho	a1c35cadf0	Fix failing Travis CI on Mac (#4086 ) * Fix failing Travis CI on Mac Use Homebrew Addon + latest Mac image * Use long command for pytest * Downgrade OSX image to xcode9.3, to use Java 8 * Install pytest in Python 2 environment * Remove clang-tidy from Travis	2019-01-30 09:43:57 -08:00
Jiaming Yuan	4fac9874e0	Check booster for dart in feature importance. (#4073 ) * Check booster for dart in feature importance.	2019-01-22 16:03:54 +08:00
Jiaming Yuan	301cef4638	Correct JVM CMake GPU flag. (#4071 )	2019-01-21 20:36:38 +08:00
Rory Mitchell	1fc37e4749	Require leaf statistics when expanding tree (#4015 ) * Cache left and right gradient sums * Require leaf statistics when expanding tree	2019-01-17 21:12:20 -08:00
Andy Adinets	0f8af85f64	Fixed single-GPU tests. (#4053 ) - ./testxgboost (without filters) failed if run on a multi-GPU machine because the memory was allocated on the current device, but device 0 was always passed into LaunchN	2019-01-11 09:33:15 +02:00
Egor Smirnov	5f151c5cf3	Performance optimizations for Intel CPUs (#3957 ) * Initial performance optimizations for xgboost * remove includes * revert float->double * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * fix for CI * Check existence of _mm_prefetch and __builtin_prefetch * Fix lint	2019-01-08 21:08:13 -08:00
KyleLi1985	dade7c3aff	[jvm-packages] Performance consideration and Alignment input parameter of repartition function (#4049 )	2019-01-07 08:38:05 -08:00
Nan Zhu	773ddbcfcb	[BLOCKING] fix the issue with infrequent feature (#4045 ) * fix the issue with infrequent feature * handle exception * use only 2 workers * address the comments	2019-01-06 16:01:03 -08:00
Nan Zhu	e290ec9a80	[jvm-packages] fix safe execution (#4046 )	2019-01-05 19:45:37 -08:00
Kodi Arfer	6a569b8cd9	Avoid generating NaNs in UnwoundPathSum (#3943 ) * Avoid generating NaNs in UnwoundPathSum. Kudos to Jakub Zakrzewski for tracking down the bug. * Add a test	2019-01-03 15:04:46 -08:00
Jiaming Yuan	55bc149efb	Fix sparse page segfault. (#4040 ) * Remove usage of raw pointers in SparsePageSource.	2019-01-03 23:40:40 +08:00
Shayak Banerjee	431c850c03	[jvm-packages] Updates to Java Booster to support other feature importance measures (#3801 ) * Updates to Booster to support other feature importances * Add returns for Java methods * Pass Scala style checks * Pass Java style checks * Fix indents * Use class instead of enum * Return map string double * A no longer broken build, thanks to mvn package local build * Add a unit test to increase code coverage back * Address code review on main code * Add more unit tests for different feature importance scores * Address more CR	2019-01-02 01:13:14 -08:00
Jiaming Yuan	1f022929f4	Use Span in gpu coordinate. (#4029 ) * Use Span in gpu coordinate. * Use Span in device code. * Fix shard size calculation. - Use lower_bound instead of upper_bound. * Check empty devices.	2019-01-02 11:32:43 +08:00
Nan Zhu	f368d0de2b	[jvm-packages] fix the scalability issue of prediction (#4033 )	2018-12-29 20:46:30 -08:00
Tatsuhito KATO	15fe2f1e7c	fix typos (#4027 )	2018-12-28 00:36:47 +08:00
Jiaming Yuan	be948df23f	Fix ignoring dart in updater configuration. (#4024 ) * Fix ignoring dart in updater configuration.	2018-12-26 18:24:45 +08:00
Jiaming Yuan	9897b5042f	Use Span in GPU exact updater. (#4020 ) * Use Span in GPU exact updater. * Add a small test.	2018-12-26 12:44:46 +08:00
Jiaming Yuan	7735252925	Document num_parallel_tree. (#4022 )	2018-12-25 22:00:58 +08:00
Jiaming Yuan	85939c6a6e	Merge duplicated linear updater parameters. (#4013 ) * Merge duplicated linear updater parameters. * Split up coordinate descent parameter.	2018-12-22 13:21:49 +08:00
Rory Mitchell	f75a21af25	Reduce tree expand boilerplate code (#4008 )	2018-12-20 15:52:28 +13:00
Rory Mitchell	84c99f86f4	Combine TreeModel and RegTree (#3995 )	2018-12-19 12:16:40 +13:00
Nan Zhu	c055a32609	[jvm-packages]support multiple validation datasets in Spark (#3910 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * wrap iterators * enable copartition training and validationset * add parameters * converge code path and have init unit test * enable multi evals for ranking * unit test and doc * update example * fix early stopping * address the offline comments * udpate doc * test eval metrics * fix compilation issue * fix example	2018-12-17 21:03:57 -08:00
Jiaming Yuan	c8c7b9649c	Fix and optimize logger (#4002 ) * Fix logging switch statement. * Remove debug_verbose_ in AllReducer. * Don't construct the stream when not needed. * Make default constructor deleted. * Remove redundant IsVerbose.	2018-12-17 19:23:05 +08:00
Sam Wilkinson	a2dc929598	Update CONTRIBUTORS.md (#3999 )	2018-12-15 18:10:52 +08:00
Andy Adinets	42bf90eb8f	Column sampling at individual nodes (splits). (#3971 ) * Column sampling at individual nodes (splits). * Documented colsample_bynode parameter. - also updated documentation for colsample_by* parameters * Updated documentation. * GetFeatureSet() returns shared pointer to std::vector. * Sync sampled columns across multiple processes.	2018-12-14 22:37:35 +08:00
Jiaming Yuan	e0a279114e	Unify logging facilities. (#3982 ) * Unify logging facilities. * Enhance `ConsoleLogger` to handle different verbosity. * Override macros from `dmlc`. * Don't use specialized gamma when building with GPU. * Remove verbosity cache in monitor. * Test monitor. * Deprecate `silent`. * Fix doc and messages. * Fix python test. * Fix silent tests.	2018-12-14 19:29:58 +08:00
Sam Wilkinson	fd722d60cd	Deprecation warning for lists passed into DMatrix (#3970 ) * Ensure lists cannot be passed into DMatrix The documentation does not include lists as an allowed type for the data inputted into DMatrix. Despite this, a list can be passed in without an error. This change would prevent a list form being passed in directly.	2018-12-14 19:26:11 +08:00
lyxthe	53f695acf2	scikit-learn api section documentation correction (#3967 ) * update description of early stopping rounds the description of early stopping round was quite inconsistent in the scikit-learn api section since the fit paragraph tells that when early stopping rounds occurs, the last iteration is returned not the best one, but the predict paragraph tells that when the predict is called without ntree_limit specified, then ntree_limit is equals to best_ntree_limit. Thus, when reading the fit part, one could think that it is needed to specify what is the best iter when calling the predict, but when reading the predict part, then the best iter is given by default, it is the last iter that you have to specify if needed. * Update sklearn.py * Update sklearn.py fix doc according to the python_lightweight_test error	2018-12-14 00:27:04 -08:00
Rory Mitchell	3d81c48d3f	Remove leaf vector, add tree serialisation test, fix Windows tests (#3989 )	2018-12-13 10:28:38 +13:00
Tong He	84a3af8dc0	Fix CRAN check warnings/notes (#3988 ) * fix * reorder declaration to match initialization	2018-12-12 08:23:20 -06:00
Andy Adinets	4be5edaf92	Initialized AllReducer counters to 0. (#3987 )	2018-12-12 09:09:20 +13:00
Rory Mitchell	93f9ce9ef9	Single precision histograms on GPU (#3965 ) * Allow single precision histogram summation in gpu_hist * Add python test, reduce run-time of gpu_hist tests * Update documentation	2018-12-10 10:55:30 +13:00
Philip Hyunsu Cho	9af6b689d6	Use int instead of char in CLI config parser (#3976 )	2018-12-07 01:00:21 -08:00
Philip Hyunsu Cho	4f26053b09	Fix typo in Feature Interaction Constraints tutorial (#3975 )	2018-12-06 19:38:40 -08:00

3596 Commits