xgboost

Author	SHA1	Message	Date
Philip Hyunsu Cho	3a150742c7	Update dmlc-core submodule (#3907 )	2018-11-15 18:50:49 -08:00
theycallhimavi	0a0d4239d3	Fix Typo in learner.cc (#3902 )	2018-11-16 12:54:36 +13:00
Jiaming Yuan	fe999bf968	Add back python2 tests for Travis light weight tests. (#3901 )	2018-11-15 22:17:35 +13:00
Jiaming Yuan	2ea0f887c1	Refactor Python tests. (#3897 ) * Deprecate nose tests. * Format python tests.	2018-11-15 13:56:33 +13:00
Philip Hyunsu Cho	c76d993681	Enforce naming style in Python lint (#3896 )	2018-11-14 10:35:25 -08:00
Philip Hyunsu Cho	a2a8954659	Update dmlc-core submodule (#3891 )	2018-11-14 01:51:27 -08:00
Rory Mitchell	7af0946ac1	Improve update position function for gpu_hist (#3895 )	2018-11-14 19:33:29 +13:00
Dr. Kashif Rasul	143475b27b	use gain for sklearn feature_importances_ (#3876 ) * use gain for sklearn feature_importances_ `gain` is a better feature importance criterion than the currently used `weight` * added importance_type to class * fixed test * white space * fix variable name * fix deprecation warning * fix exp array * white spaces	2018-11-13 03:30:40 -08:00
Rory Mitchell	926eb651fe	Minor refactor of split evaluation in gpu_hist (#3889 ) * Refactor evaluate split into shard * Use span in evaluate split * Update google tests	2018-11-14 00:11:20 +13:00
Jiaming Yuan	daf77ca7b7	Enable running objectives with 0 GPU. (#3878 ) * Enable running objectives with 0 GPU. * Enable 0 GPU for objectives. * Add doc for GPU objectives. * Fix some objectives defaulted to running on all GPUs.	2018-11-13 20:19:59 +13:00
Jiaming Yuan	97984f4890	Fix gpu coordinate running on multi-gpu. (#3893 )	2018-11-13 19:09:55 +13:00
ajing	0ddb8a7661	Update README.md (#3872 ) SparkWithDataFrame was not there anymore. So replace with SparkMLlibPipeline.scala	2018-11-12 11:03:13 -08:00
Jiacheng Xu	d810e6dec9	Fix a typo in the R-package documentation: max.deph -> max.depth (#3890 ) Signed-off-by: Jiacheng Xu <xjcmaxwellcjx@gmail.com>	2018-11-12 01:43:23 -08:00
Philip Hyunsu Cho	be0bb7dd90	Remove unnecessary warning when 'gblinear' is selected (#3888 )	2018-11-09 12:30:38 -08:00
Philip Hyunsu Cho	e38d5a6831	Document current limitation in number of features (#3886 )	2018-11-09 00:32:43 -08:00
Philip Hyunsu Cho	828d75714d	Fix #3857 : take down AWS YARN tutorial, as it is outdated (#3885 )	2018-11-08 23:08:32 -08:00
Philip Hyunsu Cho	ad6e0d55f1	Fix coef_ and intercept_ signature to be compatible with sklearn.RFECV (#3873 ) * Fix coef_ and intercept_ signature to be compatible with sklearn.RFECV * Fix lint * Fix lint	2018-11-08 19:41:35 -08:00
Jiaming Yuan	19ee0a3579	Refactor fast-hist, add tests for some updaters. (#3836 ) Add unittest for prune. Add unittest for refresh. Refactor fast_hist. * Remove fast_hist_param. * Rename to quantile_hist. Add unittests for QuantileHist. * Refactor QuantileHist into .h and .cc file. * Remove sync.h. * Remove MGPU_mock test. Rename fast hist method to quantile hist.	2018-11-07 21:15:07 +13:00
Philip Hyunsu Cho	2b045aa805	Make C++ unit tests run and pass on Windows (#3869 ) * Make C++ unit tests run and pass on Windows * Fix logic for external memory. The letter ':' is part of drive letter, so remove the drive letter before splitting on ':'. * Cosmetic syntax changes to keep MSVC happy. * Fix lint * Add Windows guard	2018-11-06 17:17:24 -08:00
Jelle Zijlstra	d9642cf757	handle $PATH not being set in python library (#3845 ) Fixes #3844	2018-11-06 15:27:02 -08:00
Nikita Titov	1bf4083dc6	open README with utf-8 and add gcc-8 (#3867 )	2018-11-06 14:53:33 -08:00
Philip Hyunsu Cho	20d5abf919	Disallow std::regex since it's not supported by GCC 4.8.x (#3870 )	2018-11-05 22:57:04 -08:00
Jiaming Yuan	f1275f52c1	Fix specifying gpu_id, add tests. (#3851 ) * Rewrite gpu_id related code. * Remove normalised/unnormalised operations. * Address difference between `Index' and `Device ID'. * Modify doc for `gpu_id'. * Better LOG for GPUSet. * Check specified n_gpus. * Remove inappropriate `device_idx' term. * Clarify GpuIdType and size_t.	2018-11-06 18:17:53 +13:00
Jiaming Yuan	1698fe64bb	Document GPU objectives in NEWS. (#3865 )	2018-11-05 14:46:45 +13:00
Philip Hyunsu Cho	91cc14ea70	Add another contributor for rabit update	2018-11-04 10:29:21 -08:00
Philip Hyunsu Cho	78ec77fa97	Release 0.81 version (#3864 ) * Release 0.81 version * Update NEWS.md v0.81	2018-11-04 05:49:11 -08:00
Philip Hyunsu Cho	c22e90d5d2	Correct typo	2018-11-04 05:22:53 -08:00
Philip Hyunsu Cho	6da462234e	Move MinGW-w64 + Python section to the end, since it's 'advanced' (#3863 )	2018-11-04 05:12:27 -08:00
Philip Hyunsu Cho	a650131fc3	Update doc: colsample_bylevel now works for tree_method=hist (#3862 ) This feature was introduced by #3635	2018-11-04 02:25:25 -08:00
Philip Hyunsu Cho	91537e7353	Fix #3342 and h2oai/h2o4gpu#625 : Save predictor parameters in model file (#3856 ) * Fix #3342 and h2oai/h2o4gpu#625: Save predictor parameters in model file This allows pickled models to retain predictor attributes, such as 'predictor' (whether to use CPU or GPU) and 'n_gpu' (number of GPUs to use). Related: h2oai/h2o4gpu#625 Closes #3342. TODO. Write a test. * Fix lint * Do not load GPU predictor into CPU-only XGBoost * Add a test for pickling GPU predictors * Make sample data big enough to pass multi GPU test * Update test_gpu_predictor.cu	2018-11-03 21:45:38 -07:00
Philip Hyunsu Cho	e04ab56b57	Fix #3747 : Add coef_ and intercept_ as properties of sklearn wrapper (#3855 ) * Fix #3747: Add coef_ and intercept_ as properties of sklearn wrapper Scikit-learn expects linear learners to expose `coef_` and `intercept_` as properties. Closes #3747. * Fix lint	2018-11-02 01:44:37 -07:00
Philip Hyunsu Cho	ad68865d6b	[Blocking] Fix #3840 : Clean up logic for parsing tree_method parameter (#3849 ) * Clean up logic for converting tree_method to updater sequence * Use C++11 enum class for extra safety Compiler will give warnings if switch statements don't handle all possible values of C++11 enum class. Also allow enum class to be used as DMLC parameter. * Fix compiler error + lint * Address reviewer comment * Better docstring for DECLARE_FIELD_ENUM_CLASS * Fix lint * Add C++ test to see if tree_method is recognized * Fix clang-tidy error * Add test_learner.h to R package * Update comments * Fix lint error	2018-11-01 19:33:35 -07:00
Philip Hyunsu Cho	583c88bce7	[jvm-packages] Require vanilla Apache Spark (#3854 )	2018-11-01 19:15:40 -07:00
Philip Hyunsu Cho	2febc105a4	[jvm-packages] Fix JVM doc build (#3853 ) To work around the bug https://issues.apache.org/jira/browse/SUREFIRE-1588, set useSystemClassLoader=false.	2018-11-01 15:16:08 -07:00
Jonathan Friedman	45d321da28	Fix typo in docs (#3852 ) Fix typo in docs	2018-11-01 13:03:59 -07:00
Philip Hyunsu Cho	411df9f878	Test wheels on CUDA 10.0 container for compatibility (#3838 )	2018-11-01 08:34:47 -07:00
Rory Mitchell	42200ec03e	Allow XGBRanker sklearn interface to use other xgboost ranking objectives (#3848 )	2018-11-01 13:34:25 +13:00
Chen Qin	87f49995be	update rabit (#3835 )	2018-10-30 09:15:19 -07:00
Zhao Hang	e3c1afac6b	Update parameter.rst (#3843 )	2018-10-31 00:19:45 +13:00
Matthew Tovbin	d81fedb955	[jvm-packages] RabitTracker for Scala: allow specifying host ip from the xgboost-tracker.properties file (#3833 )	2018-10-26 22:01:36 -07:00
Nan Zhu	5fbe230636	[jvm-packages] documenting tracker (#3831 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * documenting tracker * Make it a separate note	2018-10-25 18:53:46 -07:00
Philip Hyunsu Cho	d83c818000	Recommend pickling as the way to save XGBClassifier / XGBRegressor / XGBRanker (#3829 ) The `save_model()` and `load_model()` methods only handle the part of the model that's common to all language interfaces and do not preserve Python-specific attributes, such as `feature_names`. More crucially, the label encoder is not preserved either; this is needed for the scikit-learn wrapper, since you may have string labels. Fix: Explicitly recommend pickling as the way to save scikit-learn model objects.	2018-10-25 11:12:41 -07:00
Andy Adinets	2a59ff2f9b	Multi-GPU support in GPUPredictor. (#3738 ) * Multi-GPU support in GPUPredictor. - GPUPredictor is multi-GPU - removed DeviceMatrix, as it has been made obsolete by using HostDeviceVector in DMatrix * Replaced pointers with spans in GPUPredictor. * Added a multi-GPU predictor test. * Fix multi-gpu test. * Fix n_rows < n_gpus. * Reinitialize shards when GPUSet is changed. * Tests range of data. * Remove commented code. * Remove commented code.	2018-10-23 22:59:11 -07:00
Bruno Tremblay	32de54fdee	Update R-package/R/xgb.ggplot.R (#3820 ) Changed width parameter of var important ggplot from 0.05 to 0.5 to make it more visible when displaying more variables.	2018-10-23 20:52:33 -07:00
Philip Hyunsu Cho	02130af47d	Enable auto-locking of issues closed long ago (#3821 ) * Enable auto-locking of issues closed long ago Issues that were closed more than 90 days ago will be locked automatically so that no additional comments would be allowed. We will use a bot to do this: https://probot.github.io/apps/lock/ Background: As a maintainer, I often see people leaving comments on old issue posts that were closed long ago. Those comments are hard to discover and assist with, since they get buried under the list of other active issues. With the change, users who want to follow up on an old issue would be asked to file a new issue. * Exempt `feature-request` from auto locking * Disable comment to avoid triggering notification	2018-10-23 19:21:58 -07:00
Nan Zhu	4ae225a08d	[Blocking][jvm-packages] fix the early stopping feature (#3808 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * temp * add method for classifier and regressor * update tutorial * address the comments * update	2018-10-23 14:53:13 -07:00
Philip Hyunsu Cho	e26b5d63b2	[jvm-packages] Upgrade Scala to 2.11.12 to address CVE-2017-15288 (#3816 ) A privilege escalation vulnerability (CVE-2017-15288) has been identified in the Scala compilation daemon. See https://nvd.nist.gov/vuln/detail/CVE-2017-15288 Fix: Upgrade Scala to 2.11.12.	2018-10-22 10:15:30 -07:00
Philip Hyunsu Cho	abf2f661be	Fix #3708 : Use dmlc::TemporaryDirectory to handle temporaries in cross-platform way (#3783 ) * Fix #3708: Use dmlc::TemporaryDirectory to handle temporaries in cross-platform way Also install git inside NVIDIA GPU container * Update dmlc-core	2018-10-18 10:16:04 -07:00
Philip Hyunsu Cho	55ee9a92a1	Fix Python environment for distributed unit tests (#3806 )	2018-10-18 00:12:02 -07:00
Philip Hyunsu Cho	b38c636d05	Fix #3523 : Fix CustomGlobalRandomEngine for R (#3781 ) Symptom: Apple Clang's implementation of `std::shuffle` doesn't work correctly when it is run with the random bit generator for the R package: ```cpp CustomGlobalRandomEngine::result_type CustomGlobalRandomEngine::operator()() { return static_cast<result_type>( std::floor(unif_rand() * CustomGlobalRandomEngine::max())); } ``` Minimal reproduction of failure (compile using Apple Clang 10.0): ```cpp std::vector<int> feature_set(100); std::iota(feature_set.begin(), feature_set.end(), 0); // initialize with 0, 1, 2, 3, ..., 99 std::shuffle(feature_set.begin(), feature_set.end(), common::GlobalRandom()); // This returns 0, 1, 2, ..., 99, so content didn't get shuffled at all!!! ``` Note that this bug is platform-dependent; it does not appear when GCC or upstream LLVM Clang is used. Diagnosis: Apple Clang's `std::shuffle` expects 32-bit integer inputs, whereas `CustomGlobalRandomEngine::operator()` produces 64-bit integers. Fix: Have `CustomGlobalRandomEngine::operator()` produce 32-bit integers. Closes #3523.	2018-10-15 09:39:13 -07:00

3580 Commits