xgboost

Author	SHA1	Message	Date
Philip Hyunsu Cho	3564b68b98	Fix #3397 : early_stop callback does not maximize metric of form NDCG@n- (#3685 ) * Fix #3397: early_stop callback does not maximize metric of form NDCG@n- Early stopping callback makes splits with '-' letter, which interferes with metrics of form NDCG@n-. As a result, XGBoost tries to minimize NDCG@n-, where it should be maximized instead. Fix. Specify maxsplit=1. * Python 2.x compatibility fix	2018-09-08 19:46:25 -07:00
Andy Adinets	f606cb8ef4	Fixed the performance regression within EvaluateSplits(). (#3680 ) - it turns out creating an std::vector on every call is faster than cudaMallocHost()/cudaFreeHost()	2018-09-08 14:48:45 +12:00
Matthew Tovbin	beab6e08dd	Remove println in jsonDecode (#3665 ) Following issue #3578	2018-09-07 15:47:26 -07:00
mrgutkun	4b43810f51	Fix #3663 : Allow sklearn API to use callbacks (#3682 ) * Fix #3663: Allow sklearn API to use callbacks * Fix lint * Add Callback API to Python API doc	2018-09-07 13:51:26 -07:00
Philip Hyunsu Cho	5a8bbb39a1	Revert #3677 and #3674 (#3678 ) * Revert "Add scikit-learn as dependency for doc build (#3677)" This reverts commit 308f664ade0547242608e21f6198c895415f03da. * Revert "Add scikit-learn tests (#3674)" This reverts commit d176a0fbc8165e3afe3e42ff464ab7b253211555.	2018-09-06 20:43:17 -07:00
Sergei Chipiga	8dac0d1009	Fix typo in python demo (#3676 )	2018-09-06 14:56:21 -07:00
Philip Hyunsu Cho	308f664ade	Add scikit-learn as dependency for doc build (#3677 )	2018-09-06 14:56:05 -07:00
Philip Hyunsu Cho	56e906a789	Update dmlc-core, to fix partitioned file loading (#3673 )	2018-09-06 09:56:06 -07:00
Philip Hyunsu Cho	d176a0fbc8	Add scikit-learn tests (#3674 ) * Add scikit-learn tests Goal is to pass scikit-learn's check_estimator() for XGBClassifier, XGBRegressor, and XGBRanker. It is actually not possible to do so entirely, since check_estimator() assumes that NaN is disallowed, but XGBoost allows for NaN as missing values. However, it is always a good idea to add some checks inspired by check_estimator(). * Fix lint * Fix lint	2018-09-06 09:55:28 -07:00
Philip Hyunsu Cho	190d888695	Document LambdaMART objectives: pairwise, listwise (#3672 ) * Document LambdaMART objectives * Distinguish between pairwise and listwise objectives	2018-09-06 09:54:37 -07:00
Philip Hyunsu Cho	c87153ed32	Fix CRAN check by removing reference to std::cerr (#3660 ) * Fix CRAN check by removing reference to std::cerr * Mask tests that fail on 32-bit Windows R	2018-09-05 11:44:00 -07:00
Philip Hyunsu Cho	9344f081a4	Add numpy and matplotlib as requirements for doc build (#3669 )	2018-09-04 20:56:18 -07:00
Shiki-H	8f4acba34b	moved data processing to wgetdata.sh (#3666 )	2018-09-04 09:36:48 -07:00
Andrew Thia	9254c58e4d	[TREE] add interaction constraints (#3466 ) * add interaction constraints * enable both interaction and monotonic constraints at the same time * fix lint * add R test, fix lint, update demo * Use dmlc::JSONReader to express interaction constraints as nested lists; Use sparse arrays for bookkeeping * Add Python test for interaction constraints * make R interaction constraints parameter based on feature index instead of column names, fix R coding style * Fix lint * Add BlueTea88 to CONTRIBUTORS.md * Short circuit when no constraint is specified; address review comments * Add tutorial for feature interaction constraints * allow interaction constraints to be passed as string, remove redundant column_names argument * Fix typo * Address review comments * Add comments to Python test	2018-09-04 09:35:39 -07:00
Andy Adinets	dee0b69674	Fixed copy constructor for HostDeviceVectorImpl. (#3657 ) - previously, vec_ in DeviceShard wasn't updated on copy; as a result, the shards continued to refer to the old HostDeviceVectorImpl object, which resulted in a dangling pointer once that object was deallocated	2018-09-01 11:38:09 +12:00
Philip Hyunsu Cho	86d88c0758	Fix #3648 : XGBClassifier.predict() should return margin scores when output_margin=True (#3651 ) * Fix #3648: XGBClassifier.predict() should return margin scores when output_margin=True * Fix tests to reflect correct implementation of XGBClassifier.predict(output_margin=True) * Fix flaky test test_with_sklearn.test_sklearn_api_gblinear	2018-08-30 21:05:05 -07:00
Vadim Khotilovich	5b662cbe1c	[R] R-interface for SHAP interactions (#3636 ) * add R-interface for SHAP interactions * update docs for new roxygen version	2018-08-30 19:06:21 -05:00
Philip Hyunsu Cho	10c31ab2cb	Fix #3638 : Binary classification demo should produce LIBSVM with 0-based indexing (#3652 )	2018-08-30 13:18:42 -07:00
Philip Hyunsu Cho	7b1427f926	Add validate_features parameter to sklearn API (#3653 )	2018-08-29 23:21:46 -07:00
Andy Adinets	72cd1517d6	Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage. (#3446 ) * Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage. - added distributions to HostDeviceVector - using HostDeviceVector for labels, weights and base margins in MetaInfo - using HostDeviceVector for offset and data in SparsePage - other necessary refactoring * Added const version of HostDeviceVector API calls. - const versions added to calls that can trigger data transfers, e.g. DevicePointer() - updated the code that uses HostDeviceVector - objective functions now accept const HostDeviceVector<bst_float>& for predictions * Updated src/linear/updater_gpu_coordinate.cu. * Added read-only state for HostDeviceVector sync. - this means no copies are performed if both host and devices access the HostDeviceVector read-only * Fixed linter and test errors. - updated the lz4 plugin - added ConstDeviceSpan to HostDeviceVector - using device % dh::NVisibleDevices() for the physical device number, e.g. in calls to cudaSetDevice() * Fixed explicit template instantiation errors for HostDeviceVector. - replaced HostDeviceVector<unsigned int> with HostDeviceVector<int> * Fixed HostDeviceVector tests that require multiple GPUs. - added a mock set device handler; when set, it is called instead of cudaSetDevice()	2018-08-30 14:28:47 +12:00
Andy Adinets	58d783df16	Fixed issue 3605. (#3628 ) * Fixed issue 3605. - https://github.com/dmlc/xgboost/issues/3605 * Fixed the bug in a better way. * Added a test to catch the bug. * Fixed linter errors.	2018-08-28 10:50:52 -07:00
Rory Mitchell	78bea0d204	Add google test for a column sampling, restore metainfo tests (#3637 ) * Add google test for a column sampling, restore metainfo tests * Update metainfo test for visual studio * Fix multi-GPU bug introduced in #3635	2018-08-28 16:10:26 +12:00
gorogm	7ef2b599c7	Link fixed. (#3640 )	2018-08-27 20:25:50 -07:00
Rory Mitchell	686e990ffc	GPU memory usage fixes + column sampling refactor (#3635 ) * Remove thrust copy calls * Fix histogram memory usage * Cap extreme histogram memory usage * More efficient column sampling * Use column sampler across updaters * More efficient split evaluation on GPU with column sampling	2018-08-27 16:26:46 +12:00
trivialfis	60787ecebc	Merge generic device helper functions into gpu set. (#3626 ) * Remove the use of old NDevices* functions. * Use GPUSet in timer.h.	2018-08-26 18:14:23 +12:00
Nan Zhu	3261002099	[jvm-packages] throw ControlThrowable instead of InterruptedException (#3632 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * interrupted exception is not rethrown	2018-08-25 20:30:21 -07:00
Philip Hyunsu Cho	cb4de521c1	Document CUDA requirement, lack of external memory on GPU (#3624 ) * Document fact that GPU doesn't support external memory * Document CUDA requirement	2018-08-22 22:47:10 -07:00
Philip Hyunsu Cho	4ed8a88240	Update Python API doc (#3619 ) * Add XGBRanker to Python API doc * Show inherited members of XGBRegressor in API doc, since XGBRegressor uses default methods from XGBModel * Add table of contents to Python API doc * Skip JVM doc download if not available * Show inherited members for XGBRegressor and XGBRanker * Expose XGBRanker to Python XGBoost module directory * Add docstring to XGBRegressor.predict() and XGBRanker.predict() * Fix rendering errors in Python docstrings * Fix lint	2018-08-22 18:59:30 -07:00
Nan Zhu	4912c1f9c6	[jvm-packages] fix checkpoint save/load (#3614 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix update checkpoint func	2018-08-21 12:34:24 -07:00
Grant W Schneider	57f3c2f252	Remove errant $ (#3618 )	2018-08-21 12:32:38 -07:00
Shiki-H	24a268a2e3	sklearn api for ranking (#3560 ) * added xgbranker * fixed predict method and ranking test * reformatted code in accordance with pep8 * fixed lint error * fixed docstring and added checks on objective * added ranking demo for python * fixed suffix in rank.py	2018-08-21 08:26:48 -07:00
Philip Hyunsu Cho	b13c3a8bcc	Fix #3609 : Removed unused parameter 'use_buffer' (#3610 )	2018-08-21 07:54:15 -07:00
trivialfis	cf2d86a4f6	Add travis sanitizers tests. (#3557 ) * Add travis sanitizers tests. * Add gcc-7 in Travis. * Add SANITIZER_PATH for CMake. * Enable sanitizer tests in Travis. * Fix memory leaks in tests. * Fix all memory leaks reported by Address Sanitizer. * tests/cpp/helpers.h/CreateDMatrix now returns raw pointer.	2018-08-19 16:40:30 +12:00
Philip Hyunsu Cho	983cb0b374	Add option to disable default metric (#3606 )	2018-08-18 11:39:20 -07:00
Grace Lam	993e62b9e7	Add JSON model dump functionality (#3603 ) * Add JSON model dump functionality * Fix lint	2018-08-17 16:18:43 -07:00
Matthew Tovbin	b53a5a262c	[jvm-packages] getTreeLimit return type should be Int	2018-08-17 09:36:00 -07:00
Philip Hyunsu Cho	ac7fc1306b	Fix #3598 : document that custom objective can't contain colon (:) (#3601 )	2018-08-16 19:05:40 -07:00
Grace Lam	caf4a756bf	Add JSON dump functionality documentation (#3600 )	2018-08-16 16:32:04 -07:00
trivialfis	7c82dc92b2	Fix accessing DMatrix.handle before set. (#3599 ) Close #3597.	2018-08-16 15:26:06 -07:00
Jakob Richter	725f4c36f2	replace nround with nrounds to match actual parameter (#3592 )	2018-08-15 11:13:53 -07:00
Nan Zhu	73bd590a1d	[jvm-packages] add the missing scm urls (#3589 ) for some reason this part was missing in master branch????	2018-08-14 15:05:23 -07:00
trivialfis	9265964ee7	Fix ptrdiff_t namespace in Span. (#3588 ) Fix #3587.	2018-08-15 10:04:55 +12:00
trivialfis	2c502784ff	Span class. (#3548 ) * Add basic Span class based on ISO C++20. * Use Span<Entry const> instead of Inst in SparsePage. * Add DeviceSpan in HostDeviceVector, use it in regression obj.	2018-08-14 17:58:11 +12:00
Matthew Tovbin	2b7a1c5780	[jvm-packages] Avoid losing precision when computing probabilities by converting to Double early (#3576 )	2018-08-13 14:05:07 -07:00
Matthew Tovbin	ce0f0568a6	Make sure 'thresholds' are considered when executing predict method (#3577 )	2018-08-13 14:04:47 -07:00
Philip Hyunsu Cho	6288f6d563	Update JVM packages version to 0.81-SNAPSHOT (#3584 )	2018-08-13 10:17:52 -07:00
Philip Hyunsu Cho	96826a3515	Release version 0.80 (#3541 ) * Up versions * Write release note for 0.80 v0.80	2018-08-13 01:38:37 -07:00
Mathew	06ef4db4cc	Fix Spark 2.2 Support (Amending #3062 ) (#3325 ) This pull request amends the broken #3062 to allow Spark 2.2 to work. Please note this won't work in Spark <=2.1, as sc.removeSparkListener was implemented in Spark 2.2. (So perhaps a more general method is better, although that is what was attempted in #3062.) This PR fixes: #3208, #3151 and the discussion in #1927. I do find it strange that #3062 does not work in Spark 2.2; it's probably due to some sort of public/private issue in the org.apache.spark.scheduler.LiveListenerBus class inheritance (in Spark itself). The error is: `java.lang.NoSuchMethodError: org.apache.spark.scheduler.LiveListenerBus.removeListener(Ljava/lang/Object;)V`	2018-08-12 18:35:20 -07:00
Rory Mitchell	645996b12f	Remove accidental SparsePage copies (#3583 )	2018-08-12 17:49:38 -07:00
Philip Hyunsu Cho	0b607fb884	Add link to XGBoost4J-Spark tutorial on AWS Yarn tutorial (#3582 )	2018-08-12 07:27:28 -07:00
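
Note on 3564b68b98: the message describes the early-stopping callback splitting the evaluation name on '-' and, for a metric such as NDCG@n-, ending up minimizing where it should maximize; the fix is to split with maxsplit=1. A minimal sketch of that behavior follows. It is not the code from the commit: the function names, the `MAXIMIZE_PREFIXES` tuple, and the `eval-ndcg@5-` name format are assumptions made for illustration.

```python
# Hypothetical illustration only; names and the evaluation-name format
# ("<dataset>-<metric>") are assumptions, not taken from the XGBoost source.
MAXIMIZE_PREFIXES = ("auc", "map", "ndcg")


def should_maximize_buggy(eval_name):
    # Splitting on every '-' turns "eval-ndcg@5-" into ["eval", "ndcg@5", ""],
    # so the last piece is empty and the metric is not recognized.
    metric = eval_name.split("-")[-1]
    return metric.startswith(MAXIMIZE_PREFIXES)


def should_maximize(eval_name):
    # Splitting once keeps the trailing '-' with the metric: "ndcg@5-".
    # maxsplit is passed positionally, which also works on Python 2.x.
    metric = eval_name.split("-", 1)[-1]
    return metric.startswith(MAXIMIZE_PREFIXES)
```

With these definitions, `should_maximize("eval-ndcg@5-")` recognizes the metric as one to maximize, while the buggy variant does not. The sketch still assumes the dataset part of the name contains no '-' of its own.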

3428 Commits