3592 Commits

Author SHA1 Message Date
Philip Hyunsu Cho
73140ce84c
Fix #3702: do not round up integer thresholds for integer features in JSON dump (#3717) 2018-09-21 01:11:21 -07:00
Nan Zhu
aa53e9fc8d
[jvm-packages] bump spark version (#3709) 2018-09-19 11:18:01 -07:00
trivialfis
9119f9e369 Fix gpu devices. (#3693)
* Fix gpu_set normalized and unnormalized.
* Fix DeviceSpan.
2018-09-19 17:39:42 +12:00
Andy Adinets
0f99cdfe0e Fixed an uninitialized pointer. (#3703) 2018-09-16 18:02:31 +12:00
Michael Mui
20a9e716bd [jvm-packages] Fix "obj_type" error to enable custom objectives and evaluations (#3646)
credits to @mmui
2018-09-14 12:06:33 -07:00
Dmitriy Rybalko
7bbb44182a update eval_metric doc (#3687) 2018-09-14 08:47:05 -07:00
Jerry Lin
9acd549dc7 [jvm-packages] Add rank:ndcg and rank:map to Spark supported objectives (#3697) 2018-09-13 09:51:24 -07:00
Chen Qin
42b108136f [jvm-packages] bump flink version number (#3686)
* bump flink version number

* add missing hadoop dependency
2018-09-13 09:33:09 -07:00
Philip Hyunsu Cho
bd41bd6605
Better error message for failed library loading (#3690)
* Better error message for failed lib loading

* Address review comment + fix lint
2018-09-12 22:37:26 -07:00
Philip Hyunsu Cho
3209b42b07
Include full text of Apache 2.0 license (#3698) 2018-09-12 20:46:55 -07:00
jakehoare
7707982a85 Amend xgb.createFolds to handle classes of a single element. (#3630)
* Amend xgb.createFolds to handle classes of a single element.

* Fix variable name
2018-09-12 09:23:05 -05:00
Vadim Khotilovich
ad3a0bbab8
Add the missing max_delta_step (#3668)
* add max_delta_step to SplitEvaluator

* test for max_delta_step

* missing x2 factor for L1 term

* remove gamma from ElasticNet
2018-09-12 08:43:41 -05:00
Nan Zhu
d1e75d615e
[jvm-packages] Remove copy paste error in test suite (#3692)
* add back train method but mark as deprecated

* fix scalastyle error

* remove copy paste error
2018-09-11 13:08:36 -07:00
Joseph Bradley
14a8b96476 [jvm-packages] xgboost-spark warning when Spark encryption is turned on (#3667)
* added test, commented out right now

* reinstated test

* added fix for checking encryption settings

* fix by using RDD conf

* fix compilation

* renamed conf

* use SparkSession if available

* fix message

* nop

* code review fixes
2018-09-10 14:21:01 -07:00
Philip Hyunsu Cho
3564b68b98
Fix #3397: early_stop callback does not maximize metric of form NDCG@n- (#3685)
* Fix #3397: early_stop callback does not maximize metric of form NDCG@n-

The early stopping callback splits the metric label on the '-' character, which
interferes with metrics of the form NDCG@n-. As a result, XGBoost tries to minimize
NDCG@n-, when it should be maximized instead.

Fix: specify maxsplit=1.

* Python 2.x compatibility fix
2018-09-08 19:46:25 -07:00
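The splitting problem described in #3685 can be illustrated with a minimal Python sketch. It is not the callback's actual code: the helper names `metric_name` and `should_maximize`, the label format `<dataset>-<metric>`, and the prefix list are assumptions made for illustration only.

```python
# Hypothetical helpers; names and the "<dataset>-<metric>" label format are assumptions.
MAXIMIZE_PREFIXES = ("auc", "map", "ndcg")


def metric_name(label, maxsplit=None):
    """Return the metric part of an evaluation label such as 'eval-ndcg@5-'."""
    if maxsplit is None:
        parts = label.split("-")            # splits on every '-'
    else:
        parts = label.split("-", maxsplit)  # splits at most `maxsplit` times
    return parts[-1]


def should_maximize(label, maxsplit=None):
    """Decide whether the metric in `label` is one that should be maximized."""
    return metric_name(label, maxsplit).lower().startswith(MAXIMIZE_PREFIXES)


label = "eval-ndcg@5-"
# Unbounded split: the trailing '-' yields an empty last element,
# so the metric is not recognized as one to maximize.
print(repr(metric_name(label)), should_maximize(label))
# maxsplit=1: only the dataset prefix is split off and the metric keeps its trailing '-'.
print(repr(metric_name(label, 1)), should_maximize(label, 1))
```

With an unbounded split the last element is the empty string, so the prefix check fails and the metric would be treated as one to minimize. Limiting the split to one occurrence keeps `ndcg@5-` intact.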
Andy Adinets
f606cb8ef4 Fixed the performance regression within EvaluateSplits(). (#3680)
- it turns out creating an std::vector on every call is faster
  than cudaMallocHost()/cudaFreeHost()
2018-09-08 14:48:45 +12:00
Matthew Tovbin
beab6e08dd Remove println in jsonDecode (#3665)
Following issue #3578
2018-09-07 15:47:26 -07:00
mrgutkun
4b43810f51 Fix #3663: Allow sklearn API to use callbacks (#3682)
* Fix #3663: Allow sklearn API to use callbacks

* Fix lint

* Add Callback API to Python API doc
2018-09-07 13:51:26 -07:00
Philip Hyunsu Cho
5a8bbb39a1
Revert #3677 and #3674 (#3678)
* Revert "Add scikit-learn as dependency for doc build (#3677)"

This reverts commit 308f664ade0547242608e21f6198c895415f03da.

* Revert "Add scikit-learn tests (#3674)"

This reverts commit d176a0fbc8165e3afe3e42ff464ab7b253211555.
2018-09-06 20:43:17 -07:00
Sergei Chipiga
8dac0d1009 Fix typo in python demo (#3676) 2018-09-06 14:56:21 -07:00
Philip Hyunsu Cho
308f664ade
Add scikit-learn as dependency for doc build (#3677) 2018-09-06 14:56:05 -07:00
Philip Hyunsu Cho
56e906a789
Update dmlc-core, to fix partitioned file loading (#3673) 2018-09-06 09:56:06 -07:00
Philip Hyunsu Cho
d176a0fbc8
Add scikit-learn tests (#3674)
* Add scikit-learn tests

Goal is to pass scikit-learn's check_estimator() for XGBClassifier,
XGBRegressor, and XGBRanker. It is actually not possible to do so
entirely, since check_estimator() assumes that NaN is disallowed,
but XGBoost allows for NaN as missing values. However, it is always
a good idea to add some checks inspired by check_estimator().

* Fix lint
2018-09-06 09:55:28 -07:00
Philip Hyunsu Cho
190d888695
Document LambdaMART objectives: pairwise, listwise (#3672)
* Document LambdaMART objectives

* Distinguish between pairwise and listwise objectives
2018-09-06 09:54:37 -07:00
Philip Hyunsu Cho
c87153ed32
Fix CRAN check by removing reference to std::cerr (#3660)
* Fix CRAN check by removing reference to std::cerr

* Mask tests that fail on 32-bit Windows R
2018-09-05 11:44:00 -07:00
Philip Hyunsu Cho
9344f081a4
Add numpy and matplotlib as requirements for doc build (#3669) 2018-09-04 20:56:18 -07:00
Shiki-H
8f4acba34b moved data processing to wgetdata.sh (#3666) 2018-09-04 09:36:48 -07:00
Andrew Thia
9254c58e4d [TREE] add interaction constraints (#3466)
* add interaction constraints

* enable both interaction and monotonic constraints at the same time

* fix lint

* add R test, fix lint, update demo

* Use dmlc::JSONReader to express interaction constraints as nested lists; Use sparse arrays for bookkeeping

* Add Python test for interaction constraints

* make R interaction constraints parameter based on feature index instead of column names, fix R coding style

* Fix lint

* Add BlueTea88 to CONTRIBUTORS.md

* Short circuit when no constraint is specified; address review comments

* Add tutorial for feature interaction constraints

* allow interaction constraints to be passed as string, remove redundant column_names argument

* Fix typo

* Address review comments

* Add comments to Python test
2018-09-04 09:35:39 -07:00
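The commit above (#3466) mentions expressing interaction constraints as nested lists of feature indices, optionally passed as a string. The following Python sketch shows one plausible reading of that representation. The helper names `parse_constraints` and `allowed_features` are hypothetical, and the rule used here (a feature may be split on only if some group contains it together with every feature already used on the path) is a simplified assumption, not the library's implementation.

```python
import json


def parse_constraints(spec):
    """Turn a nested-list spec (or its JSON string form) into feature-index groups."""
    groups = json.loads(spec) if isinstance(spec, str) else spec
    return [frozenset(int(f) for f in group) for group in groups]


def allowed_features(groups, used):
    """Features that may still be split on, given the features used so far on a path.

    Assumed rule: a group stays eligible only if it contains all used features.
    """
    used = set(used)
    allowed = set()
    for group in groups:
        if used <= group:
            allowed |= group
    return allowed


groups = parse_constraints("[[0, 2], [1, 3, 4]]")
print(allowed_features(groups, set()))  # nothing used yet: every listed feature
print(allowed_features(groups, {0}))    # only the group containing feature 0
print(allowed_features(groups, {1, 3})) # only the group containing both 1 and 3
```

Accepting either a list or a string mirrors the commit note about allowing the constraints to be passed as a string. How unlisted features are treated is left out of this sketch.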
Andy Adinets
dee0b69674 Fixed copy constructor for HostDeviceVectorImpl. (#3657)
- previously, vec_ in DeviceShard wasn't updated on copy; as a result,
  the shards continued to refer to the old HostDeviceVectorImpl object,
  which resulted in a dangling pointer once that object was deallocated
2018-09-01 11:38:09 +12:00
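The stale back-reference described in #3657 can be mimicked in Python, with the caveat that Python cannot produce a true dangling pointer. The classes `Vec` and `Shard` below are hypothetical stand-ins for HostDeviceVectorImpl and DeviceShard, and the `owner` attribute plays the role of `vec_`. This is only an analogy for the bug, not the C++ fix itself.

```python
import copy


class Shard:
    def __init__(self, owner):
        self.owner = owner  # back-reference to the owning vector (analogue of vec_)


class Vec:
    def __init__(self, data, n_shards=2):
        self.data = list(data)
        self.shards = [Shard(self) for _ in range(n_shards)]

    def naive_copy(self):
        # Shallow copy: the shard list is shared, so every shard
        # still refers to the original object, not to the copy.
        other = copy.copy(self)
        other.data = list(self.data)
        return other

    def fixed_copy(self):
        # Build fresh shards whose back-reference points at the new object.
        return Vec(self.data, len(self.shards))


v = Vec([1.0, 2.0, 3.0])
bad = v.naive_copy()
good = v.fixed_copy()
print(all(s.owner is v for s in bad.shards))      # shards still point at the original
print(all(s.owner is good for s in good.shards))  # shards point at the copy
```

In C++ the equivalent of `naive_copy` leaves the shards pointing at an object that may later be deallocated, which is the dangling pointer the commit message describes.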
Philip Hyunsu Cho
86d88c0758
Fix #3648: XGBClassifier.predict() should return margin scores when output_margin=True (#3651)
* Fix #3648: XGBClassifier.predict() should return margin scores when output_margin=True

* Fix tests to reflect correct implementation of XGBClassifier.predict(output_margin=True)

* Fix flaky test test_with_sklearn.test_sklearn_api_gblinear
2018-08-30 21:05:05 -07:00
Vadim Khotilovich
5b662cbe1c
[R] R-interface for SHAP interactions (#3636)
* add R-interface for SHAP interactions

* update docs for new roxygen version
2018-08-30 19:06:21 -05:00
Philip Hyunsu Cho
10c31ab2cb
Fix #3638: Binary classification demo should produce LIBSVM with 0-based indexing (#3652) 2018-08-30 13:18:42 -07:00
Philip Hyunsu Cho
7b1427f926
Add validate_features parameter to sklearn API (#3653) 2018-08-29 23:21:46 -07:00
Andy Adinets
72cd1517d6 Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage. (#3446)
* Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage.

- added distributions to HostDeviceVector
- using HostDeviceVector for labels, weights and base margins in MetaInfo
- using HostDeviceVector for offset and data in SparsePage
- other necessary refactoring

* Added const version of HostDeviceVector API calls.

- const versions added to calls that can trigger data transfers, e.g. DevicePointer()
- updated the code that uses HostDeviceVector
- objective functions now accept const HostDeviceVector<bst_float>& for predictions

* Updated src/linear/updater_gpu_coordinate.cu.

* Added read-only state for HostDeviceVector sync.

- this means no copies are performed if both host and devices access
  the HostDeviceVector read-only

* Fixed linter and test errors.

- updated the lz4 plugin
- added ConstDeviceSpan to HostDeviceVector
- using device % dh::NVisibleDevices() for the physical device number,
  e.g. in calls to cudaSetDevice()

* Fixed explicit template instantiation errors for HostDeviceVector.

- replaced HostDeviceVector<unsigned int> with HostDeviceVector<int>

* Fixed HostDeviceVector tests that require multiple GPUs.

- added a mock set device handler; when set, it is called instead of cudaSetDevice()
2018-08-30 14:28:47 +12:00
Andy Adinets
58d783df16 Fixed issue 3605. (#3628)
* Fixed issue 3605.

- https://github.com/dmlc/xgboost/issues/3605

* Fixed the bug in a better way.

* Added a test to catch the bug.

* Fixed linter errors.
2018-08-28 10:50:52 -07:00
Rory Mitchell
78bea0d204
Add google test for a column sampling, restore metainfo tests (#3637)
* Add google test for a column sampling, restore metainfo tests

* Update metainfo test for visual studio

* Fix multi-GPU bug introduced in #3635
2018-08-28 16:10:26 +12:00
gorogm
7ef2b599c7 Link fixed. (#3640) 2018-08-27 20:25:50 -07:00
Rory Mitchell
686e990ffc
GPU memory usage fixes + column sampling refactor (#3635)
* Remove thrust copy calls

* Fix histogram memory usage

* Cap extreme histogram memory usage

* More efficient column sampling

* Use column sampler across updaters

* More efficient split evaluation on GPU with column sampling
2018-08-27 16:26:46 +12:00
trivialfis
60787ecebc Merge generic device helper functions into gpu set. (#3626)
* Remove the use of old NDevices* functions.
* Use GPUSet in timer.h.
2018-08-26 18:14:23 +12:00
Nan Zhu
3261002099
[jvm-packages] throw ControlThrowable instead of InterruptedException (#3632)
* add back train method but mark as deprecated

* fix scalastyle error

* interrupted exception is not rethrown
2018-08-25 20:30:21 -07:00
Philip Hyunsu Cho
cb4de521c1
Document CUDA requirement, lack of external memory on GPU (#3624)
* Document fact that GPU doesn't support external memory

* Document CUDA requirement
2018-08-22 22:47:10 -07:00
Philip Hyunsu Cho
4ed8a88240
Update Python API doc (#3619)
* Add XGBRanker to Python API doc

* Show inherited members of XGBRegressor in API doc, since XGBRegressor uses default methods from XGBModel

* Add table of contents to Python API doc

* Skip JVM doc download if not available

* Show inherited members for XGBRegressor and XGBRanker

* Expose XGBRanker to Python XGBoost module directory

* Add docstring to XGBRegressor.predict() and XGBRanker.predict()

* Fix rendering errors in Python docstrings

* Fix lint
2018-08-22 18:59:30 -07:00
Nan Zhu
4912c1f9c6
[jvm-packages] fix checkpoint save/load (#3614)
* add back train method but mark as deprecated

* fix scalastyle error

* fix update checkpoint func
2018-08-21 12:34:24 -07:00
Grant W Schneider
57f3c2f252 Remove errant $ (#3618) 2018-08-21 12:32:38 -07:00
Shiki-H
24a268a2e3 sklearn api for ranking (#3560)
* added xgbranker

* fixed predict method and ranking test

* reformatted code in accordance with pep8

* fixed lint error

* fixed docstring and added checks on objective

* added ranking demo for python

* fixed suffix in rank.py
2018-08-21 08:26:48 -07:00
Philip Hyunsu Cho
b13c3a8bcc
Fix #3609: Removed unused parameter 'use_buffer' (#3610) 2018-08-21 07:54:15 -07:00
trivialfis
cf2d86a4f6 Add travis sanitizers tests. (#3557)
* Add travis sanitizers tests.

* Add gcc-7 in Travis.
* Add SANITIZER_PATH for CMake.
* Enable sanitizer tests in Travis.

* Fix memory leaks in tests.

* Fix all memory leaks reported by Address Sanitizer.
* tests/cpp/helpers.h/CreateDMatrix now returns raw pointer.
2018-08-19 16:40:30 +12:00
Philip Hyunsu Cho
983cb0b374
Add option to disable default metric (#3606) 2018-08-18 11:39:20 -07:00
Grace Lam
993e62b9e7 Add JSON model dump functionality (#3603)
* Add JSON model dump functionality

* Fix lint
2018-08-17 16:18:43 -07:00
Matthew Tovbin
b53a5a262c [jvm-packages] getTreeLimit return type should be Int 2018-08-17 09:36:00 -07:00