3560 Commits

Author SHA1 Message Date
trivialfis
4b892c2b30 Remove obsoleted QuantileHistMaker. (#3761)
Fix #3755.
2018-10-06 11:22:15 -07:00
Nan Zhu
785094db53
[jvm-packages] fix issue when spark job execution thread cannot return before we execute first() (#3758)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* sparjJobThread

* update

* fix issue when spark job execution thread cannot return before we execute first()
2018-10-05 22:20:50 -07:00
zengxy
9e73087324 [jvm-packages] support specified feature names when getModelDump and getFeatureScore (#3733)
* [jvm-packages] support specified feature names for jvm when get ModelDump and get FeatureScore (#3725)

* typo and style fix
2018-10-04 09:05:42 -07:00
Rory Mitchell
34522d56f0
Allow plug-ins to be built by cmake (#3752)
* Remove references to AVX code.

* Allow plugins to be built by cmake
2018-10-04 22:03:52 +13:00
trivialfis
c6b5df67f6 Catch dmlc::Error. (#3751)
Fix #3643.
2018-10-04 16:51:38 +13:00
weitian
efc4f85505 [jvm-packages] Fix #3489: Spark repartitionForData can potentially shuffle all data and lose ordering required for ranking objectives (#3654) 2018-10-03 08:43:55 -07:00
trivialfis
d594b11f35 Implement transform to reduce CPU/GPU code duplication. (#3643)
* Implement Transform class.
* Add tests for softmax.
* Use Transform in regression, softmax and hinge objectives, except for Cox.
* Mark old gpu objective functions deprecated.
* static_assert for softmax.
* Split up multi-gpu tests.
2018-10-02 15:06:21 +13:00
Sergei Lebedev
87aca8c244 [jvm-packages] Fixed the distributed updater check (#3739)
The updater used in distributed training is grow_histmaker and not 
grow_colmaker as the error message stated prior to this commit.
2018-10-01 11:22:01 -07:00
Rory Mitchell
70d208d68c
Dmatrix refactor stage 2 (#3395)
* DMatrix refactor 2

* Remove buffered rowset usage where possible

* Transition to c++11 style iterators for row access

* Transition column iterators to C++ 11
2018-10-01 01:29:03 +13:00
Philip Hyunsu Cho
b50bc2c1d4
Add multi-GPU unit test environment (#3741)
* Add multi-GPU unit test environment

* Better assertion message

* Temporarily disable failing test

* Distinguish between multi-GPU and single-GPU CPP tests

* Consolidate Python tests. Use attributes to distinguish multi-GPU Python tests from single-CPU counterparts
2018-09-29 11:20:58 -07:00
Philip Hyunsu Cho
baef5741df
Separate out restricted and unrestricted tasks (#3736) 2018-09-27 23:06:14 -07:00
trivialfis
5a7f7e7d49 Implement devices to devices reshard. (#3721)
* Force clearing device memory before Reshard.
* Remove calculating row_segments for gpu_hist and gpu_sketch.
* Guard against changing device.
2018-09-28 17:40:23 +12:00
Tong He
0b7fd74138 fix R check warning (#3728) 2018-09-27 17:53:49 -07:00
Philip Hyunsu Cho
51478a39c9
Fix #3730: scikit-learn 0.20 compatibility fix (#3731)
* Fix #3730: scikit-learn 0.20 compatibility fix

sklearn.cross_validation has been removed from scikit-learn 0.20,
so replace it with sklearn.model_selection

* Display test names for Python tests for clarity
2018-09-27 15:03:05 -07:00
Philip Hyunsu Cho
fbe9d41dd0 Disable flaky tests in R-package/tests/testthat/test_update.R (#3723) 2018-09-26 14:21:41 -07:00
Nan Zhu
79d854c695
[jvm-packages] fix errors in example (#3719)
* add back train method but mark as deprecated

* fix scalastyle error

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* instrumentation

* use log console

* better measurement

* fix erros in example

* update histmaker
2018-09-22 16:39:38 -07:00
BruceZhao
3b5a1f389a [R] add a demo of multi-class classification R version (#3695)
* add a demo of multi-class classification R version

* add a demo of multi-class classification result

* add intro to the demo readme

* Delete train.md

* Update README.md
2018-09-21 23:06:40 -07:00
Takahiro Kojima
2405c59352 remove extra of (#3713) 2018-09-21 11:55:39 -07:00
Philip Hyunsu Cho
73140ce84c
Fix #3702: do not round up integer thresholds for integer features in JSON dump (#3717) 2018-09-21 01:11:21 -07:00
Nan Zhu
aa53e9fc8d
[jvm-packages] bump spark version (#3709) 2018-09-19 11:18:01 -07:00
trivialfis
9119f9e369 Fix gpu devices. (#3693)
* Fix gpu_set normalized and unnormalized.
* Fix DeviceSpan.
2018-09-19 17:39:42 +12:00
Andy Adinets
0f99cdfe0e Fixed an uninitialized pointer. (#3703) 2018-09-16 18:02:31 +12:00
Michael Mui
20a9e716bd [jvm-packages] Fix "obj_type" error to enable custom objectives and evaluations (#3646)
credits to @mmui
2018-09-14 12:06:33 -07:00
Dmitriy Rybalko
7bbb44182a update eval_metric doc (#3687) 2018-09-14 08:47:05 -07:00
Jerry Lin
9acd549dc7 [jvm-packages] Add rank:ndcg and rank:map to Spark supported objectives (#3697) 2018-09-13 09:51:24 -07:00
Chen Qin
42b108136f [jvm-packages] bump flink version number (#3686)
* bump flink version number

* bump flink version number

* add missing hadoop dependency
2018-09-13 09:33:09 -07:00
Philip Hyunsu Cho
bd41bd6605
Better error message for failed library loading (#3690)
* Better error message for failed lib loading

* Address review comment + fix lint
2018-09-12 22:37:26 -07:00
Philip Hyunsu Cho
3209b42b07
Include full text of Apache 2.0 license (#3698) 2018-09-12 20:46:55 -07:00
jakehoare
7707982a85 Amend xgb.createFolds to handle classes of a single element. (#3630)
* Amend xgb.createFolds to handle classes of a single element.

* Fix variable name
2018-09-12 09:23:05 -05:00
Vadim Khotilovich
ad3a0bbab8
Add the missing max_delta_step (#3668)
* add max_delta_step to SplitEvaluator

* test for max_delta_step

* missing x2 factor for L1 term

* remove gamma from ElasticNet
2018-09-12 08:43:41 -05:00
Nan Zhu
d1e75d615e
[jvm-packages] Remove copy paste error in test suite (#3692)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* remove copy paste error
2018-09-11 13:08:36 -07:00
Joseph Bradley
14a8b96476 [jvm-packages] xgboost-spark warning when Spark encryption is turned on (#3667)
* added test, commented out right now

* reinstated test

* added fix for checking encryption settings

* fix by using RDD conf

* fix compilation

* renamed conf

* use SparkSession if available

* fix message

* nop

* code review fixes
2018-09-10 14:21:01 -07:00
Philip Hyunsu Cho
3564b68b98
Fix #3397: early_stop callback does not maximize metric of form NDCG@n- (#3685)
* Fix #3397: early_stop callback does not maximize metric of form NDCG@n-

Early stopping callback makes splits with '-' letter, which interferes
with metrics of form NDCG@n-. As a result, XGBoost tries to minimize
NDCG@n-, where it should be maximized instead.

Fix. Specify maxsplit=1.

* Python 2.x compatibility fix
2018-09-08 19:46:25 -07:00
Andy Adinets
f606cb8ef4 Fixed the performance regression within EvaluateSplits(). (#3680)
- it turns out creating an std::vector on every call is faster
  than cudaMallocHost()/cudaFreeHost()
2018-09-08 14:48:45 +12:00
Matthew Tovbin
beab6e08dd Remove println in jsonDecode (#3665)
Following issue  #3578
2018-09-07 15:47:26 -07:00
mrgutkun
4b43810f51 Fix #3663: Allow sklearn API to use callbacks (#3682)
* Fix #3663: Allow sklearn API to use callbacks

* Fix lint

* Add Callback API to Python API doc
2018-09-07 13:51:26 -07:00
Philip Hyunsu Cho
5a8bbb39a1
Revert #3677 and #3674 (#3678)
* Revert "Add scikit-learn as dependency for doc build (#3677)"

This reverts commit 308f664ade0547242608e21f6198c895415f03da.

* Revert "Add scikit-learn tests (#3674)"

This reverts commit d176a0fbc8165e3afe3e42ff464ab7b253211555.
2018-09-06 20:43:17 -07:00
Sergei Chipiga
8dac0d1009 Fix typo in python demo (#3676) 2018-09-06 14:56:21 -07:00
Philip Hyunsu Cho
308f664ade
Add scikit-learn as dependency for doc build (#3677) 2018-09-06 14:56:05 -07:00
Philip Hyunsu Cho
56e906a789
Update dmlc-core, to fix partitioned file loading (#3673) 2018-09-06 09:56:06 -07:00
Philip Hyunsu Cho
d176a0fbc8
Add scikit-learn tests (#3674)
* Add scikit-learn tests

Goal is to pass scikit-learn's check_estimator() for XGBClassifier,
XGBRegressor, and XGBRanker. It is actually not possible to do so
entirely, since check_estimator() assumes that NaN is disallowed,
but XGBoost allows for NaN as missing values. However, it is always
good ideas to add some checks inspired by check_estimator().

* Fix lint

* Fix lint
2018-09-06 09:55:28 -07:00
Philip Hyunsu Cho
190d888695
Document LambdaMART objectives: pairwise, listwise (#3672)
* Document LambdaMART objectives

* Distinguish between pairwise and listwise objectives
2018-09-06 09:54:37 -07:00
Philip Hyunsu Cho
c87153ed32
Fix CRAN check by removing reference to std::cerr (#3660)
* Fix CRAN check by removing reference to std::cerr

* Mask tests that fail on 32-bit Windows R
2018-09-05 11:44:00 -07:00
Philip Hyunsu Cho
9344f081a4
Add numpy and matplotlib as requirements for doc build (#3669) 2018-09-04 20:56:18 -07:00
Shiki-H
8f4acba34b moved data processing to wgetdata.sh (#3666) 2018-09-04 09:36:48 -07:00
Andrew Thia
9254c58e4d [TREE] add interaction constraints (#3466)
* add interaction constraints

* enable both interaction and monotonic constraints at the same time

* fix lint

* add R test, fix lint, update demo

* Use dmlc::JSONReader to express interaction constraints as nested lists; Use sparse arrays for bookkeeping

* Add Python test for interaction constraints

* make R interaction constraints parameter based on feature index instead of column names, fix R coding style

* Fix lint

* Add BlueTea88 to CONTRIBUTORS.md

* Short circuit when no constraint is specified; address review comments

* Add tutorial for feature interaction constraints

* allow interaction constraints to be passed as string, remove redundant column_names argument

* Fix typo

* Address review comments

* Add comments to Python test
2018-09-04 09:35:39 -07:00
Andy Adinets
dee0b69674 Fixed copy constructor for HostDeviceVectorImpl. (#3657)
- previously, vec_ in DeviceShard wasn't updated on copy; as a result,
  the shards continued to refer to the old HostDeviceVectorImpl object,
  which resulted in a dangling pointer once that object was deallocated
2018-09-01 11:38:09 +12:00
Philip Hyunsu Cho
86d88c0758
Fix #3648: XGBClassifier.predict() should return margin scores when output_margin=True (#3651)
* Fix #3648: XGBClassifier.predict() should return margin scores when output_margin=True

* Fix tests to reflect correct implementation of XGBClassifier.predict(output_margin=True)

* Fix flaky test test_with_sklearn.test_sklearn_api_gblinear
2018-08-30 21:05:05 -07:00
Vadim Khotilovich
5b662cbe1c
[R] R-interface for SHAP interactions (#3636)
* add R-interface for SHAP interactions

* update docs for new roxygen version
2018-08-30 19:06:21 -05:00
Philip Hyunsu Cho
10c31ab2cb
Fix #3638: Binary classification demo should produce LIBSVM with 0-based indexing (#3652) 2018-08-30 13:18:42 -07:00