236 Commits

Author SHA1 Message Date
Jiaming Yuan
4bbf062ed3
[Breaking] Update sklearn interface. (#4929)
* Remove nthread, seed, silent. Add tree_method, gpu_id, num_parallel_tree. Fix #4909.
* Check data shape. Fix #4896.
* Check element of eval_set is tuple. Fix #4875
* Add doc for random_state with hogwild. Fixes #4919
2019-10-12 02:50:09 -04:00
Jiaming Yuan
b8433c455a
Rewrite Dask interface. (#4819) 2019-09-25 01:30:14 -04:00
Philip Hyunsu Cho
1aaf4a679d
Fix early stopping in the Python package (#4638)
* Fix #4630, #4421: Preserve correct ordering between metrics, and always use last metric for early stopping

* Clarify semantics of early stopping in presence of multiple valid sets and metrics

* Add a test

* Fix lint
2019-07-07 01:01:03 -07:00
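The rule stated in this commit (when several metrics are reported, the last one drives early stopping) can be illustrated with a small standalone sketch. This is not the XGBoost implementation: `early_stop_round`, the `history` layout, and the metric names are hypothetical, and the sketch assumes that a lower score is better.

```python
def early_stop_round(history, patience):
    """Return the index of the best round, watching only the last metric.

    history: one list per boosting round, holding (name, score) pairs in
    the order the metrics were reported. Lower scores are treated as better.
    """
    best_score = float("inf")
    best_round = -1
    for round_idx, scores in enumerate(history):
        _name, score = scores[-1]  # only the last reported metric is consulted
        if score < best_score:
            best_score, best_round = score, round_idx
        elif round_idx - best_round >= patience:
            break  # no improvement for `patience` rounds
    return best_round


# Hypothetical evaluation log: the first metric keeps changing, but only
# the last one (logloss) decides when to stop.
history = [
    [("valid-auc", 0.70), ("valid-logloss", 0.60)],
    [("valid-auc", 0.72), ("valid-logloss", 0.55)],
    [("valid-auc", 0.75), ("valid-logloss", 0.57)],
    [("valid-auc", 0.78), ("valid-logloss", 0.58)],
]
best = early_stop_round(history, patience=2)
print(best)  # 1
```

Keeping the metrics in reported order is what makes "the last metric" well defined, which is why the commit pairs the ordering fix with the early-stopping fix.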
Philip Hyunsu Cho
4df246191f
Add warning when save_model() is called from scikit-learn interface (#4632) 2019-07-03 23:37:53 -07:00
Jiaming Yuan
2cff735126
Update doc for feature constraints and n_gpus. (#4596)
* Update doc for feature constraints. 

* Fix some warnings.

* Clean up doc for `n_gpus`.
2019-06-23 14:37:22 +08:00
Andy Adinets
9fa29ad753 Set reg_lambda=1e-5 for scikit-learn-like random forest classes. (#4558) 2019-06-22 08:02:13 +12:00
Philip Hyunsu Cho
30e1cb4e9e Fix docstring for XGBModel.predict() [skip ci] (#4592) 2019-06-21 12:44:42 +08:00
Jiaming Yuan
4591039eba
Remove remaining reg:linear. (#4544) 2019-06-11 16:04:09 +08:00
Philip Hyunsu Cho
c2a3902ba3
Fix #4497: Enable feature importance property for DART booster (#4525) 2019-05-31 15:11:57 -07:00
Philip Hyunsu Cho
bbe0dbd7ec
Migrate pylint check to Python 3 (#4381)
* Migrate lint to Python 3

* Fix lint errors

* Use Miniconda3 to use Python 3.7

* Use latest pylint and astroid
2019-04-21 01:01:54 -07:00
Jiaming Yuan
7b1b11390a
Mark Scikit-Learn RF interface as experimental in doc. (#4258)
* Mark Scikit-Learn RF interface as experimental in doc.
2019-03-16 00:45:32 +08:00
Andy Adinets
4352fcdb15 Brought the silent parameter for the SKLearn-like API back, marked it deprecated. (#4255)
* Brought the silent parameter for the SKLearn-like API back, marked it deprecated.

- added deprecation notice and warning
- removed silent from the tests for the SKLearn-like API
2019-03-14 09:45:08 +13:00
Andy Adinets
a36c3ed4f4 Added SKLearn-like random forest Python API. (#4148)
* Added SKLearn-like random forest Python API.

- added XGBRFClassifier and XGBRFRegressor classes to SKL-like xgboost API
- also added n_gpus and gpu_id parameters to SKL classes
- added documentation describing how to use xgboost for random forests,
  as well as existing caveats
2019-03-12 22:28:19 +08:00
Philip Hyunsu Cho
99a290489c
Update Python docstring for ranking functions (#4121)
* Update Python docstring for ranking functions

* Fix formatting
2019-02-10 12:22:02 -08:00
tmitanitky
59f868bc60 enable xgb_model in sklearn XGBClassifier and test. (#4092)
* Enable xgb_model parameter in XGBClassifier scikit-learn API

https://github.com/dmlc/xgboost/issues/3049

* add test_XGBClassifier_resume():

test for xgb_model parameter in XGBClassifier API.

* Update test_with_sklearn.py

* Fix lint
2019-01-31 11:29:19 -08:00
Jiaming Yuan
e0a279114e
Unify logging facilities. (#3982)
* Unify logging facilities.

* Enhance `ConsoleLogger` to handle different verbosity.
* Override macros from `dmlc`.
* Don't use specialized gamma when building with GPU.
* Remove verbosity cache in monitor.
* Test monitor.
* Deprecate `silent`.
* Fix doc and messages.
* Fix python test.
* Fix silent tests.
2018-12-14 19:29:58 +08:00
lyxthe
53f695acf2 scikit-learn api section documentation correction (#3967)
* update description of early stopping rounds

The description of early stopping rounds was inconsistent in the scikit-learn API section: the fit paragraph says that when early stopping occurs, the model from the last iteration is returned, not the best one, but the predict paragraph says that when predict is called without ntree_limit specified, ntree_limit defaults to best_ntree_limit.

A reader of the fit paragraph could therefore conclude that the best iteration must be specified when calling predict, whereas according to the predict paragraph the best iteration is used by default and it is the last iteration that has to be specified if needed.

* Update sklearn.py

* Update sklearn.py

fix doc according to the python_lightweight_test error
2018-12-14 00:27:04 -08:00
Philip Hyunsu Cho
f9302a56fb
Fix #3894: Allow loading pickles without self.booster attributes (#3938)
The addition of self.booster attribute broke backward compatibility.
2018-11-23 12:15:50 -08:00
Dr. Kashif Rasul
143475b27b use gain for sklearn feature_importances_ (#3876)
* use gain for sklearn feature_importances_

`gain` is a better feature importance criterion than the currently used `weight`

* added importance_type to class

* fixed test

* white space

* fix variable name

* fix deprecation warning

* fix exp array

* white spaces
2018-11-13 03:30:40 -08:00
Philip Hyunsu Cho
ad6e0d55f1
Fix coef_ and intercept_ signature to be compatible with sklearn.RFECV (#3873)
* Fix coef_ and intercept_ signature to be compatible with sklearn.RFECV

* Fix lint

* Fix lint
2018-11-08 19:41:35 -08:00
Philip Hyunsu Cho
e04ab56b57
Fix #3747: Add coef_ and intercept_ as properties of sklearn wrapper (#3855)
* Fix #3747: Add coef_ and intercept_ as properties of sklearn wrapper

Scikit-learn expects linear learners to expose `coef_` and `intercept_`
as properties.

Closes #3747.

* Fix lint
2018-11-02 01:44:37 -07:00
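A minimal sketch of the idea behind the two commits above, exposing `coef_` and `intercept_` as read-only properties. `ToyEstimator` is a hypothetical stand-in, not the XGBoost wrapper. Raising `AttributeError` for non-linear boosters is an assumption made here so that `hasattr()` checks report the attribute as absent.

```python
class ToyEstimator:
    """Hypothetical estimator used only to illustrate the property pattern."""

    def __init__(self, booster="gbtree", weights=None, bias=0.0):
        self.booster = booster
        self._weights = list(weights or [])
        self._bias = bias

    @property
    def coef_(self):
        # Assumption: coefficients only make sense for a linear booster.
        if self.booster != "gblinear":
            raise AttributeError("coef_ is only defined for the gblinear booster")
        return list(self._weights)

    @property
    def intercept_(self):
        if self.booster != "gblinear":
            raise AttributeError("intercept_ is only defined for the gblinear booster")
        return self._bias


linear = ToyEstimator(booster="gblinear", weights=[0.5, -2.0], bias=0.1)
tree = ToyEstimator(booster="gbtree")

print(linear.coef_, linear.intercept_)  # [0.5, -2.0] 0.1
print(hasattr(tree, "coef_"))           # False
```

Because a property that raises `AttributeError` makes `hasattr()` return `False`, tools that probe for `coef_` can tell linear and non-linear models apart without calling a method.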
Rory Mitchell
42200ec03e
Allow XGBRanker sklearn interface to use other xgboost ranking objectives (#3848) 2018-11-01 13:34:25 +13:00
Philip Hyunsu Cho
d83c818000
Recommend pickling as the way to save XGBClassifier / XGBRegressor / XGBRanker (#3829)
The `save_model()` and `load_model()` methods only save the part of the model
that's common to all language interfaces and do not preserve Python-specific
attributes, such as `feature_names`. More crucially, the label encoder is not
preserved either; this is needed for the scikit-learn wrapper, since you may
have string labels.

Fix: Explicitly recommend pickling as the way to save scikit-learn model
objects.
2018-10-25 11:12:41 -07:00
Rory Mitchell
5d6baed998
Allow sklearn grid search over parameters specified as kwargs (#3791) 2018-10-14 12:44:53 +13:00
Philip Hyunsu Cho
c23783a0d1
Add notes to doc (#3765) 2018-10-07 14:09:09 -07:00
mrgutkun
4b43810f51 Fix #3663: Allow sklearn API to use callbacks (#3682)
* Fix #3663: Allow sklearn API to use callbacks

* Fix lint

* Add Callback API to Python API doc
2018-09-07 13:51:26 -07:00
Philip Hyunsu Cho
5a8bbb39a1
Revert #3677 and #3674 (#3678)
* Revert "Add scikit-learn as dependency for doc build (#3677)"

This reverts commit 308f664ade0547242608e21f6198c895415f03da.

* Revert "Add scikit-learn tests (#3674)"

This reverts commit d176a0fbc8165e3afe3e42ff464ab7b253211555.
2018-09-06 20:43:17 -07:00
Philip Hyunsu Cho
d176a0fbc8
Add scikit-learn tests (#3674)
* Add scikit-learn tests

Goal is to pass scikit-learn's check_estimator() for XGBClassifier,
XGBRegressor, and XGBRanker. It is actually not possible to do so
entirely, since check_estimator() assumes that NaN is disallowed,
but XGBoost allows for NaN as missing values. However, it is still a
good idea to add some checks inspired by check_estimator().

* Fix lint

* Fix lint
2018-09-06 09:55:28 -07:00
Philip Hyunsu Cho
86d88c0758
Fix #3648: XGBClassifier.predict() should return margin scores when output_margin=True (#3651)
* Fix #3648: XGBClassifier.predict() should return margin scores when output_margin=True

* Fix tests to reflect correct implementation of XGBClassifier.predict(output_margin=True)

* Fix flaky test test_with_sklearn.test_sklearn_api_gblinear
2018-08-30 21:05:05 -07:00
Philip Hyunsu Cho
7b1427f926
Add validate_features parameter to sklearn API (#3653) 2018-08-29 23:21:46 -07:00
Philip Hyunsu Cho
4ed8a88240
Update Python API doc (#3619)
* Add XGBRanker to Python API doc

* Show inherited members of XGBRegressor in API doc, since XGBRegressor uses default methods from XGBModel

* Add table of contents to Python API doc

* Skip JVM doc download if not available

* Show inherited members for XGBRegressor and XGBRanker

* Expose XGBRanker to Python XGBoost module directory

* Add docstring to XGBRegressor.predict() and XGBRanker.predict()

* Fix rendering errors in Python docstrings

* Fix lint
2018-08-22 18:59:30 -07:00
Shiki-H
24a268a2e3 sklearn api for ranking (#3560)
* added xgbranker

* fixed predict method and ranking test

* reformatted code in accordance with pep8

* fixed lint error

* fixed docstring and added checks on objective

* added ranking demo for python

* fixed suffix in rank.py
2018-08-21 08:26:48 -07:00
Philip Hyunsu Cho
3c72654e3b
Revert "Fix #3485, #3540: Don't use dropout for predicting test sets" (#3563)
* Revert "Fix #3485, #3540: Don't use dropout for predicting test sets (#3556)"

This reverts commit 44811f233071c5805d70c287abd22b155b732727.

* Document behavior of predict() for DART booster

* Add notice to parameter.rst
2018-08-08 09:48:55 -07:00
kodonnell
6bed54ac39 python sklearn api: defaulting to best_ntree_limit if defined, otherwise current behaviour (#3445)
* python sklearn api: defaulting to best_ntree_limit if defined, otherwise current behaviour

* Fix whitespace
2018-07-08 14:35:52 -07:00
Mike Liu
594bcea83e Save and load model in sklearn API (#3192)
* Add (load|save)_model to XGBModel

* Add docstring

* Fix docstring

* Fix mixed use of space and tab

* Add a test

* Fix Flake8 style errors
2018-06-30 19:21:49 +00:00
Yanbo Liang
b018ef104f Remove output_margin from XGBClassifier.predict_proba argument list. (#3343) 2018-05-28 10:30:21 -07:00
pdavalo
480e3fd764 Sklearn: validation set weights (#2354)
* Add option to use weights when evaluating metrics in validation sets

* Add test for validation-set weights functionality

* simplify case with no weights for test sets

* fix lint issues
2018-05-23 17:06:20 -07:00
Felipe Arruda Pontes
81d1b17f9c adding some docs based on core.Boost.predict (#1865) 2018-02-09 06:38:38 -08:00
csgwma
33ac8a0927 delete duplicated code in python-package (#2985) 2017-12-30 20:26:35 +08:00
Julian Niedermeier
9a81c74a7b Add xgb_model parameter to sklearn fit (#2623)
Adding the xgb_model parameter allows training to continue from an existing model.
The model has to be saved by calling `model.get_booster().save_model(path)`
2017-10-01 08:47:17 -04:00
Andrew Hannigan
5c9f0ff9d9 Check existence of seed/nthread keys before checking their value. (#2669) 2017-09-27 03:05:59 -04:00
Tsukasa OMOTO
8d15024ac7 python: follow the default warning filters of Python (#2666)
* python: follow the default warning filters of Python

https://docs.python.org/3/library/warnings.html#default-warning-filters

* update tests

* update tests
2017-09-27 03:03:01 -04:00
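For context, the sketch below shows how Python's warning filters decide whether a `DeprecationWarning` is surfaced. It uses only the standard `warnings` module. `old_style_call` and `count_warnings` are hypothetical helpers, and the snippet illustrates the filter mechanism in general rather than the exact change made in this commit.

```python
import warnings


def old_style_call():
    # Hypothetical library function that emits a deprecation notice.
    warnings.warn("nthread is deprecated, use n_jobs", DeprecationWarning, stacklevel=2)
    return "ok"


def count_warnings(action):
    # Record the warnings raised while the given filter action is active.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action, DeprecationWarning)
        old_style_call()
    return len(caught)


shown_when_forced = count_warnings("always")   # a filter that always shows it
shown_when_ignored = count_warnings("ignore")  # a filter that hides it
print(shown_when_forced, shown_when_ignored)   # 1 0
```

A library that installs its own "always" filter overrides whatever policy the application chose. Leaving the filters untouched lets the interpreter defaults, or the user's `-W` options, decide.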
PSEUDOTENSOR / Jonathan McKinney
0664298bb2 Update sklearn API to pass along n_jobs to DMatrix creation (#2658) 2017-08-31 15:24:59 +12:00
René Scheibe
a0c5bde024 Fix typo in sklearn documentation (#2580) 2017-08-07 19:06:11 +02:00
wxchan
65d2513714 [python-package] fix sklearn n_jobs/nthreads and seed/random_state bug (#2378)
* add a testcase causing RuntimeError

* move seed/random_state/nthread/n_jobs check to get_xgb_params()

* fix failed test
2017-06-12 09:33:42 -04:00
gaw89
0f3a404d91 Sklearn kwargs (#2338)
* Added kwargs support for Sklearn API

* Updated NEWS and CONTRIBUTORS

* Fixed CONTRIBUTORS.md

* Added clarification of **kwargs and test for proper usage

* Fixed lint error

* Fixed more lint errors and clf assigned but never used

* Fixed more lint errors

* Fixed more lint errors

* Fixed issue with changes from different branch bleeding over

* Fixed issue with changes from other branch bleeding over

* Added note that kwargs may not be compatible with Sklearn

* Fixed linting on kwargs note
2017-05-23 21:47:53 -05:00
gaw89
6cea1e3fb7 Sklearn convention update (#2323)
* Added n_jobs and random_state to keep up to date with sklearn API.
Deprecated nthread and seed.  Added tests for new params and
deprecations.

* Fixed docstring to reflect updates to n_jobs and random_state.

* Fixed whitespace issues and removed nose import.

* Added deprecation note for nthread and seed in docstring.

* Attempted fix of deprecation tests.

* Second attempted fix to tests.

* Set n_jobs to 1.
2017-05-22 08:22:05 -05:00
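The deprecation described in this commit follows a common alias pattern, sketched below with a hypothetical `ToyModel` class. The parameter names come from the commit message. The warning texts and the precedence of the old names over the new ones are assumptions made for illustration.

```python
import warnings


class ToyModel:
    """Hypothetical estimator; it only illustrates the alias pattern."""

    def __init__(self, n_jobs=1, random_state=0, nthread=None, seed=None):
        if nthread is not None:
            warnings.warn("The nthread parameter is deprecated, use n_jobs instead.",
                          DeprecationWarning, stacklevel=2)
            n_jobs = nthread  # assumption: the legacy value wins when given
        if seed is not None:
            warnings.warn("The seed parameter is deprecated, use random_state instead.",
                          DeprecationWarning, stacklevel=2)
            random_state = seed
        self.n_jobs = n_jobs
        self.random_state = random_state


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = ToyModel(nthread=4, seed=42)

print(legacy.n_jobs, legacy.random_state, len(caught))  # 4 42 2
```

Defaulting the old names to `None` makes it possible to tell "not passed" apart from an explicit value, so callers that use only the new names never see a warning.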
jayzed82
29289d2302 Add option to choose booster in scikit interface (gbtree by default) (#2303)
* Add option to choose booster in scikit interface (gbtree by default)

* Add option to choose booster in scikit interface: complete docstring.

* Fix XGBClassifier to work with booster option

* Added test case for gblinear booster
2017-05-18 23:12:27 -04:00
Srivatsan Ramanujam
036ee55fe0 adding sample weights for XGBRegressor (was this forgotten?) (#1874) 2017-01-21 11:58:03 -08:00
ccphillippi
dd477ac903 Move feature_importances_ to base XGBModel for XGBRegressor access (#1591) 2016-12-01 10:17:37 -08:00