811 Commits

Author SHA1 Message Date
Philip Hyunsu Cho
bd41bd6605
Better error message for failed library loading (#3690)
* Better error message for failed lib loading

* Address review comment + fix lint
2018-09-12 22:37:26 -07:00
Philip Hyunsu Cho
3564b68b98
Fix #3397: early_stop callback does not maximize metric of form NDCG@n- (#3685)
* Fix #3397: early_stop callback does not maximize metric of form NDCG@n-

Early stopping callback makes splits with '-' letter, which interferes
with metrics of form NDCG@n-. As a result, XGBoost tries to minimize
NDCG@n-, where it should be maximized instead.

Fix. Specify maxsplit=1.

* Python 2.x compatibility fix
2018-09-08 19:46:25 -07:00
mrgutkun
4b43810f51 Fix #3663: Allow sklearn API to use callbacks (#3682)
* Fix #3663: Allow sklearn API to use callbacks

* Fix lint

* Add Callback API to Python API doc
2018-09-07 13:51:26 -07:00
Philip Hyunsu Cho
5a8bbb39a1
Revert #3677 and #3674 (#3678)
* Revert "Add scikit-learn as dependency for doc build (#3677)"

This reverts commit 308f664ade0547242608e21f6198c895415f03da.

* Revert "Add scikit-learn tests (#3674)"

This reverts commit d176a0fbc8165e3afe3e42ff464ab7b253211555.
2018-09-06 20:43:17 -07:00
Philip Hyunsu Cho
d176a0fbc8
Add scikit-learn tests (#3674)
* Add scikit-learn tests

Goal is to pass scikit-learn's check_estimator() for XGBClassifier,
XGBRegressor, and XGBRanker. It is actually not possible to do so
entirely, since check_estimator() assumes that NaN is disallowed,
but XGBoost allows for NaN as missing values. However, it is always
good ideas to add some checks inspired by check_estimator().

* Fix lint

* Fix lint
2018-09-06 09:55:28 -07:00
Philip Hyunsu Cho
86d88c0758
Fix #3648: XGBClassifier.predict() should return margin scores when output_margin=True (#3651)
* Fix #3648: XGBClassifier.predict() should return margin scores when output_margin=True

* Fix tests to reflect correct implementation of XGBClassifier.predict(output_margin=True)

* Fix flaky test test_with_sklearn.test_sklearn_api_gblinear
2018-08-30 21:05:05 -07:00
Philip Hyunsu Cho
7b1427f926
Add validate_features parameter to sklearn API (#3653) 2018-08-29 23:21:46 -07:00
Philip Hyunsu Cho
4ed8a88240
Update Python API doc (#3619)
* Add XGBRanker to Python API doc

* Show inherited members of XGBRegressor in API doc, since XGBRegressor uses default methods from XGBModel

* Add table of contents to Python API doc

* Skip JVM doc download if not available

* Show inherited members for XGBRegressor and XGBRanker

* Expose XGBRanker to Python XGBoost module directory

* Add docstring to XGBRegressor.predict() and XGBRanker.predict()

* Fix rendering errors in Python docstrings

* Fix lint
2018-08-22 18:59:30 -07:00
Shiki-H
24a268a2e3 sklearn api for ranking (#3560)
* added xgbranker

* fixed predict method and ranking test

* reformatted code in accordance with pep8

* fixed lint error

* fixed docstring and added checks on objective

* added ranking demo for python

* fixed suffix in rank.py
2018-08-21 08:26:48 -07:00
Grace Lam
993e62b9e7 Add JSON model dump functionality (#3603)
* Add JSON model dump functionality

* Fix lint
2018-08-17 16:18:43 -07:00
trivialfis
7c82dc92b2 Fix accessing DMatrix.handle before set. (#3599)
Close #3597.
2018-08-16 15:26:06 -07:00
Philip Hyunsu Cho
96826a3515
Release version 0.80 (#3541)
* Up versions

* Write release note for 0.80
2018-08-13 01:38:37 -07:00
Philip Hyunsu Cho
3c72654e3b
Revert "Fix #3485, #3540: Don't use dropout for predicting test sets" (#3563)
* Revert "Fix #3485, #3540: Don't use dropout for predicting test sets (#3556)"

This reverts commit 44811f233071c5805d70c287abd22b155b732727.

* Document behavior of predict() for DART booster

* Add notice to parameter.rst
2018-08-08 09:48:55 -07:00
wenduowang
3b62e75f2e Fix bug of using list(x) function when x is string (#3432)
* Fix bug of using list(x) function when x is string

list('abcdcba') = ['a', 'b', 'c', 'd', 'c', 'b', 'a']

* Allow feature_names/feature_types to be of any type

If feature_names/feature_types is iterable, e.g. tuple, list, then convert the value to list, except for string; otherwise construct a list with a single value

* Delete excess whitespace

* Fix whitespace to pass lint
2018-07-30 07:36:34 -07:00
jqmp
e9a97e0d88 Add total_gain and total_cover importance measures (#3498)
Add `'total_gain'` and `'total_cover'` as possible `importance_type`
arguments to `Booster.get_score` in the Python package.

`get_score` already accepts a `'gain'` argument, which returns each
feature's average gain over all of its splits.  `'total_gain'` does the
same, but returns a total rather than an average.  This seems more
intuitively meaningful, and also matches the behavior of the R package's
`xgb.importance` function.

I also added an analogous `'total_cover'` command for consistency.

This should resolve #3484.
2018-07-23 00:30:55 -07:00
KOLANICH
a393d44c5d Improved library loading a bit (#3481)
* Improved library loading a bit

* Fixed indentation.

* Fixes according to the discussion

* Moved the comment to a separate line.
* specified exception type
2018-07-20 16:03:44 -07:00
Philip Hyunsu Cho
8e90b60c4d
Fix relpath in setup.py on Windows (#3493)
* Fix relpath in setup.py on Windows

Fixes #3480.

* Use only one lib file; use 4 space indent
2018-07-20 12:28:08 -07:00
kodonnell
6bed54ac39 python sklearn api: defaulting to best_ntree_limit if defined, otherwise current behaviour (#3445)
* python sklearn api: defaulting to best_ntree_limit if defined, otherwise current behaviour

* Fix whitespace
2018-07-08 14:35:52 -07:00
Philip Hyunsu Cho
66e74d2223 Fix get_uint_info() (#3442)
* Add regression test
2018-07-05 20:06:59 -07:00
Philip Hyunsu Cho
48d6e68690
Add callback interface to re-direct console output (#3438)
* Add callback interface to re-direct console output

* Exempt TrackerLogger from custom logging

* Fix lint
2018-07-05 11:32:30 -07:00
Oliver Laslett
18813a26ab allow arbitrary cross validation fold indices (#3353)
* allow arbitrary cross validation fold indices

 - use training indices passed to `folds` parameter in `training.cv`
 - update doc string

* add tests for arbitrary fold indices
2018-06-30 19:23:49 +00:00
Mike Liu
594bcea83e Save and load model in sklearn API (#3192)
* Add (load|save)_model to XGBModel

* Add docstring

* Fix docstring

* Fix mixed use of space and tab

* Add a test

* Fix Flake8 style errors
2018-06-30 19:21:49 +00:00
cinqS
8bec8d5e9a Better doc for save_model() / load_model() (#3143)
Be clear that they do not save Python-specific attributes
2018-06-29 04:24:33 +00:00
PSEUDOTENSOR / Jonathan McKinney
9ac163d0bb Allow import via python datatable. (#3272)
* Allow import via python datatable.

* Write unit tests

* Refactor dt API functions

* Refactor python code

* Lint fixes

* Address review comments
2018-06-20 13:16:18 -07:00
ngoyal2707
902ecbade8 added python doc string for nthreads to dmatrix (#3363) 2018-06-08 14:16:30 +12:00
Philip Hyunsu Cho
1214081f99
Release version 0.72 (#3337) 2018-06-01 16:00:31 -07:00
Kristian Gampong
a510e68dda Add validate_features option for Booster predict (#3323)
* Add validate_features option for Booster predict

* Fix trailing whitespace in docstring
2018-05-29 11:40:49 -07:00
Yanbo Liang
b018ef104f Remove output_margin from XGBClassifier.predict_proba argument list. (#3343) 2018-05-28 10:30:21 -07:00
pdavalo
480e3fd764 Sklearn: validation set weights (#2354)
* Add option to use weights when evaluating metrics in validation sets

* Add test for validation-set weights functionality

* simplify case with no weights for test sets

* fix lint issues
2018-05-23 17:06:20 -07:00
mallniya
039dbe6aec freebsd support in libpath.py (#3247) 2018-05-09 16:13:30 -07:00
Rory Mitchell
a185ddfe03
Implement GPU accelerated coordinate descent algorithm (#3178)
* Implement GPU accelerated coordinate descent algorithm. 

* Exclude external memory tests for GPU
2018-04-20 14:56:35 +12:00
Philip Hyunsu Cho
230cb9b787
Release version 0.71 (#3200) 2018-04-11 21:43:32 +09:00
Philip Hyunsu Cho
017acf54d9
Fix up make pippack command for building source package for PyPI (#3199)
* Now `make pippack` works without any manual action: it will produce
  xgboost-[version].tar.gz, which one can use by typing
  `pip3 install xgboost-[version].tar.gz`.
* Detect OpenMP-capable compilers (clang, gcc-5, gcc-7) on MacOS
2018-03-28 10:32:52 -07:00
Andrea Bergonzo
8937134015 Update build_trouble_shooting.md (#3144) 2018-03-02 16:23:45 -08:00
Philip Hyunsu Cho
32ea70c1c9
Documenting CSV loading into DMatrix (#3137)
* Support CSV file in DMatrix

We'd just need to expose the CSV parser in dmlc-core to the Python wrapper

* Revert extra code; document existing CSV support

CSV support is already there but undocumented

* Add notice about categorical features
2018-02-28 18:41:10 -08:00
Oleg Panichev
cf19caa46a Fix for ZeroDivisionError when verbose_eval equals to 0. (#3115) 2018-02-15 17:58:06 -06:00
Felipe Arruda Pontes
81d1b17f9c adding some docs based on core.Boost.predict (#1865) 2018-02-09 06:38:38 -08:00
Scott Lundberg
d878c36c84 Add SHAP interaction effects, fix minor bug, and add cox loss (#3043)
* Add interaction effects and cox loss

* Minimize whitespace changes

* Cox loss now no longer needs a pre-sorted dataset.

* Address code review comments

* Remove mem check, rename to pred_interactions, include bias

* Make lint happy

* More lint fixes

* Fix cox loss indexing

* Fix main effects and tests

* Fix lint

* Use half interaction values on the off-diagonals

* Fix lint again
2018-02-07 20:38:01 -06:00
Zhirui Wang
bf43671841 update macOS gcc@5 installation guide (#3003)
After installing ``gcc@5``, ``CMAKE_C_COMPILER`` will not be set to gcc-5 in some macOS environment automatically and the installation of xgboost will still fail. Manually setting the compiler will solve the problem.
2018-01-04 11:28:26 -08:00
Philip Cho
4aa346c10b
Update PyPI maintainer; use VERSION for binary wheels (#2992) 2017-12-31 12:03:08 +09:00
csgwma
33ac8a0927 delete duplicated code in python-package (#2985) 2017-12-30 20:26:35 +08:00
Philip Cho
8d35c09c55 Tag version 0.7 (#2975)
* Tag version 0.7

* Document all changes made in year 2016
2017-12-30 20:16:41 +08:00
Yuchao Dai
eedca8c8ec fix the typo in core.py (#2978) 2017-12-25 21:08:27 -08:00
jac-stripe
1e3aabbadc Include symlinks to make wheel build work (#2909) 2017-12-01 11:27:58 -05:00
Jerry Dumblauskas
5867c1b96d update doc string for grid parameter (#2647)
* update doc string for grid parameter

* update doc string for grid parameter
2017-11-29 11:22:46 -08:00
Rajiv Abraham
77715d5c62 Update to correct brew gcc command (#1931)
The previous command did not work for me. This one did.
2017-11-29 11:20:49 -08:00
Sam O
602b34ab91 Fix performance of c_array in python core.py (#2786) 2017-11-29 11:12:49 -08:00
Joe Nyland
88177691b8 Update README (#2204)
I found the installation of the Python XGBoost package to be problematic as the documentation around compiler requirements was unclear, as discussed in #1501. I decided that I would improve the README.
2017-11-19 17:12:16 -08:00
Rory Mitchell
16c63f30d0
Fix MultiIndex detection (breaks for latest pandas==0.21.0). (#2872) 2017-11-11 11:12:23 +13:00
caoyi
3610025fb6 Fix typo (#2818)
Fix typo
2017-10-23 10:45:49 -05:00