250 Commits

Author SHA1 Message Date
Philip Hyunsu Cho
96826a3515
Release version 0.80 (#3541)
* Up versions

* Write release note for 0.80
2018-08-13 01:38:37 -07:00
Philip Hyunsu Cho
3c72654e3b
Revert "Fix #3485, #3540: Don't use dropout for predicting test sets" (#3563)
* Revert "Fix #3485, #3540: Don't use dropout for predicting test sets (#3556)"

This reverts commit 44811f233071c5805d70c287abd22b155b732727.

* Document behavior of predict() for DART booster

* Add notice to parameter.rst
2018-08-08 09:48:55 -07:00
wenduowang
3b62e75f2e Fix bug of using list(x) function when x is string (#3432)
* Fix bug of using list(x) function when x is string

list('abcdcba') = ['a', 'b', 'c', 'd', 'c', 'b', 'a']

* Allow feature_names/feature_types to be of any type

If feature_names/feature_types is iterable, e.g. tuple, list, then convert the value to list, except for string; otherwise construct a list with a single value

* Delete excess whitespace

* Fix whitespace to pass lint
2018-07-30 07:36:34 -07:00
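The conversion rule described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the pull request; the helper name `as_name_list` is hypothetical.

```python
from collections.abc import Iterable


def as_name_list(value):
    """Hypothetical helper: normalise feature_names/feature_types to a list."""
    # A string is iterable, but list("abc") would split it into characters,
    # so a string is treated as a single value.
    if isinstance(value, str):
        return [value]
    # Tuples, lists and other iterables are converted element by element.
    if isinstance(value, Iterable):
        return list(value)
    # Anything else becomes a one-element list.
    return [value]


print(as_name_list("abcdcba"))      # ['abcdcba']
print(as_name_list(("f0", "f1")))   # ['f0', 'f1']
```

Checking for `str` before the general iterable case is what avoids the `list('abcdcba')` pitfall quoted in the commit message.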
jqmp
e9a97e0d88 Add total_gain and total_cover importance measures (#3498)
Add `'total_gain'` and `'total_cover'` as possible `importance_type`
arguments to `Booster.get_score` in the Python package.

`get_score` already accepts a `'gain'` argument, which returns each
feature's average gain over all of its splits.  `'total_gain'` does the
same, but returns a total rather than an average.  This seems more
intuitively meaningful, and also matches the behavior of the R package's
`xgb.importance` function.

I also added an analogous `'total_cover'` option for consistency.

This should resolve #3484.
2018-07-23 00:30:55 -07:00
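The difference between the average (`'gain'`) and total (`'total_gain'`) measures can be illustrated without the library. The sketch below assumes the per-split gains of each feature are already available; the function name and the numbers are made up for illustration.

```python
def summarize_gain(split_gains):
    """Return (average, total) gain per feature from per-split gains."""
    average = {f: sum(g) / len(g) for f, g in split_gains.items()}
    total = {f: sum(g) for f, g in split_gains.items()}
    return average, total


# Made-up numbers: f0 is used in two splits, f1 in one.
split_gains = {"f0": [2.0, 4.0], "f1": [5.0]}
average, total = summarize_gain(split_gains)
print(average)  # {'f0': 3.0, 'f1': 5.0}
print(total)    # {'f0': 6.0, 'f1': 5.0}
```

In this toy case the two measures rank the features differently: `f1` leads on average gain, while `f0` leads on total gain because it is used in more splits.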
KOLANICH
a393d44c5d Improved library loading a bit (#3481)
* Improved library loading a bit

* Fixed indentation.

* Fixes according to the discussion

* Moved the comment to a separate line.
* specified exception type
2018-07-20 16:03:44 -07:00
Philip Hyunsu Cho
8e90b60c4d
Fix relpath in setup.py on Windows (#3493)
* Fix relpath in setup.py on Windows

Fixes #3480.

* Use only one lib file; use 4 space indent
2018-07-20 12:28:08 -07:00
kodonnell
6bed54ac39 python sklearn api: defaulting to best_ntree_limit if defined, otherwise current behaviour (#3445)
* python sklearn api: defaulting to best_ntree_limit if defined, otherwise current behaviour

* Fix whitespace
2018-07-08 14:35:52 -07:00
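The defaulting behaviour named in the commit title above can be sketched with `getattr`. This is an assumption-laden illustration: the function name and the stand-in classes are hypothetical, and treating `0` as "use all trees" is an assumption about the previous behaviour.

```python
def resolve_ntree_limit(model, ntree_limit=None):
    """Pick the tree limit to use for prediction."""
    if ntree_limit is None:
        # Use best_ntree_limit if the model defines it (for example after
        # early stopping); otherwise fall back to 0 (assumed: all trees).
        return getattr(model, "best_ntree_limit", 0)
    # An explicit value from the caller always wins.
    return ntree_limit


class PlainModel:
    """Stand-in for a model trained without early stopping."""


class EarlyStoppedModel:
    """Stand-in for a model that recorded its best iteration."""
    best_ntree_limit = 42


print(resolve_ntree_limit(PlainModel()))             # 0
print(resolve_ntree_limit(EarlyStoppedModel()))      # 42
print(resolve_ntree_limit(EarlyStoppedModel(), 10))  # 10
```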
Philip Hyunsu Cho
66e74d2223 Fix get_uint_info() (#3442)
* Add regression test
2018-07-05 20:06:59 -07:00
Philip Hyunsu Cho
48d6e68690
Add callback interface to re-direct console output (#3438)
* Add callback interface to re-direct console output

* Exempt TrackerLogger from custom logging

* Fix lint
2018-07-05 11:32:30 -07:00
Oliver Laslett
18813a26ab allow arbitrary cross validation fold indices (#3353)
* allow arbitrary cross validation fold indices

 - use training indices passed to `folds` parameter in `training.cv`
 - update doc string

* add tests for arbitrary fold indices
2018-06-30 19:23:49 +00:00
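One use for arbitrary fold indices is a time-ordered split in which each training set precedes its test set. The sketch below only builds the index pairs; the helper name is hypothetical, and passing the result as the `folds` argument is an assumption based on the commit message rather than something shown here.

```python
def expanding_window_folds(n_rows, n_folds):
    """Build (train_indices, test_indices) pairs where train precedes test."""
    fold_size = n_rows // (n_folds + 1)
    folds = []
    for k in range(1, n_folds + 1):
        train = list(range(0, k * fold_size))
        test = list(range(k * fold_size, (k + 1) * fold_size))
        folds.append((train, test))
    return folds


folds = expanding_window_folds(n_rows=8, n_folds=3)
for train, test in folds:
    print(train, test)
# [0, 1] [2, 3]
# [0, 1, 2, 3] [4, 5]
# [0, 1, 2, 3, 4, 5] [6, 7]
```

Because the training indices are given explicitly, they do not have to be the complement of the test indices, which is what a fixed k-fold scheme would impose.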
Mike Liu
594bcea83e Save and load model in sklearn API (#3192)
* Add (load|save)_model to XGBModel

* Add docstring

* Fix docstring

* Fix mixed use of space and tab

* Add a test

* Fix Flake8 style errors
2018-06-30 19:21:49 +00:00
cinqS
8bec8d5e9a Better doc for save_model() / load_model() (#3143)
Be clear that they do not save Python-specific attributes
2018-06-29 04:24:33 +00:00
PSEUDOTENSOR / Jonathan McKinney
9ac163d0bb Allow import via python datatable. (#3272)
* Allow import via python datatable.

* Write unit tests

* Refactor dt API functions

* Refactor python code

* Lint fixes

* Address review comments
2018-06-20 13:16:18 -07:00
ngoyal2707
902ecbade8 added python doc string for nthreads to dmatrix (#3363) 2018-06-08 14:16:30 +12:00
Philip Hyunsu Cho
1214081f99
Release version 0.72 (#3337) 2018-06-01 16:00:31 -07:00
Kristian Gampong
a510e68dda Add validate_features option for Booster predict (#3323)
* Add validate_features option for Booster predict

* Fix trailing whitespace in docstring
2018-05-29 11:40:49 -07:00
Yanbo Liang
b018ef104f Remove output_margin from XGBClassifier.predict_proba argument list. (#3343) 2018-05-28 10:30:21 -07:00
pdavalo
480e3fd764 Sklearn: validation set weights (#2354)
* Add option to use weights when evaluating metrics in validation sets

* Add test for validation-set weights functionality

* simplify case with no weights for test sets

* fix lint issues
2018-05-23 17:06:20 -07:00
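To show what weighting a validation metric means, here is a minimal sketch of a weighted classification error. It is not the library's implementation; the function name and the data are made up for illustration.

```python
def weighted_error(labels, preds, weights=None):
    """Share of total weight that falls on misclassified rows."""
    if weights is None:
        # No weights: every row counts equally.
        weights = [1.0] * len(labels)
    wrong = sum(w for y, p, w in zip(labels, preds, weights) if y != p)
    return wrong / sum(weights)


labels = [1, 0, 1]
preds = [1, 1, 1]
print(weighted_error(labels, preds))                   # one of three rows is wrong
print(weighted_error(labels, preds, [1.0, 3.0, 1.0]))  # 0.6
```

The single misclassified row carries three fifths of the total weight, so the weighted error is noticeably higher than the unweighted one.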
mallniya
039dbe6aec freebsd support in libpath.py (#3247) 2018-05-09 16:13:30 -07:00
Rory Mitchell
a185ddfe03
Implement GPU accelerated coordinate descent algorithm (#3178)
* Implement GPU accelerated coordinate descent algorithm. 

* Exclude external memory tests for GPU
2018-04-20 14:56:35 +12:00
Philip Hyunsu Cho
230cb9b787
Release version 0.71 (#3200) 2018-04-11 21:43:32 +09:00
Philip Hyunsu Cho
017acf54d9
Fix up make pippack command for building source package for PyPI (#3199)
* Now `make pippack` works without any manual action: it will produce
  xgboost-[version].tar.gz, which one can use by typing
  `pip3 install xgboost-[version].tar.gz`.
* Detect OpenMP-capable compilers (clang, gcc-5, gcc-7) on MacOS
2018-03-28 10:32:52 -07:00
Andrea Bergonzo
8937134015 Update build_trouble_shooting.md (#3144) 2018-03-02 16:23:45 -08:00
Philip Hyunsu Cho
32ea70c1c9
Documenting CSV loading into DMatrix (#3137)
* Support CSV file in DMatrix

We'd just need to expose the CSV parser in dmlc-core to the Python wrapper

* Revert extra code; document existing CSV support

CSV support is already there but undocumented

* Add notice about categorical features
2018-02-28 18:41:10 -08:00
Oleg Panichev
cf19caa46a Fix for ZeroDivisionError when verbose_eval equals 0. (#3115) 2018-02-15 17:58:06 -06:00
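The commit above does not show where the division happened. One plausible source, assumed here purely for illustration, is a modulo by the print period; the sketch below guards against that case. The function name is hypothetical.

```python
def should_print(iteration, period):
    """Decide whether to log this iteration, given a print period."""
    # A period of 0 (or False) means "never print". Checking it first avoids
    # evaluating iteration % 0, which raises ZeroDivisionError.
    if not period:
        return False
    return iteration % period == 0


print(should_print(4, 2))  # True
print(should_print(5, 0))  # False
```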
Felipe Arruda Pontes
81d1b17f9c adding some docs based on core.Boost.predict (#1865) 2018-02-09 06:38:38 -08:00
Scott Lundberg
d878c36c84 Add SHAP interaction effects, fix minor bug, and add cox loss (#3043)
* Add interaction effects and cox loss

* Minimize whitespace changes

* Cox loss now no longer needs a pre-sorted dataset.

* Address code review comments

* Remove mem check, rename to pred_interactions, include bias

* Make lint happy

* More lint fixes

* Fix cox loss indexing

* Fix main effects and tests

* Fix lint

* Use half interaction values on the off-diagonals

* Fix lint again
2018-02-07 20:38:01 -06:00
Zhirui Wang
bf43671841 update macOS gcc@5 installation guide (#3003)
After installing ``gcc@5``, ``CMAKE_C_COMPILER`` is not automatically set to gcc-5 in some macOS environments, so the installation of xgboost still fails. Setting the compiler manually solves the problem.
2018-01-04 11:28:26 -08:00
Philip Cho
4aa346c10b
Update PyPI maintainer; use VERSION for binary wheels (#2992) 2017-12-31 12:03:08 +09:00
csgwma
33ac8a0927 delete duplicated code in python-package (#2985) 2017-12-30 20:26:35 +08:00
Philip Cho
8d35c09c55 Tag version 0.7 (#2975)
* Tag version 0.7

* Document all changes made in year 2016
2017-12-30 20:16:41 +08:00
Yuchao Dai
eedca8c8ec fix the typo in core.py (#2978) 2017-12-25 21:08:27 -08:00
jac-stripe
1e3aabbadc Include symlinks to make wheel build work (#2909) 2017-12-01 11:27:58 -05:00
Jerry Dumblauskas
5867c1b96d update doc string for grid parameter (#2647)
* update doc string for grid parameter
2017-11-29 11:22:46 -08:00
Rajiv Abraham
77715d5c62 Update to correct brew gcc command (#1931)
The previous command did not work for me. This one did.
2017-11-29 11:20:49 -08:00
Sam O
602b34ab91 Fix performance of c_array in python core.py (#2786) 2017-11-29 11:12:49 -08:00
Joe Nyland
88177691b8 Update README (#2204)
I found the installation of the Python XGBoost package to be problematic as the documentation around compiler requirements was unclear, as discussed in #1501. I decided that I would improve the README.
2017-11-19 17:12:16 -08:00
Rory Mitchell
16c63f30d0
Fix MultiIndex detection (breaks for latest pandas==0.21.0). (#2872) 2017-11-11 11:12:23 +13:00
caoyi
3610025fb6 Fix typo (#2818)
Fix typo
2017-10-23 10:45:49 -05:00
Scott Lundberg
78c4188cec SHAP values for feature contributions (#2438)
* SHAP values for feature contributions

* Fix commenting error

* New polynomial time SHAP value estimation algorithm

* Update API to support SHAP values

* Fix merge conflicts with updates in master

* Correct submodule hashes

* Fix variable sized stack allocation

* Make lint happy

* Add docs

* Fix typo

* Adjust tolerances

* Remove unneeded def

* Fixed cpp test setup

* Updated R API and cleaned up

* Fixed test typo
2017-10-12 12:35:51 -07:00
Julian Niedermeier
9a81c74a7b Add xgb_model parameter to sklearn fit (#2623)
Adding the xgb_model parameter allows model training to be continued.
Model has to be saved by calling `model.get_booster().save_model(path)`
2017-10-01 08:47:17 -04:00
Andrew Hannigan
5c9f0ff9d9 Check existence of seed/nthread keys before checking their value. (#2669) 2017-09-27 03:05:59 -04:00
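The pattern named in the commit title above is a membership test before a value test. A minimal sketch, assuming the parameters live in a plain dict; the function name is hypothetical.

```python
def read_legacy_params(params):
    """Collect the seed/nthread entries that are present and not None."""
    found = {}
    for key in ("seed", "nthread"):
        # Membership test first: params[key] on a missing key raises KeyError.
        if key in params and params[key] is not None:
            found[key] = params[key]
    return found


print(read_legacy_params({"nthread": 4}))                # {'nthread': 4}
print(read_legacy_params({"seed": None, "nthread": 2}))  # {'nthread': 2}
```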
Philip Cho
31ad40b963 Make __del__ method idempotent (#2627)
Addresses Issue #2533.
2017-09-27 03:03:55 -04:00
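An idempotent `__del__` releases its resource at most once, however often it is called. The sketch below is a generic illustration with a hypothetical class and a counter standing in for the native free call; it is not the code from the commit.

```python
class Handle:
    """Hypothetical wrapper around a native resource."""

    def __init__(self):
        self.free_calls = 0
        self.handle = object()  # stand-in for a native handle

    def __del__(self):
        # getattr guards against a partially constructed object, and the
        # None check makes repeated calls harmless.
        if getattr(self, "handle", None) is not None:
            self.free_calls += 1  # stand-in for the real free call
            self.handle = None


h = Handle()
h.__del__()
h.__del__()
print(h.free_calls)  # 1
```

Resetting the handle to `None` after freeing is what makes the second call a no-op.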
Tsukasa OMOTO
8d15024ac7 python: follow the default warning filters of Python (#2666)
* python: follow the default warning filters of Python

https://docs.python.org/3/library/warnings.html#default-warning-filters

* update tests
2017-09-27 03:03:01 -04:00
Icyblade Dai
0e85b30fdd Fix issue 2670 (#2671)
* fix issue 2670

* add python<3.6 compatibility

* fix Index

* fix Index/MultiIndex

* fix lint

* fix W0622

really nonsense

* fix lambda

* Trigger Travis

* add test for MultiIndex

* remove trailing whitespace
2017-09-19 15:49:41 -04:00
SimonAB
2e9d06443e Add show_values option to feature importances plot (#2351)
Adds an option to remove the values from the feature importance plot in Python.
2017-08-31 12:26:54 -05:00
PSEUDOTENSOR / Jonathan McKinney
0664298bb2 Update sklearn API to pass along n_jobs to DMatrix creation (#2658) 2017-08-31 15:24:59 +12:00
René Scheibe
a0c5bde024 Fix typo in sklearn documentation (#2580) 2017-08-07 19:06:11 +02:00
Vadim Khotilovich
2b3a4318c5 Several fixes (#2572)
* repaired serialization after update process; fixes #2545

* non-stratified folds in python could omit some data instances

* Makefile: fixes for older makes on windows; clean R-package too

* make cub to be a shallow submodule

* improve $(MAKE) recovery
2017-08-06 13:03:50 -05:00
PSEUDOTENSOR / Jonathan McKinney
6b375f6ad8 Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation (#2530)
* Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation from numpy arrays for python interface.
2017-07-21 14:43:17 +12:00