* SHAP values for feature contributions
* Fix commenting error
* New polynomial time SHAP value estimation algorithm
* Update API to support SHAP values
* Fix merge conflicts with updates in master
* Correct submodule hashes
* Fix variable sized stack allocation
* Make lint happy
* Add docs
* Fix typo
* Adjust tolerances
* Remove unneeded def
* Fixed cpp test setup
* Updated R API and cleaned up
* Fixed test typo
* Repaired serialization after the update process; fixes #2545
* Non-stratified folds in Python could omit some data instances
* Makefile: fixes for older makes on Windows; clean R-package too
* Make cub a shallow submodule
* Improve $(MAKE) recovery
* Fixed DLL name on Windows in ``xgboost.libpath``
* Added support for OS X to ``xgboost.libpath``
* Use .dylib for shared library on OS X
This does not affect the JNI library, because it is not truly
cross-platform in the Makefile build anyway.
Don't use implicit conversions to c_int, which incidentally happen to work
on (some) 64-bit platforms, but:
* may lead to truncation of the input value to a 32-bit signed int,
* may cause segfaults on some 32-bit architectures (tested on Ubuntu ARM;
this is also the likely cause of issue #1707).
Also, when passing references, use explicit 64-bit integers where needed
instead of c_ulong, which is not guaranteed to be that large.
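As an illustration, here is a minimal sketch of the explicit-width pattern; the variable names and the commented-out call are hypothetical, not the actual `xgboost.core` code:

```python
import ctypes

# Construct explicit 64-bit integers for sizes and out-parameters instead of
# letting ctypes coerce a bare Python int to c_int, which can truncate the
# value on 32-bit targets.
num_rows = ctypes.c_uint64(10000)   # pass sizes as explicit 64-bit values
out_len = ctypes.c_uint64(0)        # out-parameter, explicitly 64-bit

# A call such as lib.XGDMatrixNumRow(handle, ctypes.byref(out_len)) would
# then fill out_len without any 32-bit truncation.
print(num_rows.value, out_len.value)
```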
* Added kwargs support for Sklearn API
* Updated NEWS and CONTRIBUTORS
* Fixed CONTRIBUTORS.md
* Added clarification of **kwargs and test for proper usage
* Fixed lint error
* Fixed more lint errors and clf assigned but never used
* Fixed more lint errors
* Fixed more lint errors
* Fixed issue with changes from different branch bleeding over
* Fixed issue with changes from other branch bleeding over
* Added note that kwargs may not be compatible with Sklearn
* Fixed linting on kwargs note
* Added n_jobs and random_state to keep up to date with sklearn API.
Deprecated nthread and seed. Added tests for new params and
deprecations.
* Fixed docstring to reflect updates to n_jobs and random_state.
* Fixed whitespace issues and removed nose import.
* Added deprecation note for nthread and seed in docstring.
* Attempted fix of deprecation tests.
* Second attempted fix to tests.
* Set n_jobs to 1.
* Add option to choose booster in scikit interface (gbtree by default)
* Add option to choose booster in scikit interface: complete docstring.
* Fix XGBClassifier to work with booster option
* Added test case for gblinear booster
* Add prediction of feature contributions
This implements the idea described at http://blog.datadive.net/interpreting-random-forests/,
which gives insight into how a prediction is composed of its feature contributions
and a bias (see the sketch after these entries).
* Support multi-class models
* Calculate learning_rate per-tree instead of using the one from the first tree
* Do not rely on node.base_weight * learning_rate having the same value as the node mean value (aka leaf value, if it were a leaf); instead calculate them (lazily) on-the-fly
* Add simple test for contributions feature
* Check against param.num_nodes instead of checking for non-zero length
* Loop over all roots instead of only the first
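A minimal sketch of the feature, assuming the Python wrapper exposes it as `predict(pred_contribs=True)`; the data is synthetic:

```python
import numpy as np
import xgboost as xgb

X = np.random.rand(100, 4)
y = np.random.rand(100)
dtrain = xgb.DMatrix(X, label=y)
bst = xgb.train({"objective": "reg:linear"}, dtrain, num_boost_round=10)

# Per-feature contributions plus a final bias column, one row per sample
contribs = bst.predict(dtrain, pred_contribs=True)   # shape (100, 4 + 1)

# The contributions and the bias sum to the raw (margin) prediction
margin = bst.predict(dtrain, output_margin=True)
print(np.allclose(contribs.sum(axis=1), margin, atol=1e-4))
```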
The verbose_eval docs claim it will log the last iteration (http://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.train); this is also consistent with the behavior from 0.4. Not a huge deal, but I found it handy to see the last iteration's result because my period is usually large.
This doesn't address logging the last stage found by early_stopping (as noted in the docs), as I'm not sure how to do that.
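For example (a sketch with synthetic data), a large `verbose_eval` period still prints the final round:

```python
import numpy as np
import xgboost as xgb

X = np.random.rand(200, 5)
y = np.random.rand(200)
dtrain = xgb.DMatrix(X, label=y)

# verbose_eval=10 prints every 10th round *and* the last round (25),
# matching the documented behaviour.
xgb.train({"objective": "reg:linear"}, dtrain, num_boost_round=25,
          evals=[(dtrain, "train")], verbose_eval=10)
```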
* A fix for compatibility with Python 2.6
The syntax {n: self.attr(n) for n in attr_names} is illegal in Python 2.6 (dict comprehensions require Python 2.7+).
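For reference, the 2.6-compatible spelling of the same construct, using a stand-in `attr` function rather than the real `Booster.attr`:

```python
attr_names = ["best_score", "best_iteration"]

def attr(name):
    return name.upper()   # stand-in for Booster.attr()

# SyntaxError on Python 2.6 (dict comprehensions need 2.7+):
attrs = {n: attr(n) for n in attr_names}

# Python 2.6-compatible equivalent:
attrs_26 = dict((n, attr(n)) for n in attr_names)
print(attrs == attrs_26)   # True
```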
* Update core.py
add a space after comma
* Added the max_features parameter to the plot_importance function
* Renamed the max_features parameter to max_num_features for clarity
* Removed an unwanted character in the docstring
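A short sketch of the renamed parameter (requires matplotlib; the data is synthetic):

```python
import numpy as np
import xgboost as xgb

X = np.random.rand(100, 10)
y = np.random.randint(2, size=100)
bst = xgb.train({"objective": "binary:logistic"},
                xgb.DMatrix(X, label=y), num_boost_round=10)

# Show only the 5 most important features
xgb.plot_importance(bst, max_num_features=5)
```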
* Option to shuffle data in mknfolds
* Removed the possibility to run as a standalone test
* Split the function def into two lines for lint
* Fix various typos
* Add override to functions that are overridden
gcc warns about functions that override virtual functions without being
marked with ``override``. This fixes those warnings.
* Use bst_float consistently
Use bst_float for all the variables that involve weight,
leaf value, gradient, hessian, gain, loss_chg, predictions,
base_margin, feature values.
In cases where the value can grow larger, for example due to accumulation,
double is used instead.
This ensures that type conversions are minimal and reduces loss of
precision.
* Allow using learning_rates parameter when doing CV
- Create a new `callback_cv` method that works when called from `xgb.cv()`
- Rename the existing `callback` to `callback_train` and make it the default callback
- Move the logic out of the callbacks into a common helper (see the sketch after these entries)
* Add a learning_rates parameter to cv()
* lint
* remove caller explicit reference
* callback is aware of its calling context
* remove caller argument
* remove learning_rates param
* restore learning_rates for training, but deprecated
* lint
* lint line too long
* quick example for predefined callbacks
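A minimal sketch of per-round learning rates during cross-validation, assuming the predefined callback is exposed as `xgb.callback.reset_learning_rate` as in this release; the data is synthetic:

```python
import numpy as np
import xgboost as xgb

X = np.random.rand(200, 5)
y = np.random.rand(200)
dtrain = xgb.DMatrix(X, label=y)

# One learning rate per boosting round, applied during cross-validation
rates = [0.3] * 5 + [0.1] * 5
xgb.cv({"objective": "reg:linear"}, dtrain, num_boost_round=10, nfold=3,
       callbacks=[xgb.callback.reset_learning_rate(rates)])
```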
* Add format to the params accepted by DumpModel
Currently, only the text format is supported when dumping a model.
The plan is to add more formats, such as JSON, that are easy to read
and/or parse by machines, and to make the interface generic enough
for other formats to be added later. Hence, these functions are made
generic and accept a new parameter "format" which specifies the
format of the dump to be created (see the sketch after these entries).
* Fix typos and errors in docs
* plugin: Mention all the register macros available
Document the register macros currently available to the plugin
writers so they know what exactly can be extended using hooks.
* sparse_page_source: Use same arg name in .h and .cc
* gbm: Add JSON dump
The dump_format argument can be used to specify what type
of dump file should be created. Add functionality to dump
gblinear and gbtree into a JSON file.
The JSON file contains an array; each item is a JSON object for one tree.
For gblinear:
- The item holds the bias and weight vectors.
For gbtree:
- The item is the root node. The root node has an attribute "children"
which holds the child nodes, recursively.
* core.py: Add arg dump_format for get_dump()
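A quick sketch of the new argument from the Python side (training data is synthetic):

```python
import numpy as np
import xgboost as xgb

X = np.random.rand(100, 4)
y = np.random.randint(2, size=100)
bst = xgb.train({"objective": "binary:logistic"},
                xgb.DMatrix(X, label=y), num_boost_round=3)

text_trees = bst.get_dump(dump_format="text")   # default, human-readable
json_trees = bst.get_dump(dump_format="json")   # one JSON string per tree
print(json_trees[0][:200])
```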
* make DMatrix._init_from_npy2d only copy data when necessary
When creating a DMatrix from a 2d ndarray, the input data can be copied unnecessarily. This can be problematic when the data is already very large, as the copy risks running out of memory. The copy is temporary (it goes out of scope at the end of this function) but it still adds to peak memory usage.
``numpy.array`` copies its input by default. By adding ``copy=False``, it only does so when necessary. Since XGDMatrixCreateFromMat is read-only on the input buffer, this copy is not needed.
Also added comments explaining when a copy can still happen (if the data ordering/layout is wrong or if the type is not 32-bit float).
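A small sketch of the difference, using plain NumPy rather than the xgboost internals:

```python
import numpy as np

data32 = np.random.rand(1000, 20).astype(np.float32)   # float32, C-contiguous

# Copies unconditionally, doubling peak memory:
flat_copy = np.array(data32.reshape(data32.size), dtype=np.float32)

# Copies only when the dtype or layout actually requires it; here it does not:
flat_view = np.array(data32.reshape(data32.size), dtype=np.float32, copy=False)

print(flat_copy.base is None)      # True: an independent copy was made
print(flat_view.base is data32)    # True: shares memory with the original
```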
* remove whitespace
* Fix #1439
* Fix python_wrapper: when an eval set name contains '-', the early_stop maximize variable couldn't be set to True properly
* Add scikit-learn v0.18 compatibility
Import KFold and StratifiedKFold from sklearn.model_selection instead of sklearn.cross_validation.
* Change DeprecationWarning to ImportError
A DeprecationWarning is only emitted, not raised as an exception, so the check should work the other way around.
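A minimal sketch of the resulting import shim, keyed on ImportError:

```python
try:
    from sklearn.model_selection import KFold, StratifiedKFold
except ImportError:
    # scikit-learn < 0.18
    from sklearn.cross_validation import KFold, StratifiedKFold
```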