xgboost

Author	SHA1	Message	Date
Scott Lundberg	78c4188cec	SHAP values for feature contributions (#2438) * SHAP values for feature contributions * Fix commenting error * New polynomial time SHAP value estimation algorithm * Update API to support SHAP values * Fix merge conflicts with updates in master * Correct submodule hashes * Fix variable sized stack allocation * Make lint happy * Add docs * Fix typo * Adjust tolerances * Remove unneeded def * Fixed cpp test setup * Updated R API and cleaned up * Fixed test typo	2017-10-12 12:35:51 -07:00
PSEUDOTENSOR / Jonathan McKinney	6b375f6ad8	Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation (#2530) * Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation from numpy arrays for python interface.	2017-07-21 14:43:17 +12:00
Maurus Cuelenaere	6bd1869026	Add prediction of feature contributions (#2003) * Add prediction of feature contributions This implements the idea described at http://blog.datadive.net/interpreting-random-forests/ which tries to give insight in how a prediction is composed of its feature contributions and a bias. * Support multi-class models * Calculate learning_rate per-tree instead of using the one from the first tree * Do not rely on node.base_weight * learning_rate having the same value as the node mean value (aka leaf value, if it were a leaf); instead calculate them (lazily) on-the-fly * Add simple test for contributions feature * Check against param.num_nodes instead of checking for non-zero length * Loop over all roots instead of only the first	2017-05-14 00:58:10 -05:00
jokari69	fb0fc0c580	option to shuffle data in mknfolds (#1459) * option to shuffle data in mknfolds * removed possibility to run as stand alone test * split function def in 2 lines for lint	2016-12-23 07:53:30 +08:00
AbdealiJK	b94fcab4dc	Add dump_format=json option (#1726) * Add format to the params accepted by DumpModel Currently, only the text format is supported when trying to dump a model. The plan is to add more such formats like JSON which are easy to read and/or parse by machines. And to make the interface for this even more generic to allow other formats to be added. Hence, we make some modifications to make these functions generic and accept a new parameter "format" which signifies the format of the dump to be created. * Fix typos and errors in docs * plugin: Mention all the register macros available Document the register macros currently available to the plugin writers so they know what exactly can be extended using hooks. * sparse_page_source: Use same arg name in .h and .cc * gbm: Add JSON dump The dump_format argument can be used to specify what type of dump file should be created. Add functionality to dump gblinear and gbtree into a JSON file. The JSON file has an array, each item is a JSON object for the tree. For gblinear: - The item is the bias and weights vectors For gbtree: - The item is the root node. The root node has an attribute "children" which holds the children nodes. This happens recursively. * core.py: Add arg dump_format for get_dump()	2016-11-04 09:55:25 -07:00
tqchen	149589c583	[PYTHON] Refactor training API to use callback	2016-05-19 21:31:23 -07:00
Alistair Johnson	6750c8b743	Added other feature importances in python package (#1135) * added new function to calculate other feature importances * added capability to plot other feature importance measures * changed plotting default to fscore * added info on importance_type to boilerplate comment * updated text of error statement * added self module name to fix call * added unit test for feature importances * style fixes	2016-05-02 12:25:24 -05:00
sinhrks	6bab164d80	Bug mixing DMatrix's with and without feature names	2016-04-30 14:42:57 +09:00
sinhrks	8fc2456c87	Enable flake8	2016-04-24 17:32:31 +09:00
tqchen	ec2fb5bc48	Fix multi-class loading	2016-03-10 19:22:26 -08:00
terrytangyuan	803a6fe474	Separate dependencies and lightweight test env for Python	2016-02-28 20:11:10 -06:00
tqchen	90bc7f8f6b	[TEST] Fix travis test when reading hdfs	2016-02-27 18:15:32 -08:00
Tianqi Chen	758a77de9c	Fix testcase after update and allow hdfs load	2016-02-26 17:04:51 -08:00
ivallesp	ed5c98f0ee	re-using the verbose-eval parameter in the cv and aggcv methods and tests adapted	2016-02-19 17:14:57 +01:00
FrozenFingerz	177259a0a7	unittest for cv bugfixes added	2015-12-29 14:13:40 +01:00
sinhrks	25c4fbd0cb	Cleanup pandas support	2015-11-13 06:55:04 +09:00
Johan Manders	b0f38e9352	Changed 4 tests Changed symbol test to give error on < sign, not on = sign Changed 3 other functions, so that float is used instead of q	2015-11-03 21:32:47 +01:00
sinhrks	1f19b78287	Python: adjusts plot_importance ylim	2015-10-25 03:16:53 +09:00
Tianqi Chen	d4d36eed45	Merge pull request #528 from terrytangyuan/test More Unit Tests for Python Package	2015-10-22 08:39:32 -07:00
terrytangyuan	ec2cdafec5	Added fixed random seed for tests (+1 squashed commit) Squashed commits: [76e3664] Added fixed random seed for tests	2015-10-21 23:38:41 -05:00
sinhrks	6f046327ac	Allow plot function to handle XGBModel	2015-10-22 01:00:54 +09:00
sinhrks	dbcb4c8729	Support non-str column names	2015-10-04 13:30:01 +09:00
Tianqi Chen	2859c190cd	Merge pull request #522 from sinhrks/pandas python DMatrix now accepts pandas DataFrame	2015-10-02 10:19:14 -07:00
sinhrks	b958c55ac6	CV returns ndarray or DataFrame	2015-10-02 22:38:03 +09:00
sinhrks	b943becc61	python DMatrix now accepts pandas DataFrame	2015-10-01 22:52:32 +09:00
sinhrks	f6f3473d17	Change to properties	2015-09-28 22:36:39 +09:00
sinhrks	db692a30e5	Add feature_types	2015-09-28 22:25:35 +09:00
sinhrks	f7d434aec2	Fix numpy array check logic	2015-09-17 22:51:44 +09:00
sinhrks	bb6b7ded55	Cleanup str roundtrip using ctypes	2015-09-17 04:10:19 +09:00
sinhrks	db0c9e1c2d	BUG: incorrect model_file results in segfault	2015-09-16 22:02:30 +09:00
sinhrks	6063d243eb	Mac build fix	2015-09-15 18:39:06 +09:00
sinhrks	48ac946d9f	Use ctypes	2015-09-14 22:12:19 +09:00
sinhrks	00702dc39b	Fix for python 3	2015-08-24 05:09:27 +09:00
sinhrks	d24b36adf9	ENH: Add visualization to python package	2015-08-16 00:57:21 +09:00
tqchen	f0421e9455	last check	2015-07-03 21:27:29 -07:00

35 Commits