xgboost

Author	SHA1	Message	Date
Jiaming Yuan	c709f2aaaf	Fix evaluation result for XGBRanker. (#6594 ) * Remove duplicated code, which fixes typo `evals_result` -> `evals_result_`.	2021-01-12 09:36:41 +08:00
Jiaming Yuan	80065d571e	[dask] Add DaskXGBRanker (#6576 ) * Initial support for distributed LTR using dask. * Support `qid` in libxgboost. * Refactor `predict` and `n_features_in_`, `best_[score/iteration/ntree_limit]` to avoid duplicated code. * Define `DaskXGBRanker`. The dask ranker doesn't support group structure, instead it uses query id and convert to group ptr internally.	2021-01-08 18:35:09 +08:00
Jiaming Yuan	7c9dcbedbc	Fix `best_ntree_limit` for dart and gblinear. (#6579 )	2021-01-08 10:05:39 +08:00
Jiaming Yuan	f5ff90cd87	Support `_estimator_type`. (#6582 ) * Use `_estimator_type`. For more info, see: https://scikit-learn.org/stable/developers/develop.html#estimator-types * Model trained from dask can be loaded by single node skl interface.	2021-01-08 10:01:16 +08:00
Jiaming Yuan	60cfd14349	[dask, sklearn] Fix predict proba. (#6566 ) * For sklearn: - Handles user defined objective function. - Handles `softmax`. * For dask: - Use the implementation from sklearn, the previous implementation doesn't perform any extra handling.	2021-01-05 08:29:06 +08:00
Jiaming Yuan	ca3da55de4	Support early stopping with training continuation, correct num boosted rounds. (#6506 ) * Implement early stopping with training continuation. * Add new C API for obtaining boosted rounds. * Fix off by 1 in `save_best`. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-12-17 19:59:19 +08:00
Jiaming Yuan	a30461cf87	[dask] Support all parameters in regressor and classifier. (#6471 ) * Add eval_metric. * Add callback. * Add feature weights. * Add custom objective.	2020-12-14 07:35:56 +08:00
Jiaming Yuan	d6386e45e8	Fix filtering callable objects in skl xgb param. (#6466 ) Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-12-05 17:20:36 +08:00
Jiaming Yuan	2ce2a1a4d8	[SKL] Propagate parameters to booster during set_param. (#6416 )	2020-11-20 20:37:35 +08:00
Philip Hyunsu Cho	9c9070aea2	Use pytest conventions consistently (#6337 ) * Do not derive from unittest.TestCase (not needed for pytest) * assertRaises -> pytest.raises * Simplify test_empty_dmatrix with test parametrization * setUpClass -> setup_class, tearDownClass -> teardown_class * Don't import unittest; import pytest * Use plain assert * Use parametrized tests in more places * Fix test_gpu_with_sklearn.py * Put back run_empty_dmatrix_reg / run_empty_dmatrix_cls * Fix test_eta_decay_gpu_hist * Add parametrized tests for monotone constraints * Fix test names * Remove test parametrization * Revise test_slice to be not flaky	2020-11-19 17:00:15 -08:00
Jiaming Yuan	fcfeb4959c	Deprecate positional arguments. (#6365 ) Deprecate positional arguments in following functions: - `__init__` for all classes in sklearn module. - `fit` method for all classes in sklearn module. - dask interface. - `set_info` for `DMatrix` class. Refactor the evaluation matrices handling.	2020-11-13 11:10:30 +08:00
Jiaming Yuan	184e2eac7d	Add period to evaluation monitor. (#6348 )	2020-11-10 07:47:48 +08:00
Philip Hyunsu Cho	c8ec62103a	Deprecate LabelEncoder in XGBClassifier; Enable cuDF/cuPy inputs in XGBClassifier (#6269 ) * Deprecate LabelEncoder in XGBClassifier; skip LabelEncoder for cuDF/cuPy inputs * Add unit tests for cuDF and cuPy inputs with XGBClassifier * Fix lint * Clarify warning * Move use_label_encoder option to XGBClassifier constructor * Add a test for cudf.Series * Add use_label_encoder to XGBRFClassifier doc * Address reviewer feedback	2020-10-26 13:20:51 -07:00
Jiaming Yuan	4d99c58a5f	Feature weights (#5962 )	2020-08-18 19:55:41 +08:00
Jiaming Yuan	f5fdcbe194	Disable feature validation on sklearn predict prob. (#5953 ) * Fix issue when scikit learn interface receives transformed inputs.	2020-07-29 19:26:44 +08:00
Philip Hyunsu Cho	ac9136ee49	Further improvements and savings in Jenkins pipeline (#5904 ) * Publish artifacts only on the master and release branches * Build CUDA only for Compute Capability 7.5 when building PRs * Run all Windows jobs in a single worker image * Build nightly XGBoost4J SNAPSHOT JARs with Scala 2.12 only * Show skipped Python tests on Windows * Make Graphviz optional for Python tests * Add back C++ tests * Unstash xgboost_cpp_tests * Fix label to CUDA 10.1 * Install cuPy for CUDA 10.1 * Install jsonschema * Address reviewer's feedback	2020-07-18 03:30:40 -07:00
Alex	ae18a094b0	Add new skl model attribute for number of features (#5780 )	2020-06-15 18:01:59 +08:00
Jiaming Yuan	93df871c8c	Assert matching length of evaluation inputs. (#5540 )	2020-04-18 06:52:55 +08:00
Jiaming Yuan	c69a19e2b1	Fix skl nan tag. (#5538 )	2020-04-18 06:52:17 +08:00
Jiaming Yuan	dc2950fd90	Fix checking booster. (#5505 ) * Use `get_params()` instead of `getattr` intrinsic.	2020-04-10 12:21:21 +08:00
Jiaming Yuan	c218d8ffbf	Enable parameter validation for skl. (#5477 )	2020-04-03 10:23:58 +08:00
Philip Hyunsu Cho	cfae247231	Fix a small typo in sklearn.py that broke multiple eval metrics (#5341 )	2020-02-22 19:02:37 +08:00
Jiaming Yuan	a5cc112eea	Export JSON config in `get_params`. (#5256 )	2020-02-03 12:46:51 +08:00
Jiaming Yuan	472ded549d	Save Scikit-Learn attributes into learner attributes. (#5245 ) * Remove the recommendation for pickle. * Save skl attributes in booster.attr * Test loading scikit-learn model with native booster.	2020-01-30 16:00:18 +08:00
Jiaming Yuan	40680368cf	Add constraint parameters to Scikit-Learn interface. (#5227 ) * Add document for constraints. * Fix a format error in doc for objective function.	2020-01-25 11:12:02 +08:00
OrdoAbChao	b4f952bd22	[Breaking] Remove Scikit-Learn default parameters (#5130 ) * Simplify Scikit-Learn parameter management. * Copy base class for removing duplicated parameter signatures. * Set all parameters to None. * Handle None in set_param. * Extract the doc. Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-01-23 20:25:20 +08:00
Jiaming Yuan	1891cc766d	Fix metainfo from DataFrame. (#5216 ) * Fix metainfo from DataFrame. * Unify helper functions for data and meta.	2020-01-22 16:29:44 +08:00
Jiaming Yuan	7b65698187	Enforce correct data shape. (#5191 ) * Fix syncing DMatrix columns. * notes for tree method. * Enable feature validation for all interfaces except for jvm. * Better tests for boosting from predictions. * Disable validation on JVM.	2020-01-13 15:48:17 +08:00
Jiaming Yuan	0202e04a8e	Add base margin to sklearn interface. (#5151 )	2019-12-24 09:43:41 +08:00
Jiaming Yuan	a4f5c86276	Allow using RandomState object from Numpy in sklearn interface. (#5049 )	2019-11-19 10:56:39 +08:00
Jiaming Yuan	4bbf062ed3	[Breaking] Update sklearn interface. (#4929 ) * Remove nthread, seed, silent. Add tree_method, gpu_id, num_parallel_tree. Fix #4909. * Check data shape. Fix #4896. * Check element of eval_set is tuple. Fix #4875 * Add doc for random_state with hogwild. Fixes #4919	2019-10-12 02:50:09 -04:00
Jiaming Yuan	5374f52531	Complete cudf support. (#4850 ) * Handles missing value. * Accept all floating point and integer types. * Move to cudf 9.0 API. * Remove requirement on `null_count`. * Arbitrary column types support.	2019-09-16 23:52:00 -04:00
Rong Ou	851b5b3808	Remove gpu_exact tree method (#4742 )	2019-08-07 11:43:20 +12:00
Oleksandr Pryimak	986fee6022	pytest tests/python fails if no pandas installed (#4620 ) * _maybe_pandas_xxx should return their arguments unchanged if no pandas installed * Tests should not assume pandas is installed * Mark tests which require pandas as such	2019-07-01 02:54:08 +08:00
Jiaming Yuan	8bdf15120a	Implement tree model dump with code generator. (#4602 ) * Implement tree model dump with a code generator. * Split up generators. * Implement graphviz generator. * Use pattern matching. * [Breaking] Return a Source in `to_graphviz` instead of Digraph in Python package. Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2019-06-26 15:20:44 +08:00
Jiaming Yuan	29a1356669	Deprecate `reg:linear' in favor of `reg:squarederror'. (#4267 ) * Deprecate `reg:linear' in favor of `reg:squarederror'. * Replace the use of `reg:linear'. * Replace the use of `silent`.	2019-03-17 17:55:04 +08:00
Andy Adinets	4352fcdb15	Brought the silent parameter for the SKLearn-like API back, marked it deprecated. (#4255 ) * Brought the silent parameter for the SKLearn-like API back, marked it deprecated. - added deprecation notice and warning - removed silent from the tests for the SKLearn-like API	2019-03-14 09:45:08 +13:00
Andy Adinets	a36c3ed4f4	Added SKLearn-like random forest Python API. (#4148 ) * Added SKLearn-like random forest Python API. - added XGBRFClassifier and XGBRFRegressor classes to SKL-like xgboost API - also added n_gpus and gpu_id parameters to SKL classes - added documentation describing how to use xgboost for random forests, as well as existing caveats	2019-03-12 22:28:19 +08:00
tmitanitky	59f868bc60	enable xgb_model in scklearn XGBClassifier and test. (#4092 ) * Enable xgb_model parameter in XGClassifier scikit-learn API https://github.com/dmlc/xgboost/issues/3049 * add test_XGBClassifier_resume(): test for xgb_model parameter in XGBClassifier API. * Update test_with_sklearn.py * Fix lint	2019-01-31 11:29:19 -08:00
Jiaming Yuan	2ea0f887c1	Refactor Python tests. (#3897 ) * Deprecate nose tests. * Format python tests.	2018-11-15 13:56:33 +13:00
Dr. Kashif Rasul	143475b27b	use gain for sklearn feature_importances_ (#3876 ) * use gain for sklearn feature_importances_ `gain` is a better feature importance criteria than the currently used `weight` * added importance_type to class * fixed test * white space * fix variable name * fix deprecation warning * fix exp array * white spaces	2018-11-13 03:30:40 -08:00
Philip Hyunsu Cho	ad6e0d55f1	Fix coef_ and intercept_ signature to be compatible with sklearn.RFECV (#3873 ) * Fix coef_ and intercept_ signature to be compatible with sklearn.RFECV * Fix lint * Fix lint	2018-11-08 19:41:35 -08:00
Rory Mitchell	5d6baed998	Allow sklearn grid search over parameters specified as kwargs (#3791 )	2018-10-14 12:44:53 +13:00
Philip Hyunsu Cho	51478a39c9	Fix #3730 : scikit-learn 0.20 compatibility fix (#3731 ) * Fix #3730: scikit-learn 0.20 compatibility fix sklearn.cross_validation has been removed from scikit-learn 0.20, so replace it with sklearn.model_selection * Display test names for Python tests for clarity	2018-09-27 15:03:05 -07:00
Philip Hyunsu Cho	86d88c0758	Fix #3648 : XGBClassifier.predict() should return margin scores when output_margin=True (#3651 ) * Fix #3648: XGBClassifier.predict() should return margin scores when output_margin=True * Fix tests to reflect correct implementation of XGBClassifier.predict(output_margin=True) * Fix flaky test test_with_sklearn.test_sklearn_api_gblinear	2018-08-30 21:05:05 -07:00
Shiki-H	24a268a2e3	sklearn api for ranking (#3560 ) * added xgbranker * fixed predict method and ranking test * reformatted code in accordance with pep8 * fixed lint error * fixed docstring and added checks on objective * added ranking demo for python * fixed suffix in rank.py	2018-08-21 08:26:48 -07:00
Mike Liu	594bcea83e	Save and load model in sklearn API (#3192 ) * Add (load|save)_model to XGBModel * Add docstring * Fix docstring * Fix mixed use of space and tab * Add a test * Fix Flake8 style errors	2018-06-30 19:21:49 +00:00
pdavalo	480e3fd764	Sklearn: validation set weights (#2354 ) * Add option to use weights when evaluating metrics in validation sets * Add test for validation-set weights functionality * simplify case with no weights for test sets * fix lint issues	2018-05-23 17:06:20 -07:00
Rory Mitchell	9fa45d3a9c	Fix bug with gpu_predictor caching behaviour (#3177 ) * Fixes #3162	2018-03-18 10:35:10 +13:00
Tsukasa OMOTO	8d15024ac7	python: follow the default warning filters of Python (#2666 ) * python: follow the default warning filters of Python https://docs.python.org/3/library/warnings.html#default-warning-filters * update tests * update tests	2017-09-27 03:03:01 -04:00

121 Commits