A new parameter `custom_metric` is added to `train` and `cv` to distinguish its behaviour from the old, now-deprecated `feval`. The new `custom_metric` receives transformed predictions when a built-in objective is used, which lets XGBoost use cost functions from other libraries like scikit-learn directly, without going through the definition of the link function.
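A hedged sketch of the new parameter (the dataset and metric wrapper are illustrative): under a built-in objective such as `binary:logistic`, `custom_metric` receives probabilities, so a scikit-learn loss plugs in directly.

```python
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.metrics import log_loss

X, y = make_classification(n_samples=100, random_state=0)
dtrain = xgb.DMatrix(X[:80], y[:80])
dvalid = xgb.DMatrix(X[80:], y[80:])

def sklearn_logloss(predt: np.ndarray, dmat: xgb.DMatrix):
    # With `binary:logistic`, `predt` is already a probability.
    return "sk_logloss", log_loss(dmat.get_label(), predt)

booster = xgb.train(
    {"objective": "binary:logistic"},
    dtrain,
    num_boost_round=10,
    evals=[(dvalid, "valid")],
    custom_metric=sklearn_logloss,
)
```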
`eval_metric` and `early_stopping_rounds` in the sklearn interface are moved from `fit` to `__init__` and are now saved as part of the scikit-learn model; the old parameters in `fit` are deprecated. The new `eval_metric` in `__init__` has the same new behaviour as `custom_metric`.
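A minimal sketch of the relocated parameters, assuming the behaviour described above; the callable `eval_metric` follows scikit-learn's `metric(y_true, y_pred)` convention.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import log_loss
from xgboost import XGBClassifier

X, y = make_classification(n_samples=100, random_state=0)
clf = XGBClassifier(
    n_estimators=50,
    eval_metric=log_loss,     # saved with the model, like `custom_metric`
    early_stopping_rounds=5,  # also a constructor argument now
)
clf.fit(X[:80], y[:80], eval_set=[(X[80:], y[80:])])
```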
Added more detailed documentation for the behaviour of custom objectives and metrics.
* Add a new API function for predicting on `DMatrix`. This function aligns
with the rest of the `XGBoosterPredictFrom*` functions in the semantics of
its arguments.
* Purge `ntree_limit` from libxgboost, use iteration instead.
* [dask] Use `inplace_predict` by default for dask sklearn models.
* [dask] Run prediction shape inference on worker instead of client.
The breaking change is in the Python sklearn `apply` function, which is now
consistent with the other prediction functions in using `best_iteration` by
default.
The old (pre-fix) `best_ntree_limit` ignored the `num_class` parameter, which is incorrect. Previously we worked around it in the C++ layer to avoid possible breaking changes in other language bindings, but the Python interpretation stayed incorrect. The PR fixed that in Python to consider `num_class`, yet didn't remove the old workaround, so the tree calculation in the predictor is incorrect; see `PredictBatch` in `CPUPredictor`.
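For illustration only, a hedged sketch of the accounting (not the library's code): each boosting iteration grows `num_parallel_tree * num_class` trees, so a limit expressed in trees must scale with `num_class`.

```python
# Hypothetical helper for illustration; not XGBoost's implementation.
def trees_per_iteration(num_parallel_tree: int, num_class: int) -> int:
    return num_parallel_tree * max(num_class, 1)

# With 3 classes and 1 parallel tree, stopping at best_iteration == 9
# (0-based) corresponds to 30 trees, not 10; ignoring num_class
# under-counts by a factor of 3.
best_iteration = 9
n_trees = (best_iteration + 1) * trees_per_iteration(1, 3)
assert n_trees == 30
```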
* Initial support for distributed LTR using dask.
* Support `qid` in libxgboost.
* Refactor `predict` and `n_features_in_`, `best_[score/iteration/ntree_limit]`
to avoid duplicated code.
* Define `DaskXGBRanker`.
The dask ranker doesn't support the group structure; instead it accepts query
IDs and converts them to group pointers internally.
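A hedged sketch of the dask ranker with per-row query IDs (the local cluster and random data are placeholders): rows sharing a `qid` belong to the same query, and partitions are assumed to be sorted by `qid`.

```python
import numpy as np
import dask.array as da
import xgboost as xgb
from distributed import Client, LocalCluster

with LocalCluster(n_workers=2) as cluster, Client(cluster):
    X = da.random.random((100, 10), chunks=(50, 10))
    y = da.random.randint(0, 5, size=100, chunks=50)
    # Sorted per-row query ids; the ranker builds group pointers from them.
    qid = da.from_array(np.sort(np.random.randint(0, 10, 100)), chunks=50)

    ranker = xgb.dask.DaskXGBRanker(n_estimators=10)
    ranker.fit(X, y, qid=qid)
```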
* Implement early stopping with training continuation.
* Add new C API for obtaining boosted rounds.
* Fix an off-by-one in `save_best`.
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
This PR is meant to end the confusion around `best_ntree_limit` and unify model slicing (see the sketch below). With multi-class models and random forests in the picture, asking users to understand how to set `ntree_limit` is difficult and error-prone.
* Implement the save_best option in early stopping.
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
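A hedged sketch of slicing in place of `ntree_limit` (the dataset is illustrative): the booster is sliced by boosting rounds, which stays well defined for multi-class and forest models.

```python
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_classes=3, n_informative=6,
                           random_state=0)
dtrain = xgb.DMatrix(X[:160], y[:160])
dvalid = xgb.DMatrix(X[160:], y[160:])
booster = xgb.train(
    {"objective": "multi:softprob", "num_class": 3},
    dtrain,
    num_boost_round=50,
    evals=[(dvalid, "valid")],
    early_stopping_rounds=5,
)
# Slice by rounds; no arithmetic over trees-per-round is needed.
best = booster[: booster.best_iteration + 1]
preds = best.predict(dvalid)
```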
* Simplify Scikit-Learn parameter management.
* Copy the base class to remove duplicated parameter signatures.
* Set all parameters to None.
* Handle None in set_param (see the sketch below).
* Extract the doc.
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
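An illustrative sketch (not the library's code) of the None-default scheme described above: constructor defaults are `None`, and only non-None values are forwarded, so XGBoost's own defaults still apply.

```python
class SketchModel:
    """Hypothetical model illustrating None-defaulted parameters."""

    def __init__(self, max_depth=None, learning_rate=None):
        self.max_depth = max_depth
        self.learning_rate = learning_rate

    def xgb_params(self):
        # Skip None so the booster falls back to its internal defaults.
        return {k: v for k, v in self.__dict__.items() if v is not None}

assert SketchModel(max_depth=3).xgb_params() == {"max_depth": 3}
```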
* Remove `learning_rates`.
It has been deprecated since callbacks were introduced; see the sketch after this list.
* Set `before_iteration` of `reset_learning_rate` to False to preserve
the initial learning rate and comply with the term "reset".
Closes #4709.
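A hedged sketch of the callback-based replacement; recent releases expose it as `xgboost.callback.LearningRateScheduler`, while the commits above refer to the older `reset_learning_rate` helper.

```python
import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, random_state=0)
dtrain = xgb.DMatrix(X, y)
# Decay the rate each round; a plain list of rates is accepted as well.
scheduler = xgb.callback.LearningRateScheduler(lambda epoch: 0.3 * 0.95 ** epoch)
xgb.train({"objective": "reg:squarederror"}, dtrain,
          num_boost_round=20, callbacks=[scheduler])
```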
* Tests for various `tree_method`.
* Fix #4630, #4421: Preserve the correct ordering between metrics, and always use the last metric for early stopping
* Clarify the semantics of early stopping in the presence of multiple validation sets and metrics (illustrated below)
* Add a test
* Fix lint
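A hedged illustration of the clarified semantics: when several metrics and validation sets are supplied, the last metric on the last set drives early stopping.

```python
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, random_state=0)
dtrain = xgb.DMatrix(X[:80], y[:80])
dvalid = xgb.DMatrix(X[80:], y[80:])
booster = xgb.train(
    {"objective": "binary:logistic", "eval_metric": ["auc", "logloss"]},
    dtrain,
    num_boost_round=100,
    evals=[(dtrain, "train"), (dvalid, "valid")],
    early_stopping_rounds=5,  # monitors valid-logloss, the last metric
)
```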
* adding support for matrix slicing with query ID for cross-validation
* hail mary test of unrar installation for Windows tests
* trying to modify tests to run in GitHub CI
* Remove dependency on wget and unrar
* Save error log from R test
* Relax assertion in test_training
* Use int instead of bool in C function interface
* Revise R interface
* Add XGDMatrixSliceDMatrixEx and keep old XGDMatrixSliceDMatrix for API compatibility
* Add scikit-learn tests
The goal is to pass scikit-learn's check_estimator() for XGBClassifier,
XGBRegressor, and XGBRanker. It is actually not possible to do so
entirely, since check_estimator() assumes that NaN is disallowed,
but XGBoost allows NaN as missing values. However, it is always a
good idea to add some checks inspired by check_estimator().
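A brief hedged sketch of such a check; a full `check_estimator()` run is expected to fail on the NaN checks noted above.

```python
from sklearn.utils.estimator_checks import check_estimator
from xgboost import XGBRegressor

try:
    check_estimator(XGBRegressor())
except Exception as err:  # NaN-related checks are known to fail
    print("expected failure:", err)
```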
* Fix lint
* Fix lint
* Add XGBRanker to Python API doc
* Show inherited members of XGBRegressor in API doc, since XGBRegressor uses default methods from XGBModel
* Add table of contents to Python API doc
* Skip JVM doc download if not available
* Show inherited members for XGBRegressor and XGBRanker
* Expose XGBRanker to Python XGBoost module directory
* Add docstring to XGBRegressor.predict() and XGBRanker.predict()
* Fix rendering errors in Python docstrings
* Fix lint
* allow arbitrary cross-validation fold indices
- use the training indices passed to the `folds` parameter in `training.cv` (see the example after this list)
- update the docstring
* add tests for arbitrary fold indices
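A hedged example of explicit fold indices passed through `folds`: each entry is a `(train_idx, test_idx)` pair, so any scikit-learn splitter's output works.

```python
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import GroupKFold

X, y = make_classification(n_samples=100, random_state=0)
groups = np.repeat(np.arange(10), 10)  # e.g. one group per subject
dtrain = xgb.DMatrix(X, y)
folds = list(GroupKFold(n_splits=5).split(X, y, groups))
res = xgb.cv({"objective": "binary:logistic"}, dtrain,
             num_boost_round=10, folds=folds)
```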
* Repaired serialization after the update process; fixes #2545
* Fix: non-stratified folds in Python could omit some data instances
* Makefile: fixes for older makes on Windows; clean R-package too
* Make cub a shallow submodule
* improve $(MAKE) recovery
* option to shuffle data in mknfolds
* removed the possibility to run as a standalone test
* split the function definition into two lines for lint
* Fix various typos
* Add `override` to functions that are overridden
gcc warns about functions that are overridden but not marked as
overridden. This fixes those warnings.
* Use bst_float consistently
Use bst_float for all the variables that involve weight,
leaf value, gradient, hessian, gain, loss_chg, predictions,
base_margin, feature values.
In some cases, where additions and similar accumulations can
produce larger values, double is used.
This ensures that type conversions are minimal and reduces loss of
precision.
* Allow using learning_rates parameter when doing CV
- Create a new `callback_cv` method that works when called from `xgb.cv()`
- Rename the existing `callback` to `callback_train` and make it the default callback
- Get the logic out of the callbacks and place it into a common helper
* Add a learning_rates parameter to cv()
* lint
* remove explicit caller reference
* callback is aware of its calling context
* remove caller argument
* remove learning_rates param
* restore learning_rates for training, but deprecated
* lint
* lint line too long
* quick example for predefined callbacks
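A quick hedged example of the predefined callbacks; recent versions expose them as classes, while older releases used function-style helpers such as `early_stop` and `print_evaluation`.

```python
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, random_state=0)
dtrain = xgb.DMatrix(X[:80], y[:80])
dvalid = xgb.DMatrix(X[80:], y[80:])
xgb.train(
    {"objective": "binary:logistic"},
    dtrain,
    num_boost_round=100,
    evals=[(dvalid, "valid")],
    callbacks=[
        xgb.callback.EarlyStopping(rounds=5, save_best=True),
        xgb.callback.EvaluationMonitor(period=10),
    ],
)
```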