xgboost

Author	SHA1	Message	Date
Jiaming Yuan	9b88495840	[multi] Implement weight feature importance. (#10700 )	2024-08-22 02:06:47 +08:00
Jiaming Yuan	3d8107adb8	Support doc link for the sklearn module. (#10287 )	2024-08-06 02:35:32 +08:00
Jiaming Yuan	a5a58102e5	Revamp the rabit implementation. (#10112 ) This PR replaces the original RABIT implementation with a new one, which has already been partially merged into XGBoost. The new one features: - Federated learning for both CPU and GPU. - NCCL. - More data types. - A unified interface for all the underlying implementations. - Improved timeout handling for both tracker and workers. - Exhausted tests with metrics (fixed a couple of bugs along the way). - A reusable tracker for Python and JVM packages.	2024-05-20 11:56:23 +08:00
Jiaming Yuan	73afef1a6e	Fixes for numpy 2.0. (#10252 )	2024-05-07 03:54:32 +08:00
Jiaming Yuan	837d44a345	Support more sklearn tags for testing. (#10230 )	2024-04-29 06:33:23 +08:00
Jiaming Yuan	8ea705e4d5	Support sample weight in sklearn custom objective. (#10050 )	2024-02-21 00:43:14 +08:00
Jiaming Yuan	69a17d5114	Fix with None input. (#10052 )	2024-02-20 22:34:22 +08:00
Jiaming Yuan	65d7bf2dfe	Handle np integer in model slice and prediction. (#10007 )	2024-01-26 04:58:48 +08:00
Jiaming Yuan	0798e36d73	[breaking] Remove deprecated parameters in the skl interface. (#9986 )	2024-01-15 20:40:05 +08:00
Jiaming Yuan	38dd91f491	Save model in ubj as the default. (#9947 )	2024-01-05 17:53:36 +08:00
Jiaming Yuan	5f7b5a6921	Add tests for pickling with custom obj and metric. (#9943 )	2024-01-04 14:52:48 +08:00
Jiaming Yuan	e9f149481e	[sklearn] Fix loading model attributes. (#9808 )	2023-11-27 17:19:01 +08:00
Jiaming Yuan	c3a0622b49	Fix using categorical data with the score function of ranker. (#9753 )	2023-11-07 07:29:11 +08:00
david-cortes	be20df8c23	[Python] Accept numpy generators as `random_state` (#9743 ) * accept numpy generators for random_state * make linter happy * fix tests	2023-11-01 16:20:44 -07:00
Jiaming Yuan	7f29a238e6	Return base score as intercept. (#9486 )	2023-08-19 12:28:02 +08:00
Jiaming Yuan	801116c307	Test scikit-learn model IO with gblinear. (#9459 )	2023-08-13 23:41:49 +08:00
Jiaming Yuan	16eb41936d	Handle the new `device` parameter in dask and demos. (#9386 ) * Handle the new `device` parameter in dask and demos. - Check no ordinal is specified in the dask interface. - Update demos. - Update dask doc. - Update the condition for QDM.	2023-07-15 19:11:20 +08:00
Jiaming Yuan	9da5050643	Turn warning messages into Python warnings. (#9387 )	2023-07-15 07:46:43 +08:00
Jiaming Yuan	e964654b8f	[skl] Enable cat feature without specifying tree method. (#9353 )	2023-07-03 22:06:17 +08:00
Jiaming Yuan	39390cc2ee	[breaking] Remove the `predictor` param, allow fallback to prediction using `DMatrix`. (#9129 ) - A `DeviceOrd` struct is implemented to indicate the device. It will eventually replace the `gpu_id` parameter. - The `predictor` parameter is removed. - Fallback to `DMatrix` when `inplace_predict` is not available. - The heuristic for choosing a predictor is only used during training.	2023-07-03 19:23:54 +08:00
Jiaming Yuan	f4798718c7	Use hist as the default tree method. (#9320 )	2023-06-27 23:04:24 +08:00
Jiaming Yuan	1fcc26a6f8	Set `ndcg` to default for LTR. (#8822 ) - Add document. - Add tests. - Use `ndcg` with `topk` as default.	2023-06-09 23:31:33 +08:00
Jiaming Yuan	e206b899ef	Rework MAP and Pairwise for LTR. (#9075 )	2023-04-28 02:39:12 +08:00
Jiaming Yuan	bac22734fb	Remove ntree limit in python package. (#8345 ) - Remove `ntree_limit`. The parameter has been deprecated since 1.4.0. - The SHAP package compatibility is broken.	2023-03-31 19:01:55 +08:00
Jiaming Yuan	c2b3a13e70	[breaking][skl] Remove parameter serialization. (#8963 ) - Remove parameter serialization in the scikit-learn interface. The scikit-lear interface `save_model` will save only the model and discard all hyper-parameters. This is to align with the native XGBoost interface, which distinguishes the hyper-parameter and model parameters. With the scikit-learn interface, model parameters are attributes of the estimator. For instance, `n_features_in_`, `n_classes_` are always accessible with `estimator.n_features_in_` and `estimator.n_classes_`, but not with the `estimator.get_params`. - Define a `load_model` method for classifier to load its own attributes. - Set n_estimators to None by default.	2023-03-27 21:34:10 +08:00
Jiaming Yuan	5891f752c8	Rework the MAP metric. (#8931 ) - The new implementation is more strict as only binary labels are accepted. The previous implementation converts values greater than 1 to 1. - Deterministic GPU. (no atomic add). - Fix top-k handling. - Precise definition of MAP. (There are other variants on how to handle top-k). - Refactor GPU ranking tests.	2023-03-22 17:45:20 +08:00
Jiaming Yuan	7eba285a1e	Support sklearn cross validation for ranker. (#8859 ) * Support sklearn cross validation for ranker. - Add a convention for X to include a special `qid` column. sklearn utilities consider only `X`, `y` and `sample_weight` for supervised learning algorithms, but we need an additional qid array for ranking. It's important to be able to support the cross validation function in sklearn since all other tuning functions like grid search are based on cross validation.	2023-03-07 00:22:08 +08:00
Jiaming Yuan	228a46e8ad	Support learning rate for zero-hessian objectives. (#8866 )	2023-03-06 20:33:28 +08:00
Jiaming Yuan	6a892ce281	Specify src path for isort. (#8867 )	2023-03-06 17:30:27 +08:00
Jiaming Yuan	225b3158f6	Support custom metric in sklearn ranker. (#8786 )	2023-02-12 13:14:07 +08:00
BenEfrati	213b5602d9	Add sample_weight to eval_metric (#8706 )	2023-02-05 00:06:38 +08:00
Jiaming Yuan	badeff1d74	Init estimation for regression. (#8272 )	2023-01-11 02:04:56 +08:00
Jiaming Yuan	1b58d81315	[doc] Document Python inputs. (#8643 )	2023-01-10 15:39:32 +08:00
Jiaming Yuan	e68a152d9e	Do not return internal value for `get_params`. (#8634 )	2023-01-05 17:48:26 +08:00
Rong Ou	42e6fbb0db	Fix sklearn test that calls a removed field (#8579 )	2022-12-09 13:06:44 -08:00
Jiaming Yuan	cfd2a9f872	Extract dask and spark test into distributed test. (#8395 ) - Move test files. - Run spark and dask separately to prevent conflicts. - Gather common code into the testing module.	2022-10-28 16:24:32 +08:00
Jiaming Yuan	cf70864fa3	Move Python testing utilities into xgboost module. (#8379 ) - Add typehints. - Fixes for pylint. Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>	2022-10-26 16:56:11 +08:00
Jiaming Yuan	c884b9e888	Validate features for inplace predict. (#8359 )	2022-10-19 23:05:36 +08:00
Jiaming Yuan	2176e511fc	Disable pytest-timeout for now. (#8348 )	2022-10-17 23:06:10 +08:00
Rory Mitchell	ce0382dcb0	[CI] Refactor tests to reduce CI time. (#8312 )	2022-10-12 11:32:06 +02:00
Jiaming Yuan	c70fa502a5	Expose `feature_types` to sklearn interface. (#7821 )	2022-04-21 20:23:35 +08:00
Jiaming Yuan	52d4eda786	Deprecate `use_label_encoder` in XGBClassifier. (#7822 ) * Deprecate `use_label_encoder` in XGBClassifier. * We have removed the encoder, now prepare to remove the indicator.	2022-04-21 13:14:02 +08:00
Jiaming Yuan	3c9b04460a	Move `num_parallel_tree` to model parameter. (#7751 ) The size of forest should be a property of model itself instead of a training hyper-parameter.	2022-03-29 02:32:42 +08:00
Philip Hyunsu Cho	c621775f34	Replace all uses of deprecated function sklearn.datasets.load_boston (#7373 ) * Replace all uses of deprecated function sklearn.datasets.load_boston * More renaming * Fix bad name * Update assertion * Fix n boosted rounds. * Avoid over regularization. * Rebase. * Avoid over regularization. * Whac-a-mole Co-authored-by: fis <jm.yuan@outlook.com>	2022-01-30 04:27:57 -08:00
Philip Hyunsu Cho	b4340abf56	Add special handling for multi:softmax in sklearn predict (#7607 ) * Add special handling for multi:softmax in sklearn predict * Add test coverage	2022-01-29 15:54:49 -08:00
Jiaming Yuan	001503186c	Rewrite approx (#7214 ) This PR rewrites the approx tree method to use codebase from hist for better performance and code sharing. The rewrite has many benefits: - Support for both `max_leaves` and `max_depth`. - Support for `grow_policy`. - Support for mono constraint. - Support for feature weights. - Support for easier bin configuration (`max_bin`). - Support for categorical data. - Faster performance for most of the datasets. (many times faster) - Support for prediction cache. - Significantly better performance for external memory. - Unites the code base between approx and hist.	2022-01-10 21:15:05 +08:00
Jiaming Yuan	eb1efb54b5	Define `feature_names_in_`. (#7526 ) * Define `feature_names_in_`. * Raise attribute error if it's not defined.	2022-01-05 01:35:34 +08:00
Jiaming Yuan	8f0a42a266	Initial support for multi-label classification. (#7521 ) * Add support in sklearn classifier.	2022-01-04 23:58:21 +08:00
Jiaming Yuan	58a6723eb1	Initial support for multioutput regression. (#7514 ) * Add num target model parameter, which is configured from input labels. * Change elementwise metric and indexing for weights. * Add demo. * Add tests.	2021-12-18 09:28:38 +08:00
Jiaming Yuan	8cc75f1576	Cleanup Python tests. (#7426 )	2021-11-14 15:47:05 +08:00

1 2 3

134 Commits