xgboost

Author	SHA1	Message	Date
Jiaming Yuan	c2b3a13e70	[breaking][skl] Remove parameter serialization. (#8963 ) - Remove parameter serialization in the scikit-learn interface. The scikit-learn interface's `save_model` will save only the model and discard all hyper-parameters. This is to align with the native XGBoost interface, which distinguishes hyper-parameters from model parameters. With the scikit-learn interface, model parameters are attributes of the estimator. For instance, `n_features_in_` and `n_classes_` are always accessible with `estimator.n_features_in_` and `estimator.n_classes_`, but not with `estimator.get_params`. - Define a `load_model` method for the classifier to load its own attributes. - Set n_estimators to None by default.	2023-03-27 21:34:10 +08:00
Jiaming Yuan	21a52c7f98	[doc] Add introduction and notes for the sklearn interface. (#8948 )	2023-03-23 13:30:42 +08:00
Jiaming Yuan	bf88dadb61	[doc] Fix callback example. (#8944 )	2023-03-23 03:27:04 +08:00
Jiaming Yuan	151882dd26	Initial support for multi-target tree. (#8616 ) * Implement multi-target for hist. - Add new hist tree builder. - Move data fetchers for tests. - Dispatch function calls in gbm based on the tree type.	2023-03-22 23:49:56 +08:00
Jiaming Yuan	5891f752c8	Rework the MAP metric. (#8931 ) - The new implementation is more strict as only binary labels are accepted. The previous implementation converts values greater than 1 to 1. - Deterministic GPU. (no atomic add). - Fix top-k handling. - Precise definition of MAP. (There are other variants on how to handle top-k). - Refactor GPU ranking tests.	2023-03-22 17:45:20 +08:00
Jiaming Yuan	f186c87cf9	Check inf in data for all types of DMatrix. (#8911 )	2023-03-15 11:24:35 +08:00
Jiaming Yuan	7eba285a1e	Support sklearn cross validation for ranker. (#8859 ) * Support sklearn cross validation for ranker. - Add a convention for X to include a special `qid` column. sklearn utilities consider only `X`, `y` and `sample_weight` for supervised learning algorithms, but we need an additional qid array for ranking. It's important to be able to support the cross validation function in sklearn since all other tuning functions like grid search are based on cross validation.	2023-03-07 00:22:08 +08:00
Jiaming Yuan	6a892ce281	Specify src path for isort. (#8867 )	2023-03-06 17:30:27 +08:00
mzzhang95	6cef9a08e9	[pyspark] Update eval_metric validation to support list of strings (#8826 )	2023-03-02 08:24:12 +08:00
Jiaming Yuan	cce4af4acf	Initial support for quantile loss. (#8750 ) - Add support for Python. - Add objective.	2023-02-16 02:30:18 +08:00
WeichenXu	f27a7258c6	Fix feature types param (#8772 ) Signed-off-by: Weichen Xu <weichen.xu@databricks.com>	2023-02-14 02:16:42 +08:00
Jiaming Yuan	457f704e3d	Add quantile metric. (#8761 )	2023-02-13 19:07:40 +08:00
Jiaming Yuan	225b3158f6	Support custom metric in sklearn ranker. (#8786 )	2023-02-12 13:14:07 +08:00
Jiaming Yuan	8a16944664	Fix ranking with quantile dmatrix and group weight. (#8762 )	2023-02-10 20:32:35 +08:00
Jiaming Yuan	c4802bfcd0	Cleanup booster param types. (#8756 )	2023-02-07 15:52:19 +08:00
Jiaming Yuan	0f37a01dd9	Require black formatter for the python package. (#8748 )	2023-02-07 01:53:33 +08:00
Jiaming Yuan	a2e433a089	Fix empty DMatrix with categorical features. (#8739 )	2023-02-07 00:40:11 +08:00
Jiaming Yuan	c1786849e3	Use array interface for CSC matrix. (#8672 ) * Use array interface for CSC matrix. Use array interface for CSC matrix and align the interface with CSR and dense. - Fix nthread issue in the R package DMatrix. - Unify the behavior of handling `missing` with other inputs. - Unify the behavior of handling `missing` around R, Python, Java, and Scala DMatrix. - Expose `num_non_missing` to the JVM interface. - Deprecate old CSR and CSC constructors.	2023-02-05 01:59:46 +08:00
BenEfrati	213b5602d9	Add sample_weight to eval_metric (#8706 )	2023-02-05 00:06:38 +08:00
Jiaming Yuan	0e61ba57d6	Fix GPU L1 error. (#8749 )	2023-02-04 03:02:00 +08:00
Jiaming Yuan	1325ba9251	Support primitive types of pyarrow-backed pandas dataframe. (#8653 ) Categorical data (dictionary) is not supported at the moment.	2023-01-30 17:53:29 +08:00
Jiaming Yuan	9fb12b20a4	Cleanup the callback module. (#8702 ) - Cleanup pylint markers. - run formatter. - Update examples of using callback.	2023-01-22 00:13:49 +08:00
James Lamb	6933240837	[python-package] remove unused functions in xgboost.data (#8695 )	2023-01-19 08:02:54 +08:00
Jiaming Yuan	31b9cbab3d	Make sure input numpy array is aligned. (#8690 ) - use `np.require` to specify that the alignment is required. - scipy csr as well. - validate input pointer in `ArrayInterface`.	2023-01-18 08:12:13 +08:00
Jiaming Yuan	175986b739	[doc] Add missing document for pyspark ranker. [skip ci] (#8692 )	2023-01-18 07:52:18 +08:00
Jiaming Yuan	247946a875	Cache transformed in QuantileDMatrix for efficiency. (#8666 )	2023-01-17 06:02:40 +08:00
Jiaming Yuan	d6018eb4b9	Remove all use of `DeviceQuantileDMatrix`. (#8665 )	2023-01-17 00:04:10 +08:00
Bobby Wang	72ec0c5484	[pyspark] support pred_contribs (#8633 )	2023-01-11 16:51:12 +08:00
Jiaming Yuan	cfa994d57f	Multi-target support for L1 error. (#8652 ) - Add matrix support to the median function. - Iterate through each target for quantile computation.	2023-01-11 05:51:14 +08:00
Jiaming Yuan	badeff1d74	Init estimation for regression. (#8272 )	2023-01-11 02:04:56 +08:00
Jiaming Yuan	1b58d81315	[doc] Document Python inputs. (#8643 )	2023-01-10 15:39:32 +08:00
Jiaming Yuan	e68a152d9e	Do not return internal value for `get_params`. (#8634 )	2023-01-05 17:48:26 +08:00
Bobby Wang	d3ad0524e7	[pyspark] Re-work _fit function (#8630 )	2023-01-04 18:21:57 +08:00
Rong Ou	3ceeb8c61c	Add data split mode to DMatrix MetaInfo (#8568 )	2022-12-25 20:37:37 +08:00
Rong Ou	77b069c25d	Support bitwise allreduce operations in the communicator (#8623 )	2022-12-25 06:40:05 +08:00
Jiaming Yuan	c430ae52f3	Fix mypy errors with the latest numpy. (#8617 )	2022-12-21 01:42:05 -08:00
Jiaming Yuan	f6effa1734	Support `Series` and Python primitives in `inplace_predict` and QDM (#8547 )	2022-12-17 00:15:15 +08:00
Jiaming Yuan	001e663d42	Set `enable_categorical` to True in predict. (#8592 )	2022-12-15 05:27:06 +08:00
James Lamb	06ea6c7e79	[python] remove unnecessary conversions between data structures (#8546 ) Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2022-12-14 18:32:02 +08:00
Jiaming Yuan	40343c8ee1	Test dask demos. (#8557 ) Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2022-12-13 18:37:31 +08:00
Jiaming Yuan	deb3edf562	Support list and tuple for QDM. (#8542 )	2022-12-10 01:14:44 +08:00
Bobby Wang	40a1a2ffa8	[pyspark] check use_qdm across all the workers (#8496 )	2022-12-08 18:09:17 +08:00
Gianfrancesco Angelini	5540019373	feat(py, plot_importance): + values_format as arg (#8540 )	2022-12-08 00:47:28 +08:00
Matthew Rocklin	b7ffdcdbb9	Properly await async method client.wait_for_workers (#8558 ) * Properly await async method client.wait_for_workers * ignore mypy error. Co-authored-by: jiamingy <jm.yuan@outlook.com>	2022-12-07 21:49:30 +08:00
Bobby Wang	8e41ad24f5	[pyspark] sort qid for SparkRanker (#8497 ) * [pyspark] sort qid for SparkRanker * resolve comments	2022-12-01 16:40:35 -08:00
Jiaming Yuan	d666ba775e	Support all pandas nullable integer types. (#8480 ) - Enumerate all pandas integer types. - Tests for `None`, `nan`, and `pd.NA`	2022-11-28 22:38:16 +08:00
Jiaming Yuan	f2209c1fe4	Don't shuffle columns in categorical tests. (#8446 )	2022-11-28 20:28:06 +08:00
WeichenXu	67ea1c3435	[pyspark] Make QDM optional based on cuDF check (#8471 )	2022-11-27 14:58:54 +08:00
Jiaming Yuan	8f97c92541	Support half type for pandas. (#8481 )	2022-11-24 12:47:40 +08:00
Jiaming Yuan	e07245f110	Take datatable as row major input. (#8472 ) * Take datatable as row major input. Try to avoid a transform with dense table.	2022-11-24 09:20:13 +08:00

955 Commits
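
Note on 31b9cbab3d (#8690): the commit message says `np.require` is used to specify that alignment is required for input numpy arrays. The snippet below is a minimal sketch of that NumPy call only, not XGBoost's actual code; the variable names `raw`, `misaligned`, and `aligned` are illustrative assumptions, and which requirement flags XGBoost passes is not stated in the log.

```python
import numpy as np

# Build a deliberately misaligned float64 view: skipping one byte of a
# uint8 buffer shifts the data pointer off its natural boundary.
raw = np.zeros(4 * 8 + 1, dtype=np.uint8)
misaligned = raw[1:].view(np.float64)
misaligned[:] = [1.0, 2.0, 3.0, 4.0]

# "A" requests an aligned array and "C" a C-contiguous one. np.require
# copies only when the input does not already meet the requirements.
aligned = np.require(misaligned, requirements=["A", "C"])

print(misaligned.flags.aligned, aligned.flags.aligned)
```

Because `np.require` returns the input unchanged when it already satisfies the flags, a check like this costs nothing for well-formed arrays and a single copy for the rest.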