xgboost

Author	SHA1	Message	Date
Jiaming Yuan	d12cc1090a	Refactor tests for training continuation. (#9997 )	2024-01-24 16:07:19 +08:00
Jiaming Yuan	0798e36d73	[breaking] Remove deprecated parameters in the skl interface. (#9986 )	2024-01-15 20:40:05 +08:00
Jiaming Yuan	2f57bbde3c	Additional tests for attributes and model boosted rounds. (#9962 )	2024-01-09 09:54:39 +08:00
Jiaming Yuan	b3eb5d0945	Use UBJ in Python checkpoint. (#9958 )	2024-01-09 03:22:15 +08:00
Jiaming Yuan	9a30bdd313	Test loading models with invalid file extensions. (#9955 )	2024-01-08 19:26:24 +08:00
Jiaming Yuan	38dd91f491	Save model in ubj as the default. (#9947 )	2024-01-05 17:53:36 +08:00
Jiaming Yuan	c03a4d5088	Check support status for categorical features. (#9946 )	2024-01-04 16:51:33 +08:00
Jiaming Yuan	621348abb3	Fix multi-output with alternating strategies. (#9933 ) --------- Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2024-01-04 16:41:13 +08:00
Jiaming Yuan	5f7b5a6921	Add tests for pickling with custom obj and metric. (#9943 )	2024-01-04 14:52:48 +08:00
Jiaming Yuan	9f73127a23	Cleanup Python GPU tests. (#9934 ) * Cleanup Python GPU tests. - Remove the use of `gpu_hist` and `gpu_id` in cudf/cupy tests. - Move base margin test into the testing directory.	2024-01-04 13:15:18 +08:00
Jiaming Yuan	a7226c0222	Fix feature names with special characters. (#9923 )	2023-12-28 22:45:13 +08:00
Jiaming Yuan	1aa8c8d9be	Support more scipy types. (#9881 )	2023-12-14 18:28:37 +08:00
Jiaming Yuan	faf0f2df10	Support dataframe data format in native XGBoost. (#9828 ) - Implement a columnar adapter. - Refactor Python pandas handling code to avoid converting into a single numpy array. - Add support in R for transforming columns. - Support R data.frame and factor type.	2023-12-12 09:56:31 +08:00
Jiaming Yuan	e9f149481e	[sklearn] Fix loading model attributes. (#9808 )	2023-11-27 17:19:01 +08:00
Jiaming Yuan	c3a0622b49	Fix using categorical data with the score function of ranker. (#9753 )	2023-11-07 07:29:11 +08:00
david-cortes	be20df8c23	[Python] Accept numpy generators as `random_state` (#9743 ) * accept numpy generators for random_state * make linter happy * fix tests	2023-11-01 16:20:44 -07:00
Jiaming Yuan	3ca06ac51e	[doc] Mention data consistency for categorical features. (#9678 )	2023-10-24 10:11:33 +08:00
Rong Ou	6fbe6248f4	More in-memory input support for column split (#9685 )	2023-10-20 16:02:36 +08:00
Rong Ou	da6803b75b	Support column-wise data split with in-memory inputs (#9628 ) --------- Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2023-10-17 12:16:39 +08:00
Jiaming Yuan	60526100e3	Support arrow through pandas ext types. (#9612 ) - Use pandas extension type for pyarrow support. - Additional support for QDM. - Additional support for inplace_predict.	2023-09-28 17:00:16 +08:00
Jiaming Yuan	c75a3bc0a9	[breaking] [jvm-packages] Remove rabit check point. (#9599 ) - Add `numBoostedRound` to jvm packages - Remove rabit checkpoint version. - Change the starting version of training continuation in JVM [breaking]. - Redefine the checkpoint version policy in jvm package. [breaking] - Rename the Python check point callback parameter. [breaking] - Unifies the checkpoint policy between Python and JVM.	2023-09-26 18:06:34 +08:00
Jiaming Yuan	9027686cac	Support pandas 2.1.0. (#9557 )	2023-09-11 17:44:51 +08:00
Jiaming Yuan	ccfc90e4c6	[rabit] Improved connection handling. (#9531 ) - Enable timeout. - Report connection error from the system. - Handle retry for both tracker connection and peer connection.	2023-08-30 13:00:04 +08:00
Jiaming Yuan	209335b18c	Remove the deprecated Python rabit module. (#9523 )	2023-08-27 03:37:05 +08:00
Jiaming Yuan	7f29a238e6	Return base score as intercept. (#9486 )	2023-08-19 12:28:02 +08:00
Jiaming Yuan	19b59938b7	Convert input to str for hypothesis note. (#9480 )	2023-08-15 02:27:58 +08:00
Jiaming Yuan	05d7000096	Handle special characters in JSON model dump. (#9474 )	2023-08-14 15:49:00 +08:00
Jiaming Yuan	801116c307	Test scikit-learn model IO with gblinear. (#9459 )	2023-08-13 23:41:49 +08:00
Jiaming Yuan	f05a23b41c	Use `weakref` instead of `id` for `DataIter` cache. (#9445 ) - Fix case where Python reuses id from freed objects. - Small optimization to column matrix with QDM by using `realloc` instead of copying data.	2023-08-10 00:40:06 +08:00
Philip Hyunsu Cho	7ce090e775	Handle UTF-8 paths correctly on Windows platform (#9443 ) * Fix round-trip serialization with UTF-8 paths * Add compiler version check * Add comment to C API functions * Add Python tests * [CI] Update MacOS deployment target * Use std::filesystem instead of dmlc::TemporaryDirectory	2023-08-07 23:27:25 -07:00
Jiaming Yuan	54029a59af	Bound the size of the histogram cache. (#9440 ) - A new histogram collection with a limit in size. - Unify histogram building logic between hist, multi-hist, and approx.	2023-08-08 03:21:26 +08:00
Jiaming Yuan	912e341d57	Initial GPU support for the approx tree method. (#9414 )	2023-07-31 15:50:28 +08:00
Jiaming Yuan	851cba931e	Define `best_iteration` only if early stopping is used. (#9403 ) * Define `best_iteration` only if early stopping is used. This is the behavior specified by the document but not honored in the actual code. - Don't set the attributes if there's no early stopping. - Clean up the code for callbacks, and replace assertions with proper exceptions. - Assign the attributes when early stopping `save_best` is used. - Turn the attributes into Python properties. --------- Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2023-07-24 12:43:35 +08:00
Jiaming Yuan	01e00efc53	[breaking] Remove support for single string feature info. (#9401 ) - Input must be a sequence of strings. - Improve validation error message.	2023-07-24 11:06:30 +08:00
Jiaming Yuan	16eb41936d	Handle the new `device` parameter in dask and demos. (#9386 ) * Handle the new `device` parameter in dask and demos. - Check no ordinal is specified in the dask interface. - Update demos. - Update dask doc. - Update the condition for QDM.	2023-07-15 19:11:20 +08:00
Jiaming Yuan	9da5050643	Turn warning messages into Python warnings. (#9387 )	2023-07-15 07:46:43 +08:00
Jiaming Yuan	04aff3af8e	Define the new `device` parameter. (#9362 )	2023-07-13 19:30:25 +08:00
Jiaming Yuan	97ed944209	Unify the hist tree method for different devices. (#9363 )	2023-07-11 10:04:39 +08:00
Jiaming Yuan	20c52f07d2	Support exporting cut values (#9356 )	2023-07-08 15:32:41 +08:00
Jiaming Yuan	e964654b8f	[skl] Enable cat feature without specifying tree method. (#9353 )	2023-07-03 22:06:17 +08:00
Jiaming Yuan	39390cc2ee	[breaking] Remove the `predictor` param, allow fallback to prediction using `DMatrix`. (#9129 ) - A `DeviceOrd` struct is implemented to indicate the device. It will eventually replace the `gpu_id` parameter. - The `predictor` parameter is removed. - Fallback to `DMatrix` when `inplace_predict` is not available. - The heuristic for choosing a predictor is only used during training.	2023-07-03 19:23:54 +08:00
Jiaming Yuan	f4798718c7	Use hist as the default tree method. (#9320 )	2023-06-27 23:04:24 +08:00
Jiaming Yuan	6d22ea793c	Test QDM with sparse data on CPU. (#9316 )	2023-06-19 21:27:03 +08:00
Jiaming Yuan	ee6809e642	Use mmap for external memory. (#9282 ) - Have basic infrastructure for mmap. - Release file write handle.	2023-06-19 18:52:55 +08:00
Jiaming Yuan	1fcc26a6f8	Set `ndcg` to default for LTR. (#8822 ) - Add document. - Add tests. - Use `ndcg` with `topk` as default.	2023-06-09 23:31:33 +08:00
Jiaming Yuan	9fbde21e9d	Rework the precision metric. (#9222 ) - Rework the precision metric for both CPU and GPU. - Mention it in the document. - Cleanup old support code for GPU ranking metric. - Deterministic GPU implementation. * Drop support for classification. * type. * use batch shape. * lint. * cpu build. * cpu build. * lint. * Tests. * Fix. * Cleanup error message.	2023-06-02 20:49:43 +08:00
Jiaming Yuan	3913ff470f	Import data lazily during tests. (#9176 )	2023-05-23 03:58:31 +08:00
Jiaming Yuan	1f9a57d17b	[Breaking] Require format to be specified in input URI. (#9077 ) Previously, `libsvm` was used as the default when no format was specified. However, the dmlc data parser is not particularly robust against errors, and the most common error is an unspecified format. In addition, we will recommend that users use other data loaders instead. We will continue maintaining the parsers, as they are currently used for many internal tests, including federated learning.	2023-04-28 19:45:15 +08:00
Jiaming Yuan	e206b899ef	Rework MAP and Pairwise for LTR. (#9075 )	2023-04-28 02:39:12 +08:00
Jiaming Yuan	2c8d735cb3	Fix tests with pandas 2.0. (#9014 ) * Fix tests with pandas 2.0. - `is_categorical` is replaced by `is_categorical_dtype`. - one hot encoding returns boolean type instead of integer type.	2023-04-11 00:17:34 +08:00

566 Commits