xgboost

Author	SHA1	Message	Date
Jiaming Yuan	e228c1a121	[EM] Make page concatenation optional. (#10826 ) This PR introduces a new parameter `extmem_concat_pages` to make the page concatenation optional for GPU hist. In addition, the document is updated for the new GPU-based external memory.	2024-09-24 06:19:28 +08:00
Jiaming Yuan	ed5f33df16	[EM] Multi-level quantile sketching for GPU. (#10813 )	2024-09-10 13:08:34 +08:00
Jiaming Yuan	827d0e8edb	[breaking] Bump Python requirement to 3.10. (#10434 ) - Bump the Python requirement. - Fix type hints. - Use loky to avoid deadlock. - Workaround cupy-numpy compatibility issue on Windows caused by the `safe` casting rule. - Simplify the repartitioning logic to avoid dask errors.	2024-07-30 17:31:06 +08:00
david-cortes	8d0f2bfbaa	[doc] Add more detailed explanations for advanced objectives (#10283 ) --------- Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2024-07-08 19:17:31 +08:00
Jiaming Yuan	b4cc350ec5	Fix categorical data with external memory. (#10433 )	2024-06-18 04:34:54 +08:00
Jiaming Yuan	73afef1a6e	Fixes for numpy 2.0. (#10252 )	2024-05-07 03:54:32 +08:00
Jiaming Yuan	54b71c8fba	Fix with black 24.1.1. (#10014 )	2024-01-30 17:24:11 +08:00
Jiaming Yuan	d07e8b503e	Fix quantile regression demo. (#9991 )	2024-01-17 13:19:08 +08:00
Jiaming Yuan	0798e36d73	[breaking] Remove deprecated parameters in the skl interface. (#9986 )	2024-01-15 20:40:05 +08:00
Jiaming Yuan	b3eb5d0945	Use UBJ in Python checkpoint. (#9958 )	2024-01-09 03:22:15 +08:00
Jiaming Yuan	9f73127a23	Cleanup Python GPU tests. (#9934 ) * Cleanup Python GPU tests. - Remove the use of `gpu_hist` and `gpu_id` in cudf/cupy tests. - Move base margin test into the testing directory.	2024-01-04 13:15:18 +08:00
Jiaming Yuan	faf0f2df10	Support dataframe data format in native XGBoost. (#9828 ) - Implement a columnar adapter. - Refactor Python pandas handling code to avoid converting into a single numpy array. - Add support in R for transforming columns. - Support R data.frame and factor type.	2023-12-12 09:56:31 +08:00
david-cortes	2c0fc97306	Remove note about multi-quantile being python-only (#9854 )	2023-12-07 05:17:15 +08:00
Jiaming Yuan	3ca06ac51e	[doc] Mention data consistency for categorical features. (#9678 )	2023-10-24 10:11:33 +08:00
Jiaming Yuan	c75a3bc0a9	[breaking] [jvm-packages] Remove rabit check point. (#9599 ) - Add `numBoostedRound` to jvm packages - Remove rabit checkpoint version. - Change the starting version of training continuation in JVM [breaking]. - Redefine the checkpoint version policy in jvm package. [breaking] - Rename the Python check point callback parameter. [breaking] - Unifies the checkpoint policy between Python and JVM.	2023-09-26 18:06:34 +08:00
Jiaming Yuan	972730cde0	Use matrix for gradient. (#9508 ) - Use the `linalg::Matrix` for storing gradients. - New API for the custom objective. - Custom objective for multi-class/multi-target is now required to return the correct shape. - Custom objective for Python can accept arrays with any strides. (row-major, column-major)	2023-08-24 05:29:52 +08:00
Jiaming Yuan	fd4335d0bf	[doc] Document the current status of some features. (#9469 )	2023-08-13 23:42:27 +08:00
Jiaming Yuan	f05a23b41c	Use `weakref` instead of `id` for `DataIter` cache. (#9445 ) - Fix case where Python reuses id from freed objects. - Small optimization to column matrix with QDM by using `realloc` instead of copying data.	2023-08-10 00:40:06 +08:00
Jiaming Yuan	851cba931e	Define `best_iteration` only if early stopping is used. (#9403 ) * Define `best_iteration` only if early stopping is used. This is the behavior specified by the document but not honored in the actual code. - Don't set the attributes if there's no early stopping. - Clean up the code for callbacks, and replace assertions with proper exceptions. - Assign the attributes when early stopping `save_best` is used. - Turn the attributes into Python properties. --------- Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2023-07-24 12:43:35 +08:00
Jiaming Yuan	16eb41936d	Handle the new `device` parameter in dask and demos. (#9386 ) * Handle the new `device` parameter in dask and demos. - Check no ordinal is specified in the dask interface. - Update demos. - Update dask doc. - Update the condition for QDM.	2023-07-15 19:11:20 +08:00
Jiaming Yuan	ee6809e642	Use mmap for external memory. (#9282 ) - Have basic infrastructure for mmap. - Release file write handle.	2023-06-19 18:52:55 +08:00
Jiaming Yuan	1fcc26a6f8	Set `ndcg` to default for LTR. (#8822 ) - Add document. - Add tests. - Use `ndcg` with `topk` as default.	2023-06-09 23:31:33 +08:00
Jiaming Yuan	1f9a57d17b	[Breaking] Require format to be specified in input URI. (#9077 ) Previously, we use `libsvm` as default when format is not specified. However, the dmlc data parser is not particularly robust against errors, and the most common type of error is undefined format. Along with which, we will recommend users to use other data loader instead. We will continue the maintenance of the parsers as it's currently used for many internal tests including federated learning.	2023-04-28 19:45:15 +08:00
Jiaming Yuan	720a8c3273	[doc] Remove parameter type in Python doc strings. (#9005 )	2023-04-01 04:04:30 +08:00
Jiaming Yuan	401ce5cf5e	Run linters with the multi output demo. (#8966 )	2023-03-28 00:47:28 +08:00
Jiaming Yuan	21a52c7f98	[doc] Add introduction and notes for the sklearn interface. (#8948 )	2023-03-23 13:30:42 +08:00
Jiaming Yuan	151882dd26	Initial support for multi-target tree. (#8616 ) * Implement multi-target for hist. - Add new hist tree builder. - Move data fetchers for tests. - Dispatch function calls in gbm base on the tree type.	2023-03-22 23:49:56 +08:00
Jiaming Yuan	228a46e8ad	Support learning rate for zero-hessian objectives. (#8866 )	2023-03-06 20:33:28 +08:00
Jiaming Yuan	6a892ce281	Specify src path for isort. (#8867 )	2023-03-06 17:30:27 +08:00
Jiaming Yuan	cce4af4acf	Initial support for quantile loss. (#8750 ) - Add support for Python. - Add objective.	2023-02-16 02:30:18 +08:00
Jiaming Yuan	e9c178f402	[doc] Document update [skip ci] (#8784 ) - Remove version specifics in cat demo. - Remove aws yarn. - Update faq. - Stop mentioning MPI. - Update sphinx inventory links. - Fix typo.	2023-02-12 04:25:22 +08:00
Jiaming Yuan	7b3d473593	[doc] Add demo for inference using individual tree. (#8752 )	2023-02-07 04:40:18 +08:00
Jiaming Yuan	d6018eb4b9	Remove all use of `DeviceQuantileDMatrix`. (#8665 )	2023-01-17 00:04:10 +08:00
Jiaming Yuan	badeff1d74	Init estimation for regression. (#8272 )	2023-01-11 02:04:56 +08:00
Jiaming Yuan	0d3da9869c	Require isort on all Python files. (#8420 )	2022-11-08 12:59:06 +08:00
Rory Mitchell	ce0382dcb0	[CI] Refactor tests to reduce CI time. (#8312 )	2022-10-12 11:32:06 +02:00
Jiaming Yuan	570f8ae4ba	Use black on more Python files. (#8137 )	2022-08-11 01:38:11 +08:00
Praateek Mahajan	ff471b3fab	In PySpark Estimator example use the model with validation_indicator (#8131 ) * use the validation_indicator model * use the validation_indicator model for regression	2022-08-03 13:57:41 +08:00
WeichenXu	f23cc92130	[pyspark] User guide doc and tutorials (#8082 ) Co-authored-by: Bobby Wang <wbo4958@gmail.com>	2022-07-19 22:25:14 +08:00
Jiaming Yuan	52d4eda786	Deprecate `use_label_encoder` in XGBClassifier. (#7822 ) * Deprecate `use_label_encoder` in XGBClassifier. * We have removed the encoder, now prepare to remove the indicator.	2022-04-21 13:14:02 +08:00
Jiaming Yuan	bcce17e688	Remove text loading in basic walk through demo. (#7753 )	2022-04-01 00:59:42 +08:00
Jiaming Yuan	4d81c741e9	External memory support for hist (#7531 ) * Generate column matrix from gHistIndex. * Avoid synchronization with the sparse page once the cache is written. * Cleanups: Remove member variables/functions, change the update routine to look like approx and gpu_hist. * Remove pruner.	2022-03-22 00:13:20 +08:00
Jiaming Yuan	cd55823112	Demo for using custom objective with multi-target regression. (#7736 )	2022-03-20 17:44:25 +08:00
Jiaming Yuan	1d468e20a4	Optimize GPU evaluation function for categorical data. (#7705 ) * Use transform and cache.	2022-02-28 17:46:29 +08:00
Jiaming Yuan	0da7d872ef	[doc] Update for prediction. (#7648 )	2022-02-15 05:01:55 +08:00
Jiaming Yuan	0d0abe1845	Support optimal partitioning for GPU hist. (#7652 ) * Implement `MaxCategory` in quantile. * Implement partition-based split for GPU evaluation. Currently, it's based on the existing evaluation function. * Extract an evaluator from GPU Hist to store the needed states. * Added some CUDA stream/event utilities. * Update document with references. * Fixed a bug in approx evaluator where the number of data points is less than the number of categories.	2022-02-15 03:03:12 +08:00
Philip Hyunsu Cho	c621775f34	Replace all uses of deprecated function sklearn.datasets.load_boston (#7373 ) * Replace all uses of deprecated function sklearn.datasets.load_boston * More renaming * Fix bad name * Update assertion * Fix n boosted rounds. * Avoid over regularization. * Rebase. * Avoid over regularization. * Whac-a-mole Co-authored-by: fis <jm.yuan@outlook.com>	2022-01-30 04:27:57 -08:00
Jiaming Yuan	b4ec1682c6	Update document for multi output and categorical. (#7574 ) * Group together categorical related parameters. * Update documents about multioutput and categorical.	2022-01-19 04:35:17 +08:00
Jiaming Yuan	001503186c	Rewrite approx (#7214 ) This PR rewrites the approx tree method to use codebase from hist for better performance and code sharing. The rewrite has many benefits: - Support for both `max_leaves` and `max_depth`. - Support for `grow_policy`. - Support for mono constraint. - Support for feature weights. - Support for easier bin configuration (`max_bin`). - Support for categorical data. - Faster performance for most of the datasets. (many times faster) - Support for prediction cache. - Significantly better performance for external memory. - Unites the code base between approx and hist.	2022-01-10 21:15:05 +08:00
Jiaming Yuan	58a6723eb1	Initial support for multioutput regression. (#7514 ) * Add num target model parameter, which is configured from input labels. * Change elementwise metric and indexing for weights. * Add demo. * Add tests.	2021-12-18 09:28:38 +08:00

127 Commits