xgboost

Author	SHA1	Message	Date
Jiaming Yuan	1cf4d93246	Convert federated tests into test suite. (#9006 ) - Add specialization for learning to rank.	2023-04-04 01:29:47 +08:00
Rong Ou	15e073ca9d	Make objectives work with vertical distributed and federated learning (#9002 )	2023-04-03 17:07:42 +08:00
Jiaming Yuan	bac22734fb	Remove ntree limit in python package. (#8345 ) - Remove `ntree_limit`. The parameter has been deprecated since 1.4.0. - The SHAP package compatibility is broken.	2023-03-31 19:01:55 +08:00
Jiaming Yuan	d062a9e009	Define pair generation strategies for LTR. (#8984 )	2023-03-30 12:00:35 +08:00
Philip Hyunsu Cho	6676c28cbc	[CI] Fix Windows wheel to be compatible with Poetry (#8991 ) * [CI] Fix Windows wheel to be compatible with Poetry * Typo * Eagerly scan globs to avoid patching same file twice	2023-03-28 21:32:54 -07:00
Rong Ou	ff26cd3212	More tests for column split and vertical federated learning (#8985 ) Added some more tests for the learner and fit_stump, for both column-wise distributed learning and vertical federated learning. Also moved the `IsRowSplit` and `IsColumnSplit` methods from the `DMatrix` to the `MetaInfo` since in some places we only have access to the `MetaInfo`. Added a new convenience method `IsVerticalFederatedLearning`. Some refactoring of the testing fixtures.	2023-03-28 16:40:26 +08:00
Jiaming Yuan	401ce5cf5e	Run linters with the multi output demo. (#8966 )	2023-03-28 00:47:28 +08:00
Jiaming Yuan	acc110c251	[MT-TREE] Support prediction cache and model slicing. (#8968 ) - Fix prediction range. - Support prediction cache in mt-hist. - Support model slicing. - Make the booster a Python iterable by defining `__iter__`. - Cleanup removed/deprecated parameters. - A new field in the output model `iteration_indptr` for pointing to the ranges of trees for each iteration.	2023-03-27 23:10:54 +08:00
Jiaming Yuan	c2b3a13e70	[breaking][skl] Remove parameter serialization. (#8963 ) - Remove parameter serialization in the scikit-learn interface. The scikit-lear interface `save_model` will save only the model and discard all hyper-parameters. This is to align with the native XGBoost interface, which distinguishes the hyper-parameter and model parameters. With the scikit-learn interface, model parameters are attributes of the estimator. For instance, `n_features_in_`, `n_classes_` are always accessible with `estimator.n_features_in_` and `estimator.n_classes_`, but not with the `estimator.get_params`. - Define a `load_model` method for classifier to load its own attributes. - Set n_estimators to None by default.	2023-03-27 21:34:10 +08:00
Jiaming Yuan	151882dd26	Initial support for multi-target tree. (#8616 ) * Implement multi-target for hist. - Add new hist tree builder. - Move data fetchers for tests. - Dispatch function calls in gbm base on the tree type.	2023-03-22 23:49:56 +08:00
Jiaming Yuan	ea04d4c46c	[doc] [dask] Troubleshooting NCCL errors. (#8943 )	2023-03-22 22:17:26 +08:00
Jiaming Yuan	5891f752c8	Rework the MAP metric. (#8931 ) - The new implementation is more strict as only binary labels are accepted. The previous implementation converts values greater than 1 to 1. - Deterministic GPU. (no atomic add). - Fix top-k handling. - Precise definition of MAP. (There are other variants on how to handle top-k). - Refactor GPU ranking tests.	2023-03-22 17:45:20 +08:00
Rong Ou	b240f055d3	Support vertical federated learning (#8932 )	2023-03-22 14:25:26 +08:00
Jiaming Yuan	a093770f36	Partitioner for multi-target tree. (#8922 )	2023-03-16 18:49:34 +08:00
Jiaming Yuan	26209a42a5	Define git attributes for renormalization. (#8921 )	2023-03-16 02:43:11 +08:00
Jiaming Yuan	f186c87cf9	Check inf in data for all types of DMatrix. (#8911 )	2023-03-15 11:24:35 +08:00
Jiaming Yuan	72e8331eab	Reimplement the NDCG metric. (#8906 ) - Add support for non-exp gain. - Cache the DMatrix object to avoid re-calculating the IDCG. - Make GPU implementation deterministic. (no atomic add)	2023-03-15 03:26:17 +08:00
Jiaming Yuan	8685556af2	Implement hist evaluator for multi-target tree. (#8908 )	2023-03-15 01:42:51 +08:00
Jiaming Yuan	910ce580c8	Clear all cache after model load. (#8904 )	2023-03-14 22:09:36 +08:00
Jiaming Yuan	c400fa1e8d	Predictor for vector leaf. (#8898 )	2023-03-14 19:07:10 +08:00
Jiaming Yuan	8be6095ece	Implement NDCG cache. (#8893 )	2023-03-13 22:16:31 +08:00
Jiaming Yuan	9bade7203a	Remove public access to tree model param. (#8902 ) * Make tree model param a private member. * Number of features and targets are immutable after construction. This is to reduce the number of places where we can run configuration.	2023-03-13 20:55:10 +08:00
Jiaming Yuan	5ba3509dd3	Define multi expand entry. (#8895 )	2023-03-13 19:31:05 +08:00
Jiaming Yuan	3689695d16	[CI] Run RMM gtests. (#8900 ) * [CI] Run RMM gtests. * Update test-cpp-gpu.sh --------- Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2023-03-12 03:14:31 +08:00
Jiaming Yuan	36a7396658	Replace dmlc any with std any. (#8892 )	2023-03-11 06:11:04 +08:00
Jiaming Yuan	2aa838c75e	Define multi-strategy parameter. (#8890 )	2023-03-11 02:58:01 +08:00
Jiaming Yuan	6deaec8027	Pass obj info by reference instead of by value. (#8889 ) - Pass obj info into tree updater as const pointer. This way we don't have to initialize the learner model param before configuring gbm, hence breaking up the dependency of configurations.	2023-03-11 01:38:28 +08:00
Jiaming Yuan	c5c8f643f2	Remove the cub submodule. (#8888 ) XGBoost now uses CTK-11.8 for binary packages, there's no need to maintain a cub submodule anymore.	2023-03-09 19:43:02 -08:00
Jiaming Yuan	5feee8d4a9	Define core multi-target regression tree structure. (#8884 ) - Define a new tree struct embedded in the `RegTree`. - Provide dispatching functions in `RegTree`. - Fix some c++-17 warnings about the use of nodiscard (currently we disable the warning on the CI). - Use uint32_t instead of size_t for `bst_target_t` as it has a defined size and can be used as part of dmlc parameter. - Hide the `Segment` struct inside the categorical split matrix.	2023-03-09 19:03:06 +08:00
Jiaming Yuan	46dfcc7d22	Define a new ranking parameter. (#8887 )	2023-03-09 17:46:24 +08:00
Jiaming Yuan	f236640427	Support F order for the tensor type. (#8872 ) - Add F order support for tensor and view. - Use parameter pack for automatic type cast. (avoid excessive static cast for shape).	2023-03-08 03:27:49 +08:00
Jiaming Yuan	f7ce0ec0df	Upgrade gcc toolchain to 9.x. (#8878 ) * Use new tool chain. * Use gcc-9. * Use cmake from system. * DOn't link leak.	2023-03-07 08:25:23 -08:00
Jiaming Yuan	7eba285a1e	Support sklearn cross validation for ranker. (#8859 ) * Support sklearn cross validation for ranker. - Add a convention for X to include a special `qid` column. sklearn utilities consider only `X`, `y` and `sample_weight` for supervised learning algorithms, but we need an additional qid array for ranking. It's important to be able to support the cross validation function in sklearn since all other tuning functions like grid search are based on cross validation.	2023-03-07 00:22:08 +08:00
Jiaming Yuan	228a46e8ad	Support learning rate for zero-hessian objectives. (#8866 )	2023-03-06 20:33:28 +08:00
Jiaming Yuan	6a892ce281	Specify src path for isort. (#8867 )	2023-03-06 17:30:27 +08:00
Jiaming Yuan	4d665b3fb0	Restore clang tidy test. (#8861 )	2023-03-03 13:47:04 -08:00
Rory Mitchell	69a50248b7	Fix scope of feature set pointers (#8850 ) --------- Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2023-03-02 12:37:14 +08:00
mzzhang95	6cef9a08e9	[pyspark] Update eval_metric validation to support list of strings (#8826 )	2023-03-02 08:24:12 +08:00
Rong Ou	7cbaee9916	Support column split in `approx` tree method (#8847 )	2023-03-02 03:59:07 +08:00
Philip Hyunsu Cho	6d8afb2218	[CI] Require C++17 + CMake 3.18; Use CUDA 11.8 in CI (#8853 ) * Update to C++17 * Turn off unity build * Update CMake to 3.18 * Use MSVC 2022 + CUDA 11.8 * Re-create stack for worker images * Allocate more disk space for Windows * Tempiorarily disable clang-tidy * RAPIDS now requires Python 3.10+ * Unpin cuda-python * Use latest NCCL * Use Ubuntu 20.04 in RMM image * Mark failing mgpu test as xfail	2023-03-01 09:22:24 -08:00
Jiaming Yuan	d54ef56f6f	Fix cache with gc (#8851 ) - Make DMatrixCache thread-safe. - Remove the use of thread-local memory.	2023-03-01 00:39:06 +08:00
Rong Ou	d9688f93c7	Support column-split in row partitioner (#8828 )	2023-02-26 04:43:35 +08:00
Rong Ou	a65ad0bd9c	Support column split in histogram builder (#8811 )	2023-02-17 22:37:01 +08:00
Jiaming Yuan	c0afdb6786	Fix CPU bin compression with categorical data. (#8809 ) * Fix CPU bin compression with categorical data. * The bug causes the maximum category to be lesser than 256 or the maximum number of bins when the input data is dense.	2023-02-16 04:20:34 +08:00
Jiaming Yuan	cce4af4acf	Initial support for quantile loss. (#8750 ) - Add support for Python. - Add objective.	2023-02-16 02:30:18 +08:00
Jiaming Yuan	282b1729da	Specify the number of threads for parallel sort. (#8735 ) * Specify the number of threads for parallel sort. - Pass context object into argsort. - Replace macros with inline functions.	2023-02-16 00:20:19 +08:00
Jiaming Yuan	81b2ee1153	Pass DMatrix into metric for caching. (#8790 )	2023-02-13 22:15:05 +08:00
Jiaming Yuan	31d3ec07af	Extract device algorithms. (#8789 )	2023-02-13 20:53:53 +08:00
Jiaming Yuan	457f704e3d	Add quantile metric. (#8761 )	2023-02-13 19:07:40 +08:00
Jiaming Yuan	d11a0044cf	Generalize prediction cache. (#8783 ) * Extract most of the functionality into `DMatrixCache`. * Move API entry to independent file to reduce dependency on `predictor.h` file. * Add test. --------- Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2023-02-13 12:36:43 +08:00

1 2 3 4 5 ...

1373 Commits