xgboost

Author	SHA1	Message	Date
Jiaming Yuan	41c6813496	Preserve order of saved updaters config. (#9355 ) - Save the updater sequence as an array instead of object. - Warn only once. The compatibility is kept, but we should be able to break it as the config is not loaded in pickle model and it's declared to be not stable.	2023-07-05 20:20:07 +08:00
Jiaming Yuan	645037e376	Improve test coverage with predictor configuration. (#9354 ) * Improve test coverage with predictor configuration. - Test with ext memory. - Test with QDM. - Test with dart.	2023-07-05 15:17:22 +08:00
Jiaming Yuan	d0916849a6	Remove unused weight from buffer for cat features. (#9341 )	2023-07-04 01:07:09 +08:00
Jiaming Yuan	39390cc2ee	[breaking] Remove the `predictor` param, allow fallback to prediction using `DMatrix`. (#9129 ) - A `DeviceOrd` struct is implemented to indicate the device. It will eventually replace the `gpu_id` parameter. - The `predictor` parameter is removed. - Fallback to `DMatrix` when `inplace_predict` is not available. - The heuristic for choosing a predictor is only used during training.	2023-07-03 19:23:54 +08:00
Jiaming Yuan	bc267dd729	Use ptr from `mmap` for `GHistIndexMatrix` and `ColumnMatrix`. (#9315 ) * Use ptr from mmap for `GHistIndexMatrix` and `ColumnMatrix`. - Define a resource for holding various types of memory pointers. - Define ref vector for holding resources. - Swap the underlying resources for GHist and ColumnM. - Add documentation for current status. - s390x support is removed. It should work if you can compile XGBoost, all the old workaround code does is to get GCC to compile.	2023-06-27 19:05:46 +08:00
Jiaming Yuan	54da4b3185	Cleanup to prepare for using mmap pointer in external memory. (#9317 ) - Update SparseDMatrix comment. - Use a pointer in the bitfield. We will replace the `std::vector<bool>` in `ColumnMatrix` with bitfield. - Clean up the page source. The timer is removed as it's inaccurate once we swap the mmap pointer into the page.	2023-06-22 06:43:11 +08:00
Jiaming Yuan	ee6809e642	Use mmap for external memory. (#9282 ) - Have basic infrastructure for mmap. - Release file write handle.	2023-06-19 18:52:55 +08:00
Rong Ou	e70810be8a	Refactor device communicator to make allreduce more flexible (#9295 )	2023-06-14 03:53:03 +08:00
Jiaming Yuan	9fbde21e9d	Rework the precision metric. (#9222 ) - Rework the precision metric for both CPU and GPU. - Mention it in the document. - Cleanup old support code for GPU ranking metric. - Deterministic GPU implementation. * Drop support for classification. * type. * use batch shape. * lint. * cpu build. * cpu build. * lint. * Tests. * Fix. * Cleanup error message.	2023-06-02 20:49:43 +08:00
Jiaming Yuan	17fd3f55e9	Optimize adapter element counting on GPU. (#9209 ) - Implement a simple `IterSpan` for passing iterators with size. - Use shared memory for column size counts. - Use one thread for each sample in row count to reduce atomic operations.	2023-05-30 23:28:43 +08:00
Jiaming Yuan	ae7450ce54	Skip optional synchronization in thrust. (#9212 )	2023-05-30 17:23:09 +08:00
Jiaming Yuan	03bc6e6427	Remove unused variables. (#9210 ) - remove used variables. - Remove signed comparison warnings.	2023-05-28 05:24:15 +08:00
Rong Ou	5b69534b43	Support column split in multi-target `hist` (#9171 )	2023-05-26 16:56:05 +08:00
Stephan T. Lavavej	7375bd058b	Fix IndexTransformIter. (#9155 )	2023-05-12 21:25:54 +08:00
Rong Ou	603f8ce2fa	Support `hist` in the partition builder under column split (#9120 )	2023-05-11 05:24:29 +08:00
Jiaming Yuan	85988a3178	Wait for data CUDA stream instead of sync. (#9144 ) --------- Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2023-05-09 09:52:21 +08:00
Jiaming Yuan	08ce495b5d	Use Booster context in DMatrix. (#8896 ) - Pass context from booster to DMatrix. - Use context instead of integer for `n_threads`. - Check the consistency configuration for `max_bin`. - Test for all combinations of initialization options.	2023-04-28 21:47:14 +08:00
Jiaming Yuan	17ff471616	Optimize array interface input. (#9090 )	2023-04-28 18:01:58 +08:00
Rong Ou	a320b402a5	More refactoring to take advantage of collective aggregators (#9081 )	2023-04-26 03:36:09 +08:00
Jiaming Yuan	ef13dd31b1	Rework the NDCG objective. (#9015 )	2023-04-18 21:16:06 +08:00
Jiaming Yuan	d062a9e009	Define pair generation strategies for LTR. (#8984 )	2023-03-30 12:00:35 +08:00
Rong Ou	ff26cd3212	More tests for column split and vertical federated learning (#8985 ) Added some more tests for the learner and fit_stump, for both column-wise distributed learning and vertical federated learning. Also moved the `IsRowSplit` and `IsColumnSplit` methods from the `DMatrix` to the `MetaInfo` since in some places we only have access to the `MetaInfo`. Added a new convenience method `IsVerticalFederatedLearning`. Some refactoring of the testing fixtures.	2023-03-28 16:40:26 +08:00
Jiaming Yuan	151882dd26	Initial support for multi-target tree. (#8616 ) * Implement multi-target for hist. - Add new hist tree builder. - Move data fetchers for tests. - Dispatch function calls in gbm base on the tree type.	2023-03-22 23:49:56 +08:00
Jiaming Yuan	ea04d4c46c	[doc] [dask] Troubleshooting NCCL errors. (#8943 )	2023-03-22 22:17:26 +08:00
Jiaming Yuan	a05799ed39	Specify char type in JSON. (#8949 ) char is defined as signed on x86 but unsigned on arm64 - Use `std::int8_t` instead of char. - Fix include when clang is pretending to be gcc.	2023-03-22 19:13:44 +08:00
Jiaming Yuan	5891f752c8	Rework the MAP metric. (#8931 ) - The new implementation is more strict as only binary labels are accepted. The previous implementation converts values greater than 1 to 1. - Deterministic GPU. (no atomic add). - Fix top-k handling. - Precise definition of MAP. (There are other variants on how to handle top-k). - Refactor GPU ranking tests.	2023-03-22 17:45:20 +08:00
Jiaming Yuan	a093770f36	Partitioner for multi-target tree. (#8922 )	2023-03-16 18:49:34 +08:00
Jiaming Yuan	26209a42a5	Define git attributes for renormalization. (#8921 )	2023-03-16 02:43:11 +08:00
Jiaming Yuan	f186c87cf9	Check inf in data for all types of DMatrix. (#8911 )	2023-03-15 11:24:35 +08:00
Jiaming Yuan	8685556af2	Implement hist evaluator for multi-target tree. (#8908 )	2023-03-15 01:42:51 +08:00
Jiaming Yuan	8be6095ece	Implement NDCG cache. (#8893 )	2023-03-13 22:16:31 +08:00
Jiaming Yuan	5feee8d4a9	Define core multi-target regression tree structure. (#8884 ) - Define a new tree struct embedded in the `RegTree`. - Provide dispatching functions in `RegTree`. - Fix some c++-17 warnings about the use of nodiscard (currently we disable the warning on the CI). - Use uint32_t instead of size_t for `bst_target_t` as it has a defined size and can be used as part of dmlc parameter. - Hide the `Segment` struct inside the categorical split matrix.	2023-03-09 19:03:06 +08:00
Jiaming Yuan	46dfcc7d22	Define a new ranking parameter. (#8887 )	2023-03-09 17:46:24 +08:00
Jiaming Yuan	cad7401783	Disable gcc parallel extension if openmp is not available. (#8871 ) `<parallel/algorithm>` internally includes the <omp.h> header, which leads to an error when openmp is not available.	2023-03-06 22:51:06 +08:00
Jiaming Yuan	4d665b3fb0	Restore clang tidy test. (#8861 )	2023-03-03 13:47:04 -08:00
Rong Ou	2dc22e7aad	Take advantage of C++17 features (#8858 ) --------- Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu> Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2023-03-04 00:24:13 +08:00
Rong Ou	d9688f93c7	Support column-split in row partitioner (#8828 )	2023-02-26 04:43:35 +08:00
Rong Ou	a65ad0bd9c	Support column split in histogram builder (#8811 )	2023-02-17 22:37:01 +08:00
Jiaming Yuan	c0afdb6786	Fix CPU bin compression with categorical data. (#8809 ) * Fix CPU bin compression with categorical data. * The bug causes the maximum category to be lesser than 256 or the maximum number of bins when the input data is dense.	2023-02-16 04:20:34 +08:00
Jiaming Yuan	cce4af4acf	Initial support for quantile loss. (#8750 ) - Add support for Python. - Add objective.	2023-02-16 02:30:18 +08:00
Jiaming Yuan	282b1729da	Specify the number of threads for parallel sort. (#8735 ) * Specify the number of threads for parallel sort. - Pass context object into argsort. - Replace macros with inline functions.	2023-02-16 00:20:19 +08:00
Jiaming Yuan	31d3ec07af	Extract device algorithms. (#8789 )	2023-02-13 20:53:53 +08:00
Jiaming Yuan	457f704e3d	Add quantile metric. (#8761 )	2023-02-13 19:07:40 +08:00
Jiaming Yuan	d11a0044cf	Generalize prediction cache. (#8783 ) * Extract most of the functionality into `DMatrixCache`. * Move API entry to independent file to reduce dependency on `predictor.h` file. * Add test. --------- Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2023-02-13 12:36:43 +08:00
Jiaming Yuan	17b709acb9	Rename ranking utils to threading utils. (#8785 )	2023-02-12 05:41:18 +08:00
Jiaming Yuan	70c9b885ef	Extract floating point rounding routines. (#8771 )	2023-02-12 04:26:41 +08:00
Jiaming Yuan	5f76edd296	Extract make metric name from ranking metric. (#8768 ) - Extract the metric parsing routine from ranking. - Add a test. - Accept null for string view.	2023-02-09 18:30:21 +08:00
Jiaming Yuan	48cefa012e	Support multiple alphas for segmented quantile. (#8758 )	2023-02-07 17:17:59 +08:00
Jiaming Yuan	28bb01aa22	Extract optional weight. (#8747 ) - Extract optional weight from coommon.h to reduce dependency on this header. - Add test.	2023-02-07 03:11:53 +08:00
Jiaming Yuan	a2e433a089	Fix empty DMatrix with categorical features. (#8739 )	2023-02-07 00:40:11 +08:00

1 2 3 4 5 ...

523 Commits