xgboost

Author	SHA1	Message	Date
Jiaming Yuan	d062a9e009	Define pair generation strategies for LTR. (#8984 )	2023-03-30 12:00:35 +08:00
Rong Ou	ff26cd3212	More tests for column split and vertical federated learning (#8985 ) Added some more tests for the learner and fit_stump, for both column-wise distributed learning and vertical federated learning. Also moved the `IsRowSplit` and `IsColumnSplit` methods from the `DMatrix` to the `MetaInfo` since in some places we only have access to the `MetaInfo`. Added a new convenience method `IsVerticalFederatedLearning`. Some refactoring of the testing fixtures.	2023-03-28 16:40:26 +08:00
Jiaming Yuan	151882dd26	Initial support for multi-target tree. (#8616 ) * Implement multi-target for hist. - Add new hist tree builder. - Move data fetchers for tests. - Dispatch function calls in gbm based on the tree type.	2023-03-22 23:49:56 +08:00
Jiaming Yuan	ea04d4c46c	[doc] [dask] Troubleshooting NCCL errors. (#8943 )	2023-03-22 22:17:26 +08:00
Jiaming Yuan	a05799ed39	Specify char type in JSON. (#8949 ) char is defined as signed on x86 but unsigned on arm64 - Use `std::int8_t` instead of char. - Fix include when clang is pretending to be gcc.	2023-03-22 19:13:44 +08:00
Jiaming Yuan	5891f752c8	Rework the MAP metric. (#8931 ) - The new implementation is more strict as only binary labels are accepted. The previous implementation converts values greater than 1 to 1. - Deterministic GPU. (no atomic add). - Fix top-k handling. - Precise definition of MAP. (There are other variants on how to handle top-k). - Refactor GPU ranking tests.	2023-03-22 17:45:20 +08:00
Jiaming Yuan	a093770f36	Partitioner for multi-target tree. (#8922 )	2023-03-16 18:49:34 +08:00
Jiaming Yuan	26209a42a5	Define git attributes for renormalization. (#8921 )	2023-03-16 02:43:11 +08:00
Jiaming Yuan	f186c87cf9	Check inf in data for all types of DMatrix. (#8911 )	2023-03-15 11:24:35 +08:00
Jiaming Yuan	8685556af2	Implement hist evaluator for multi-target tree. (#8908 )	2023-03-15 01:42:51 +08:00
Jiaming Yuan	8be6095ece	Implement NDCG cache. (#8893 )	2023-03-13 22:16:31 +08:00
Jiaming Yuan	5feee8d4a9	Define core multi-target regression tree structure. (#8884 ) - Define a new tree struct embedded in the `RegTree`. - Provide dispatching functions in `RegTree`. - Fix some c++-17 warnings about the use of nodiscard (currently we disable the warning on the CI). - Use uint32_t instead of size_t for `bst_target_t` as it has a defined size and can be used as part of dmlc parameter. - Hide the `Segment` struct inside the categorical split matrix.	2023-03-09 19:03:06 +08:00
Jiaming Yuan	46dfcc7d22	Define a new ranking parameter. (#8887 )	2023-03-09 17:46:24 +08:00
Jiaming Yuan	cad7401783	Disable gcc parallel extension if openmp is not available. (#8871 ) `<parallel/algorithm>` internally includes the <omp.h> header, which leads to an error when openmp is not available.	2023-03-06 22:51:06 +08:00
Jiaming Yuan	4d665b3fb0	Restore clang tidy test. (#8861 )	2023-03-03 13:47:04 -08:00
Rong Ou	2dc22e7aad	Take advantage of C++17 features (#8858 ) --------- Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu> Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2023-03-04 00:24:13 +08:00
Rong Ou	d9688f93c7	Support column-split in row partitioner (#8828 )	2023-02-26 04:43:35 +08:00
Rong Ou	a65ad0bd9c	Support column split in histogram builder (#8811 )	2023-02-17 22:37:01 +08:00
Jiaming Yuan	c0afdb6786	Fix CPU bin compression with categorical data. (#8809 ) * Fix CPU bin compression with categorical data. * The bug causes the maximum category to be less than 256 or the maximum number of bins when the input data is dense.	2023-02-16 04:20:34 +08:00
Jiaming Yuan	cce4af4acf	Initial support for quantile loss. (#8750 ) - Add support for Python. - Add objective.	2023-02-16 02:30:18 +08:00
Jiaming Yuan	282b1729da	Specify the number of threads for parallel sort. (#8735 ) * Specify the number of threads for parallel sort. - Pass context object into argsort. - Replace macros with inline functions.	2023-02-16 00:20:19 +08:00
Jiaming Yuan	31d3ec07af	Extract device algorithms. (#8789 )	2023-02-13 20:53:53 +08:00
Jiaming Yuan	457f704e3d	Add quantile metric. (#8761 )	2023-02-13 19:07:40 +08:00
Jiaming Yuan	d11a0044cf	Generalize prediction cache. (#8783 ) * Extract most of the functionality into `DMatrixCache`. * Move API entry to independent file to reduce dependency on `predictor.h` file. * Add test. --------- Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2023-02-13 12:36:43 +08:00
Jiaming Yuan	17b709acb9	Rename ranking utils to threading utils. (#8785 )	2023-02-12 05:41:18 +08:00
Jiaming Yuan	70c9b885ef	Extract floating point rounding routines. (#8771 )	2023-02-12 04:26:41 +08:00
Jiaming Yuan	5f76edd296	Extract make metric name from ranking metric. (#8768 ) - Extract the metric parsing routine from ranking. - Add a test. - Accept null for string view.	2023-02-09 18:30:21 +08:00
Jiaming Yuan	48cefa012e	Support multiple alphas for segmented quantile. (#8758 )	2023-02-07 17:17:59 +08:00
Jiaming Yuan	28bb01aa22	Extract optional weight. (#8747 ) - Extract optional weight from common.h to reduce dependency on this header. - Add test.	2023-02-07 03:11:53 +08:00
Jiaming Yuan	a2e433a089	Fix empty DMatrix with categorical features. (#8739 )	2023-02-07 00:40:11 +08:00
Rong Ou	66191e9926	Support cpu quantile sketch with column-wise data split (#8742 )	2023-02-05 14:26:24 +08:00
Jiaming Yuan	3760cede0f	Consistent use of context to specify number of threads. (#8733 ) - Use context in all tests. - Use context in R. - Use context in C API DMatrix initialization. (0 threads is used as the default).	2023-01-30 15:25:31 +08:00
Jiaming Yuan	21a28f2cc5	Small refactor for hist builder. (#8698 ) - Use span instead of vector as parameter. No perf change as the builder works on pointers. - Use const pointer for reg tree.	2023-01-30 14:06:41 +08:00
Jiaming Yuan	cfa994d57f	Multi-target support for L1 error. (#8652 ) - Add matrix support to the median function. - Iterate through each target for quantile computation.	2023-01-11 05:51:14 +08:00
James Lamb	fa44a33ee6	remove unused variables in JSON-parsing code (#8627 )	2023-01-04 15:50:33 +08:00
Jiaming Yuan	8d545ab2a2	Implement fit stump. (#8607 )	2023-01-04 04:14:51 +08:00
Jiaming Yuan	c6a8754c62	Define CUDA Context. (#8604 ) We will transition to non-default and non-blocking CUDA stream.	2022-12-20 15:15:07 +08:00
Jiaming Yuan	a10e4cba4e	Fix linalg iterator. (#8603 )	2022-12-16 23:05:03 +08:00
Jiaming Yuan	43a647a4dd	Fix inference with categorical feature. (#8591 )	2022-12-15 17:57:26 +08:00
Rong Ou	15a88ceef0	Fix deprecated CUB calls in CUDA 12.0 (#8578 )	2022-12-12 17:02:30 +08:00
Jiaming Yuan	3e26107a9c	Rename and extract `Context`. (#8528 ) * Rename `GenericParameter` to `Context`. * Rename header file to reflect the change. * Rename all references.	2022-12-07 04:58:54 +08:00
Jiaming Yuan	e3bf5565ab	Extract transform iterator. (#8498 )	2022-12-05 21:37:07 +08:00
Robert Maynard	16f96b6cfb	Work with newer thrust and libcudacxx (#8454 ) * Thrust 1.17 removes the experimental/pinned_allocator. When xgboost is brought into a large project it can be compiled against Thrust 1.17+, which doesn't offer this experimental allocator. To ensure that going forward xgboost works in all environments, we provide an xgboost-namespaced version of the pinned_allocator that previously was in Thrust.	2022-11-11 04:22:53 +08:00
Dmitry Razdoburdin	5bd849f1b5	Unify the partitioner for hist and approx. Co-authored-by: dmitry.razdoburdin <drazdobu@jfldaal005.jf.intel.com> Co-authored-by: jiamingy <jm.yuan@outlook.com>	2022-10-20 02:49:20 +08:00
Jiaming Yuan	031d66ec27	Configuration for init estimation. (#8343 ) * Configuration for init estimation. * Check whether the model needs configuration based on const attribute `ModelFitted` instead of a mutable state. * Add parameter `boost_from_average` to tell whether the user has specified base score. * Add tests.	2022-10-18 01:52:24 +08:00
Jiaming Yuan	3ef1703553	Allow using string view to find JSON value. (#8332 ) - Allow comparison between string and string view. - Fix compiler warnings.	2022-10-13 17:10:13 +08:00
Philip Hyunsu Cho	bc7a6ec603	Fix clang tidy (#8314 ) * Fix clang-tidy * Exempt clang-tidy from budget check * Move clang-tidy	2022-10-06 05:16:06 -08:00
Dmitry Razdoburdin	c24e9d712c	Dispatcher for template parameters of BuildHist Kernels (#8259 ) * Introducing Column Wise Hist Building * linting * more linting * bug fixing * Removing column sampling optimization for a while to simplify the review process. * linting * Removing unnecessary changes * Use DispatchBinType in hist_util.cc * Adding force_read_by column flag to buildhist. Adding tests for column wise buildhist. * Introducing new dispatcher for compile time flags in hist building * fixing bug with using of DispatchBinType * Fixing building * Merging with master branch Co-authored-by: dmitry.razdoburdin <drazdobu@jfldaal005.jf.intel.com> Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2022-10-06 03:02:29 -08:00
Rong Ou	668b8a0ea4	[Breaking] Switch from rabit to the collective communicator (#8257 ) * Switch from rabit to the collective communicator * fix size_t specialization * really fix size_t * try again * add include * more include * fix lint errors * remove rabit includes * fix pylint error * return dict from communicator context * fix communicator shutdown * fix dask test * reset communicator mocklist * fix distributed tests * do not save device communicator * fix jvm gpu tests * add python test for federated communicator * Update gputreeshap submodule Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>	2022-10-05 14:39:01 -08:00
Rory Mitchell	d686bf52a6	Reduce time for some multi-gpu tests (#8288 ) * Faster dask tests * Reuse AllReducer objects in tests. * Faster boost from prediction tests. * Use rmm dask fixture. * Speed up dask demo. * mypy * Format with black. * mypy * Clang-tidy Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>	2022-10-04 02:49:33 -08:00

453 Commits