xgboost

Author	SHA1	Message	Date
Jiaming Yuan	271f4a80e7	Use CUDA virtual memory for pinned memory allocation. (#10850 ) - Add a grow-only virtual memory allocator. - Define a driver API wrapper. Split up the runtime API wrapper.	2024-09-28 04:26:44 +08:00
Jiaming Yuan	e228c1a121	[EM] Make page concatenation optional. (#10826 ) This PR introduces a new parameter `extmem_concat_pages` to make the page concatenation optional for GPU hist. In addition, the document is updated for the new GPU-based external memory.	2024-09-24 06:19:28 +08:00
Jiaming Yuan	5f7f31d464	[EM] Refactor ellpack construction. (#10810 ) - Remove the calculation of n_symbols in the accessor. - Pack initialization steps into the parameter list. - Pass the context into various ctors. - Specialization for dense data to prepare for further compression.	2024-09-09 14:10:10 +08:00
Jiaming Yuan	142bdc73ec	[EM] Support SHAP contribution with QDM. (#10724 ) - Add GPU support. - Add external memory support. - Update the GPU tree shap.	2024-08-22 05:25:10 +08:00
Jiaming Yuan	2ecc85ffad	[EM] Support ExtMemQdm in the GPU predictor. (#10694 )	2024-08-13 12:21:11 +08:00
Jiaming Yuan	292bb677e5	[EM] Support mmap backed ellpack. (#10602 ) - Support resource view in ellpack. - Define the CUDA version of MMAP resource. - Define the CUDA version of malloc resource. - Refactor cuda runtime API wrappers, and add memory access related wrappers. - gather windows macros into a single header.	2024-07-18 08:20:21 +08:00
Jiaming Yuan	a5a58102e5	Revamp the rabit implementation. (#10112 ) This PR replaces the original RABIT implementation with a new one, which has already been partially merged into XGBoost. The new one features: - Federated learning for both CPU and GPU. - NCCL. - More data types. - A unified interface for all the underlying implementations. - Improved timeout handling for both tracker and workers. - Exhaustive tests with metrics (fixed a couple of bugs along the way). - A reusable tracker for Python and JVM packages.	2024-05-20 11:56:23 +08:00
Jiaming Yuan	53fc17578f	Use `std::uint64_t` for row index. (#10120 ) - Use std::uint64_t instead of size_t to avoid implementation-defined type. - Rename to bst_idx_t, to account for other types of indexing. - Small cleanup to the base header.	2024-03-15 18:43:49 +08:00
Louis Desreumaux	edf501d227	Implement contribution prediction with QuantileDMatrix (#10043 ) --------- Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2024-02-19 21:03:29 +08:00
Jiaming Yuan	06bdc15e9b	[coll] Pass context to various functions. (#9772 ) * [coll] Pass context to various functions. In the future, the `Context` object would be required for collective operations, this PR passes the context object to some required functions to prepare for swapping out the implementation.	2023-11-08 09:54:05 +08:00
Jiaming Yuan	8c676c889d	Remove internal use of gpu_id. (#9568 )	2023-09-20 23:29:51 +08:00
Rong Ou	d8c3cc92ae	More support for column split in gpu predictor (#9562 )	2023-09-14 08:13:13 +08:00
Jiaming Yuan	f05294a6f2	Fix clang warnings. (#9447 ) - static function in header. (which is marked as unused due to translation unit visibility). - Implicit copy operator is deprecated. - Unused lambda capture. - Moving a temporary variable prevents copy elision.	2023-08-09 15:34:45 +08:00
Jiaming Yuan	645037e376	Improve test coverage with predictor configuration. (#9354 ) * Improve test coverage with predictor configuration. - Test with ext memory. - Test with QDM. - Test with dart.	2023-07-05 15:17:22 +08:00
Rong Ou	3a0f787703	Support column split in GPU predictor (#9343 )	2023-07-03 04:05:34 +08:00
Jiaming Yuan	54da4b3185	Cleanup to prepare for using mmap pointer in external memory. (#9317 ) - Update SparseDMatrix comment. - Use a pointer in the bitfield. We will replace the `std::vector<bool>` in `ColumnMatrix` with bitfield. - Clean up the page source. The timer is removed as it's inaccurate once we swap the mmap pointer into the page.	2023-06-22 06:43:11 +08:00
Rong Ou	962a20693f	More support for column split in cpu predictor (#9244 ) - Added column split support to `PredictInstance` and `PredictLeaf`. - Refactoring of tests.	2023-06-05 08:05:38 +08:00
Jiaming Yuan	08ce495b5d	Use Booster context in DMatrix. (#8896 ) - Pass context from booster to DMatrix. - Use context instead of integer for `n_threads`. - Check the consistency configuration for `max_bin`. - Test for all combinations of initialization options.	2023-04-28 21:47:14 +08:00
Jiaming Yuan	acc110c251	[MT-TREE] Support prediction cache and model slicing. (#8968 ) - Fix prediction range. - Support prediction cache in mt-hist. - Support model slicing. - Make the booster a Python iterable by defining `__iter__`. - Cleanup removed/deprecated parameters. - A new field in the output model `iteration_indptr` for pointing to the ranges of trees for each iteration.	2023-03-27 23:10:54 +08:00
Jiaming Yuan	151882dd26	Initial support for multi-target tree. (#8616 ) * Implement multi-target for hist. - Add new hist tree builder. - Move data fetchers for tests. - Dispatch function calls in gbm based on the tree type.	2023-03-22 23:49:56 +08:00
Jiaming Yuan	36a7396658	Replace dmlc any with std any. (#8892 )	2023-03-11 06:11:04 +08:00
Jiaming Yuan	5feee8d4a9	Define core multi-target regression tree structure. (#8884 ) - Define a new tree struct embedded in the `RegTree`. - Provide dispatching functions in `RegTree`. - Fix some c++-17 warnings about the use of nodiscard (currently we disable the warning on the CI). - Use uint32_t instead of size_t for `bst_target_t` as it has a defined size and can be used as part of dmlc parameter. - Hide the `Segment` struct inside the categorical split matrix.	2023-03-09 19:03:06 +08:00
Jiaming Yuan	3e26107a9c	Rename and extract `Context`. (#8528 ) * Rename `GenericParameter` to `Context`. * Rename header file to reflect the change. * Rename all references.	2022-12-07 04:58:54 +08:00
Jiaming Yuan	fffb1fca52	Calculate `base_score` based on input labels for mae. (#8107 ) Fit an intercept as base score for abs loss.	2022-09-20 20:53:54 +08:00
Jiaming Yuan	2c70751d1e	Implement iterative DMatrix for CPU. (#8116 )	2022-07-26 22:34:21 +08:00
Jiaming Yuan	765097d514	Simplify inplace-predict. (#7910 ) Pass the `X` as part of Proxy DMatrix instead of an independent `dmlc::any`.	2022-05-18 17:52:00 +08:00
Jiaming Yuan	2775c2a1ab	Prepare external memory support for hist. (#7638 ) This PR prepares the GHistIndexMatrix to host the column matrix which is used by the hist tree method by accepting sparse_threshold parameter. Some cleanups are made to ensure the correct batch param is being passed into DMatrix along with some additional tests for correctness of SimpleDMatrix.	2022-02-10 16:58:02 +08:00
Jiaming Yuan	68cdbc9c16	Remove `omp_get_max_threads` in CPU predictor. (#7519 ) This is part of the ongoing effort to remove the dependency on global omp variables.	2022-01-04 22:12:15 +08:00
Jiaming Yuan	d33854af1b	[Breaking] Accept multi-dim meta info. (#7405 ) This PR changes base_margin into a 3-dim array, with one of them being reserved for multi-target classification. Also, a breaking change is made for binary serialization due to extra dimension along with a fix for saving the feature weights. Lastly, it unifies the prediction initialization between CPU and GPU. After this PR, the meta info setter in Python will be based on array interface.	2021-11-18 23:02:54 +08:00
Jiaming Yuan	a13321148a	Support multi-class with base margin. (#7381 ) This is already partially supported but never properly tested. So the only possible way to use it is calling `numpy.ndarray.flatten` with `base_margin` before passing it into XGBoost. This PR adds proper support for most of the data types along with tests.	2021-11-02 13:38:00 +08:00
Robert Maynard	1a75f43304	Allow compilation with nvcc 11.4 (#7131 ) * Use type aliases for discard iterators * update to include host_vector as thrust 1.12 doesn't bring it in as a side-effect * cub::DispatchRadixSort requires signed offset types	2021-07-27 20:05:33 +08:00
Jiaming Yuan	1c8fdf2218	Remove use of `device_idx` in `dh::LaunchN`. (#7063 ) It's an unused parameter, removing it can make the CI log more readable.	2021-06-29 11:37:26 +08:00
Jiaming Yuan	8fa32fdda2	Implement categorical data support for SHAP. (#7053 ) * Add CPU implementation. * Update GPUTreeSHAP. * Add GPU implementation by defining custom split condition.	2021-06-25 19:02:46 +08:00
Jiaming Yuan	bbfffb444d	Fix race condition in CPU shap. (#7050 )	2021-06-21 10:03:15 +08:00
Jiaming Yuan	86715e4cd4	Support categorical data for dask functional interface and DQM. (#7043 ) * Support categorical data for dask functional interface and DQM. * Implement categorical data support for GPU GK-merge. * Add support for dask functional interface. * Add support for DQM. * Get newer cupy.	2021-06-18 13:06:52 +08:00
Jiaming Yuan	f79cc4a7a4	Implement categorical prediction for CPU and GPU predict leaf. (#7001 ) * Categorical prediction with CPU predictor and GPU predict leaf. * Implement categorical prediction for CPU prediction. * Implement categorical prediction for GPU predict leaf. * Refactor the prediction functions to have a unified get next node function. Co-authored-by: Shvets Kirill <kirill.shvets@intel.com>	2021-06-11 10:11:45 +08:00
Jiaming Yuan	a59c7323b4	Fix inplace predict missing value. (#6787 )	2021-03-27 05:36:10 +08:00
Jiaming Yuan	e8c5c53e2f	Use `Predictor` for `dart`. (#6693 ) * Use normal predictor for dart booster. * Implement `inplace_predict` for dart. * Enable `dart` for dask interface now that it's thread-safe. * categorical data should be working out of box for dart now. The implementation is not very efficient as it has to pull back the data and apply weight for each tree, but still a significant improvement over previous implementation as now we no longer binary search for each sample. * Fix output prediction shape on dataframe.	2021-02-09 23:30:19 +08:00
Jiaming Yuan	4656b09d5d	[breaking] Add prediction function for DMatrix and use inplace predict for dask. (#6668 ) * Add a new API function for predicting on `DMatrix`. This function aligns with rest of the `XGBoosterPredictFrom` functions on semantic of function arguments. Purge `ntree_limit` from libxgboost, use iteration instead. * [dask] Use `inplace_predict` by default for dask sklearn models. * [dask] Run prediction shape inference on worker instead of client. The breaking change is in the Python sklearn `apply` function, I made it to be consistent with other prediction functions where `best_iteration` is used by default.	2021-02-08 18:26:32 +08:00
Jiaming Yuan	411592a347	Enhance inplace prediction. (#6653 ) * Accept array interface for csr and array. * Accept an optional proxy dmatrix for metainfo. This constructs an explicit `_ProxyDMatrix` type in Python. * Remove unused doc. * Add strict output.	2021-02-02 11:41:46 +08:00
Jiaming Yuan	c3c8e66fc9	Make prediction functions thread safe. (#6648 )	2021-01-28 23:29:43 +08:00
Philip Hyunsu Cho	c31e3efa7c	Pass correct split_type to GPU predictor (#6491 ) * Pass correct split_type to GPU predictor * Add a test	2020-12-11 19:30:00 -08:00
Honza Sterba	b0036b339b	Optionally fail when gpu_id is set to invalid value (#6342 )	2020-11-28 15:14:12 +08:00
Jiaming Yuan	8a17610666	Implement GPU predict leaf. (#6187 )	2020-11-11 17:33:47 +08:00
Rory Mitchell	f0c3ff313f	Update GPUTreeShap, add docs (#6281 ) * Update GPUTreeShap, add docs * Fix test Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-10-27 18:22:12 +13:00
Igor Moura	d1254808d5	Clean up C++ warnings (#6213 )	2020-10-19 23:02:33 +08:00
Jiaming Yuan	798af22ff4	Add categorical data support to GPU predictor. (#6165 )	2020-09-29 11:25:34 +08:00
Jiaming Yuan	52c0b3f100	Fix error message. (#6176 )	2020-09-29 11:18:25 +08:00
Rory Mitchell	dda9e1e487	Update GPUTreeshap (#6163 ) * Reduce shap test duration * Test interoperability with shap package * Add feature interactions * Update GPUTreeShap	2020-09-28 09:43:47 +13:00
Jiaming Yuan	c6f2b8c841	Upgrade gputreeshap. (#6099 ) * Upgrade gputreeshap. Co-authored-by: Rory Mitchell <r.a.mitchell.nz@gmail.com>	2020-09-15 12:57:22 +12:00

117 Commits