xgboost

Author	SHA1	Message	Date
amdsc21	f0b8c02f15	merge latest changes	2023-03-10 22:10:20 +01:00
Jiaming Yuan	2aa838c75e	Define multi-strategy parameter. (#8890 )	2023-03-11 02:58:01 +08:00
amdsc21	f0febfbcac	finish gpu_predictor.cu	2023-03-10 01:29:54 +01:00
Jiaming Yuan	5feee8d4a9	Define core multi-target regression tree structure. (#8884 ) - Define a new tree struct embedded in the `RegTree`. - Provide dispatching functions in `RegTree`. - Fix some c++-17 warnings about the use of nodiscard (currently we disable the warning on the CI). - Use uint32_t instead of size_t for `bst_target_t` as it has a defined size and can be used as part of dmlc parameter. - Hide the `Segment` struct inside the categorical split matrix.	2023-03-09 19:03:06 +08:00
amdsc21	ed45aa2816	Merge branch 'master' into dev-hui	2023-03-08 00:39:33 +01:00
amdsc21	c51a1c9aae	rename hip.cc to hip	2023-03-07 05:39:53 +01:00
amdsc21	6039a71e6c	add hip structure	2023-03-07 02:17:19 +01:00
Mauro Leggieri	90c0633a28	Fixes compilation errors on MSVC x86 targets (#8823 )	2023-02-26 03:20:28 +08:00
Rong Ou	a65ad0bd9c	Support column split in histogram builder (#8811 )	2023-02-17 22:37:01 +08:00
Jiaming Yuan	594371e35b	Fix CPP lint. (#8807 )	2023-02-15 20:16:35 +08:00
Jiaming Yuan	d11a0044cf	Generalize prediction cache. (#8783 ) * Extract most of the functionality into `DMatrixCache`. * Move API entry to independent file to reduce dependency on `predictor.h` file. * Add test. --------- Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2023-02-13 12:36:43 +08:00
Jiaming Yuan	34eee56256	Fix compiler warnings. (#8703 ) Fix warnings about signed/unsigned comparisons.	2023-01-21 15:16:23 +08:00
Rong Ou	78396f8a6e	Initial support for column-split cpu predictor (#8676 )	2023-01-18 06:33:13 +08:00
Jiaming Yuan	beefd28471	Split up SHAP from `RegTree`. (#8612 ) * Split up SHAP from `RegTree`. Simplify the tree interface.	2023-01-04 18:17:47 +08:00
Jiaming Yuan	43a647a4dd	Fix inference with categorical feature. (#8591 )	2022-12-15 17:57:26 +08:00
Jiaming Yuan	3e26107a9c	Rename and extract `Context`. (#8528 ) * Rename `GenericParameter` to `Context`. * Rename header file to reflect the change. * Rename all references.	2022-12-07 04:58:54 +08:00
Jiaming Yuan	fffb1fca52	Calculate `base_score` based on input labels for mae. (#8107 ) Fit an intercept as base score for abs loss.	2022-09-20 20:53:54 +08:00
Jiaming Yuan	2c70751d1e	Implement iterative DMatrix for CPU. (#8116 )	2022-07-26 22:34:21 +08:00
Jiaming Yuan	142a208a90	Fix compiler warnings. (#8022 ) - Remove/fix unused parameters - Remove deprecated code in rabit. - Update dmlc-core.	2022-06-22 21:29:10 +08:00
Jiaming Yuan	765097d514	Simplify inplace-predict. (#7910 ) Pass the `X` as part of Proxy DMatrix instead of an independent `dmlc::any`.	2022-05-18 17:52:00 +08:00
Jiaming Yuan	b52c4e13b0	[dask] Fix empty partition with pandas input. (#7644 ) Empty partition is different from empty dataset. For the former case, each worker has non-empty dask collections, but each collection might contain empty partition.	2022-02-14 19:35:51 +08:00
Jiaming Yuan	2775c2a1ab	Prepare external memory support for hist. (#7638 ) This PR prepares the GHistIndexMatrix to host the column matrix which is used by the hist tree method by accepting sparse_threshold parameter. Some cleanups are made to ensure the correct batch param is being passed into DMatrix along with some additional tests for correctness of SimpleDMatrix.	2022-02-10 16:58:02 +08:00
Jiaming Yuan	e5e47c3c99	Clarify the behavior of invalid categorical value handling. (#7529 )	2022-01-13 16:11:52 +08:00
Jiaming Yuan	68cdbc9c16	Remove `omp_get_max_threads` in CPU predictor. (#7519 ) This is part of the ongoing effort to remove the dependency on global omp variables.	2022-01-04 22:12:15 +08:00
Jiaming Yuan	557ffc4bf5	Reduce base margin to 2 dim for now. (#7455 )	2021-11-27 00:46:13 +08:00
Jiaming Yuan	d33854af1b	[Breaking] Accept multi-dim meta info. (#7405 ) This PR changes base_margin into a 3-dim array, with one of them being reserved for multi-target classification. Also, a breaking change is made for binary serialization due to extra dimension along with a fix for saving the feature weights. Lastly, it unifies the prediction initialization between CPU and GPU. After this PR, the meta info setter in Python will be based on array interface.	2021-11-18 23:02:54 +08:00
Jiaming Yuan	a13321148a	Support multi-class with base margin. (#7381 ) This is already partially supported but never properly tested. So the only possible way to use it is calling `numpy.ndarray.flatten` with `base_margin` before passing it into XGBoost. This PR adds proper support for most of the data types along with tests.	2021-11-02 13:38:00 +08:00
Jiaming Yuan	d8a549e6ac	Avoid thread block with sparse data. (#7255 )	2021-09-25 13:11:34 +08:00
Robert Maynard	1a75f43304	Allow compilation with nvcc 11.4 (#7131 ) * Use type aliases for discard iterators * update to include host_vector as thrust 1.12 doesn't bring it in as a side-effect * cub::DispatchRadixSort requires signed offset types	2021-07-27 20:05:33 +08:00
Jiaming Yuan	1c8fdf2218	Remove use of `device_idx` in `dh::LaunchN`. (#7063 ) It's an unused parameter, removing it can make the CI log more readable.	2021-06-29 11:37:26 +08:00
Jiaming Yuan	8fa32fdda2	Implement categorical data support for SHAP. (#7053 ) * Add CPU implementation. * Update GPUTreeSHAP. * Add GPU implementation by defining custom split condition.	2021-06-25 19:02:46 +08:00
Jiaming Yuan	bbfffb444d	Fix race condition in CPU shap. (#7050 )	2021-06-21 10:03:15 +08:00
Jiaming Yuan	29f8fd6fee	Support categorical split in tree model dump. (#7036 )	2021-06-18 16:46:20 +08:00
Jiaming Yuan	86715e4cd4	Support categorical data for dask functional interface and DQM. (#7043 ) * Support categorical data for dask functional interface and DQM. * Implement categorical data support for GPU GK-merge. * Add support for dask functional interface. * Add support for DQM. * Get newer cupy.	2021-06-18 13:06:52 +08:00
Jiaming Yuan	f79cc4a7a4	Implement categorical prediction for CPU and GPU predict leaf. (#7001 ) * Categorical prediction with CPU predictor and GPU predict leaf. * Implement categorical prediction for CPU prediction. * Implement categorical prediction for GPU predict leaf. * Refactor the prediction functions to have a unified get next node function. Co-authored-by: Shvets Kirill <kirill.shvets@intel.com>	2021-06-11 10:11:45 +08:00
Jiaming Yuan	a59c7323b4	Fix inplace predict missing value. (#6787 )	2021-03-27 05:36:10 +08:00
Louis Desreumaux	9b530e5697	Improve OpenMP exception handling (#6680 )	2021-02-25 13:56:16 +08:00
ShvetsKS	9f15b9e322	Optimize CPU prediction (#6696 ) Co-authored-by: Shvets Kirill <kirill.shvets@intel.com>	2021-02-16 14:41:22 +08:00
Jiaming Yuan	e8c5c53e2f	Use `Predictor` for `dart`. (#6693 ) * Use normal predictor for dart booster. * Implement `inplace_predict` for dart. * Enable `dart` for dask interface now that it's thread-safe. * categorical data should be working out of box for dart now. The implementation is not very efficient as it has to pull back the data and apply weight for each tree, but still a significant improvement over previous implementation as now we no longer binary search for each sample. * Fix output prediction shape on dataframe.	2021-02-09 23:30:19 +08:00
Jiaming Yuan	4656b09d5d	[breaking] Add prediction function for DMatrix and use inplace predict for dask. (#6668 ) * Add a new API function for predicting on `DMatrix`. This function aligns with rest of the `XGBoosterPredictFrom` functions on semantic of function arguments. Purge `ntree_limit` from libxgboost, use iteration instead. * [dask] Use `inplace_predict` by default for dask sklearn models. * [dask] Run prediction shape inference on worker instead of client. The breaking change is in the Python sklearn `apply` function, I made it to be consistent with other prediction functions where `best_iteration` is used by default.	2021-02-08 18:26:32 +08:00
Jiaming Yuan	411592a347	Enhance inplace prediction. (#6653 ) * Accept array interface for csr and array. * Accept an optional proxy dmatrix for metainfo. This constructs an explicit `_ProxyDMatrix` type in Python. * Remove unused doc. * Add strict output.	2021-02-02 11:41:46 +08:00
Jiaming Yuan	c3c8e66fc9	Make prediction functions thread safe. (#6648 )	2021-01-28 23:29:43 +08:00
Jiaming Yuan	f2f7dd87b8	Use view for `SparsePage` exclusively. (#6590 )	2021-01-11 18:04:55 +08:00
Philip Hyunsu Cho	c31e3efa7c	Pass correct split_type to GPU predictor (#6491 ) * Pass correct split_type to GPU predictor * Add a test	2020-12-11 19:30:00 -08:00
Honza Sterba	b0036b339b	Optionally fail when gpu_id is set to invalid value (#6342 )	2020-11-28 15:14:12 +08:00
Jiaming Yuan	8a17610666	Implement GPU predict leaf. (#6187 )	2020-11-11 17:33:47 +08:00
ShvetsKS	d411f98d26	simple fix for static schedule in predict (#6357 ) Co-authored-by: ShvetsKS <kirill.shvets@intel.com>	2020-11-09 17:01:30 +08:00
Igor Moura	5e1e972aea	Clean up warnings (#6325 )	2020-10-30 23:50:29 +08:00
Jiaming Yuan	c4da967b5c	Support unity build. (#6295 ) * Support unity build. * Setup on Windows Jenkins. * Revert "Setup on Windows Jenkins." This reverts commit 8345cb8d2b009eec8ae9fa6f16412a7c9b6ec12c.	2020-10-28 11:49:28 -07:00
Rory Mitchell	f0c3ff313f	Update GPUTreeShap, add docs (#6281 ) * Update GPUTreeShap, add docs * Fix test Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-10-27 18:22:12 +13:00

133 Commits