xgboost

Author	SHA1	Message	Date
Rory Mitchell	8f77677193	Use quantised gradients in gpu_hist histograms (#8246 )	2022-09-26 17:35:35 +02:00
Jiaming Yuan	b5eb36f1af	Add `max_cat_threshold` to GPU and handle missing cat values. (#8212 )	2022-09-07 00:57:51 +08:00
Philip Hyunsu Cho	56395d120b	Work around MSVC behavior wrt constexpr capture (#8211 ) * Work around MSVC behavior wrt constexpr capture * Fix lint	2022-08-31 11:42:08 -08:00
Rory Mitchell	1703dc330f	Optimise histogram kernels (#8118 )	2022-08-18 14:07:26 +02:00
Rory Mitchell	1be09848a7	Refactor split valuation kernel (#8073 )	2022-07-21 15:41:50 +02:00
Jiaming Yuan	abaa593aa0	Fix compiler warnings. (#8059 ) - Remove unused parameters. - Avoid comparison of different signedness.	2022-07-14 05:29:56 +08:00
Rory Mitchell	0bdaca25ca	Use single precision in gain calculation, use pointers instead of span. (#8051 )	2022-07-12 21:56:27 +02:00
Rory Mitchell	794cbaa60a	Fuse split evaluation kernels (#8026 )	2022-07-05 10:24:31 +02:00
Rory Mitchell	bc4f802b17	Batch UpdatePosition using cudaMemcpy (#7964 )	2022-06-30 17:52:40 +02:00
Jiaming Yuan	142a208a90	Fix compiler warnings. (#8022 ) - Remove/fix unused parameters - Remove deprecated code in rabit. - Update dmlc-core.	2022-06-22 21:29:10 +08:00
Jiaming Yuan	1a33b50a0d	Fix compiler warnings. (#7974 ) - Remove unused parameters. There are still many warnings that are not yet addressed. Currently, the warnings in dmlc-core dominate the error log. - Remove `distributed` parameter from metric. - Fixes some warnings about signed comparison.	2022-06-06 22:56:25 +08:00
Rory Mitchell	71d3b2e036	Fuse gpu_hist all-reduce calls where possible (#7867 )	2022-05-17 13:27:50 +02:00
Rory Mitchell	7ef54e39ec	Small refactor to categoricals (#7858 )	2022-05-05 17:47:02 +02:00
Jiaming Yuan	317d7be6ee	Always use partition based categorical splits. (#7857 )	2022-05-03 22:30:32 +08:00
Jiaming Yuan	fdf533f2b9	[POC] Experimental support for l1 error. (#7812 ) Support adaptive tree, a feature supported by both sklearn and lightgbm. The tree leaf is recomputed based on residue of labels and predictions after construction. For l1 error, the optimal value is the median (50 percentile). This is marked as experimental support for the following reasons: - The value is not well defined for distributed training, where we might have empty leaves for local workers. Right now I just use the original leaf value for computing the average with other workers, which might cause significant errors. - Some follow-ups are required, for exact, pruner, and optimization for quantile function. Also, we need to calculate the initial estimation.	2022-04-26 21:41:55 +08:00
Jiaming Yuan	1d468e20a4	Optimize GPU evaluation function for categorical data. (#7705 ) * Use transform and cache.	2022-02-28 17:46:29 +08:00
Jiaming Yuan	d625dc2047	Work around nvcc error. (#7673 )	2022-02-19 01:41:46 +08:00
Jiaming Yuan	0d0abe1845	Support optimal partitioning for GPU hist. (#7652 ) * Implement `MaxCategory` in quantile. * Implement partition-based split for GPU evaluation. Currently, it's based on the existing evaluation function. * Extract an evaluator from GPU Hist to store the needed states. * Added some CUDA stream/event utilities. * Update document with references. * Fixed a bug in approx evaluator where the number of data points is less than the number of categories.	2022-02-15 03:03:12 +08:00
Ginko Balboa	29bfa94bb6	Fix external memory with gpu_hist and subsampling combination bug. (#7481 ) Instead of accessing data from the `original_page_`, access the data from the first page of the available batch. fix #7476 Co-authored-by: jiamingy <jm.yuan@outlook.com>	2021-12-24 11:15:35 +08:00
Jiaming Yuan	7f399eac8b	Use double for GPU Hist node sum. (#7507 )	2021-12-22 08:41:35 +08:00
Jiaming Yuan	ccdabe4512	Support building gradient index with cat data. (#7371 )	2021-11-03 22:37:37 +08:00
Jiaming Yuan	c311a8c1d8	Enable compiling with system cub. (#7232 ) - Tested with all CUDA 11.x. - Workaround cub scan by using discard iterator in AUC. - Limit the size of Argsort when compiled with CUDA cub.	2021-09-17 14:28:18 +08:00
Jiaming Yuan	3515931305	Initial support for external memory in gradient index. (#7183 ) * Add hessian to batch param in preparation of new approx impl. * Extract a push method for gradient index matrix. * Use span instead of vector ref for hessian in sketching. * Create a binary format for gradient index.	2021-09-13 12:40:56 +08:00
Jiaming Yuan	7a1d67f9cb	[breaking] Use integer atomic for GPU histogram. (#7180 ) On GPU we use a rounding factor to truncate the gradient for deterministic results. This PR changes the gradient representation to a fixed-point number with its exponent aligned with the rounding factor. [breaking] Drop non-deterministic histogram. Use fixed point for shared memory. This PR is to improve the performance of GPU Hist. Co-authored-by: Andy Adinets <aadinets@nvidia.com>	2021-08-28 05:17:05 +08:00
Jiaming Yuan	ee8d1f5ed8	Fix histogram truncation. (#7181 ) * Fix truncation. * Lint. * lint.	2021-08-24 18:34:32 -07:00
Jiaming Yuan	bf562bd33c	Remove unused code. (#7175 )	2021-08-18 14:02:19 +08:00
Robert Maynard	1a75f43304	Allow compilation with nvcc 11.4 (#7131 ) * Use type aliases for discard iterators * update to include host_vector as thrust 1.12 doesn't bring it in as a side-effect * cub::DispatchRadixSort requires signed offset types	2021-07-27 20:05:33 +08:00
Jiaming Yuan	bd1f3a38f0	Rewrite sparse dmatrix using callbacks. (#7092 ) - Reduce dependency on dmlc parsers and provide an interface for users to load data by themselves. - Remove use of threaded iterator and IO queue. - Remove `page_size`. - Make sure the number of pages in memory is bounded. - Make sure the cache can not be violated. - Provide an interface for internal algorithms to process data asynchronously.	2021-07-16 12:33:31 +08:00
Jiaming Yuan	1c8fdf2218	Remove use of `device_idx` in `dh::LaunchN`. (#7063 ) It's an unused parameter, removing it can make the CI log more readable.	2021-06-29 11:37:26 +08:00
ShvetsKS	57c732655e	Merge lossguide and depthwise strategies for CPU hist (#7007 ) * fix java/scala test: max depth is also valid parameter for lossguide Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>	2021-06-03 01:49:43 +08:00
Andrew Ziem	3e7e426b36	Fix spelling in documents (#6948 ) * Update roxygen2 doc. Co-authored-by: fis <jm.yuan@outlook.com>	2021-05-11 20:44:36 +08:00
Philip Hyunsu Cho	4230dcb614	Re-introduce double buffer in UpdatePosition, to fix perf regression in gpu_hist (#6757 ) * Revert "gpu_hist performance tweaks (#5707)" This reverts commit `f779980f7e`. * Address reviewer's comment * Fix build error	2021-03-18 13:56:10 -07:00
Igor Moura	d1254808d5	Clean up C++ warnings (#6213 )	2020-10-19 23:02:33 +08:00
Jiaming Yuan	bed7ae4083	Loop over `thrust::reduce`. (#6229 ) * Check input chunk size of dqdm. * Add doc for current limitation.	2020-10-14 10:40:56 +13:00
Jiaming Yuan	444131a2e6	Add categorical data support to GPU Hist. (#6164 )	2020-09-29 11:27:25 +08:00
Jiaming Yuan	2fcc4f2886	Unify evaluation functions. (#6037 )	2020-08-26 14:23:27 +08:00
Jiaming Yuan	80c8547147	Make binary bin search reusable. (#6058 ) * Move binary search row to hist util. * Remove dead code.	2020-08-26 05:05:11 +08:00
Jiaming Yuan	e4a273e1da	Fix evaluate root split. (#5948 )	2020-07-29 19:33:29 +08:00
Jiaming Yuan	a4de2f68e4	Use `cudaOccupancyMaxPotentialBlockSize` to calculate the block size. (#5926 )	2020-07-23 14:24:42 +08:00
Andy Adinets	ac3f0e78dc	Split Features into Groups to Compute Histograms in Shared Memory (#5795 )	2020-07-07 15:04:35 +12:00
Philip Hyunsu Cho	1d22a9be1c	Revert "Reorder includes. (#5749 )" (#5771 ) This reverts commit `d3a0efbf16`.	2020-06-09 10:29:28 -07:00
Jiaming Yuan	d3a0efbf16	Reorder includes. (#5749 ) * Reorder includes. * R.	2020-06-03 17:30:47 +12:00
Rory Mitchell	f779980f7e	gpu_hist performance tweaks (#5707 ) * Remove device vectors * Remove allreduce synchronize * Remove double buffer	2020-05-29 16:48:53 +12:00
Rory Mitchell	fcf57823b6	Reduce device synchronisation (#5631 ) * Reduce device synchronisation * Initialise pinned memory	2020-05-07 21:19:46 +12:00
Jiaming Yuan	eaf2a00b5c	Enhance nvtx support. (#5636 )	2020-05-06 22:54:24 +08:00
Rory Mitchell	b9649e7b8e	Refactor gpu_hist split evaluation (#5610 ) * Refactor * Rewrite evaluate splits * Add more tests	2020-04-30 08:58:12 +12:00
Andy Adinets	73142041b9	For histograms, opting into maximum shared memory available per block. (#5491 )	2020-04-21 14:56:42 +12:00
Rory Mitchell	b2827a80e1	Use non-synchronising scan (#5560 )	2020-04-20 15:51:34 +12:00
Rory Mitchell	d6d1035950	gpu_hist performance fixes (#5558 ) * Remove unnecessary cuda API calls * Fix histogram memory growth	2020-04-19 12:21:13 +12:00
Rory Mitchell	e268fb0093	Use thrust functions instead of custom functions (#5544 )	2020-04-16 21:41:16 +12:00

60 Commits