amdsc21
7ee4734d3a
rm device_helpers.hip.h from cu
2023-03-26 00:24:11 +01:00
amdsc21
595cd81251
add max shared mem workaround
2023-03-19 20:08:42 +01:00
amdsc21
4484c7f073
disable Optin Shared Mem
2023-03-15 02:10:16 +01:00
amdsc21
0ed5d3c849
finished histogram.cu
2023-03-09 21:28:37 +01:00
Jiaming Yuan
4d665b3fb0
Restore clang tidy test. ( #8861 )
2023-03-03 13:47:04 -08:00
Jiaming Yuan
594371e35b
Fix CPP lint. ( #8807 )
2023-02-15 20:16:35 +08:00
Jiaming Yuan
70c9b885ef
Extract floating point rounding routines. ( #8771 )
2023-02-12 04:26:41 +08:00
Jiaming Yuan
c6a8754c62
Define CUDA Context. ( #8604 )
...
We will transition to a non-default, non-blocking CUDA stream.
2022-12-20 15:15:07 +08:00
Rory Mitchell
210915c985
Use integer gradients in gpu_hist split evaluation ( #8274 )
2022-10-11 12:16:27 +02:00
Rong Ou
668b8a0ea4
[Breaking] Switch from rabit to the collective communicator ( #8257 )
...
* Switch from rabit to the collective communicator
* fix size_t specialization
* really fix size_t
* try again
* add include
* more include
* fix lint errors
* remove rabit includes
* fix pylint error
* return dict from communicator context
* fix communicator shutdown
* fix dask test
* reset communicator mocklist
* fix distributed tests
* do not save device communicator
* fix jvm gpu tests
* add python test for federated communicator
* Update gputreeshap submodule
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-10-05 14:39:01 -08:00
Rory Mitchell
8f77677193
Use quantised gradients in gpu_hist histograms ( #8246 )
2022-09-26 17:35:35 +02:00
Philip Hyunsu Cho
56395d120b
Work around MSVC behavior wrt constexpr capture ( #8211 )
...
* Work around MSVC behavior wrt constexpr capture
* Fix lint
2022-08-31 11:42:08 -08:00
Rory Mitchell
1703dc330f
Optimise histogram kernels ( #8118 )
2022-08-18 14:07:26 +02:00
Jiaming Yuan
142a208a90
Fix compiler warnings. ( #8022 )
...
- Remove/fix unused parameters
- Remove deprecated code in rabit.
- Update dmlc-core.
2022-06-22 21:29:10 +08:00
Rory Mitchell
71d3b2e036
Fuse gpu_hist all-reduce calls where possible ( #7867 )
2022-05-17 13:27:50 +02:00
Jiaming Yuan
7a1d67f9cb
[breaking] Use integer atomic for GPU histogram. ( #7180 )
...
On GPU we use a rounding factor to truncate the gradient for deterministic results. This PR changes the gradient representation to a fixed-point number whose exponent is aligned with the rounding factor.
[breaking] Drop non-deterministic histogram.
Use fixed point for shared memory.
This PR is to improve the performance of GPU Hist.
Co-authored-by: Andy Adinets <aadinets@nvidia.com>
2021-08-28 05:17:05 +08:00
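The fixed-point idea described in #7180 can be illustrated with a small host-side sketch. This is not the code from the PR: the helper names (`FixedPointScale`, `ToFixedPoint`, `FixedPointSum`), the 62-bit budget, and the sequential loop standing in for integer atomic adds are all assumptions made for illustration. The point it shows is that integer addition is associative, so the accumulated value does not depend on the order in which gradients arrive.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: a power-of-two scale chosen so that n values bounded by
// max_abs can be accumulated in an int64_t without overflow.
// Assumes moderate magnitudes (the scale itself must be a finite double).
inline double FixedPointScale(double max_abs, std::size_t n) {
  assert(max_abs >= 0.0 && n > 0);
  int exp = 0;
  // frexp sets exp so that n * max_abs < 2^exp (exp is 0 when the bound is 0).
  std::frexp(max_abs * static_cast<double>(n), &exp);
  // Reserve 62 bits for the magnitude of the running sum.
  return std::ldexp(1.0, 62 - exp);
}

// Quantise one value; the cast truncates toward zero.
inline std::int64_t ToFixedPoint(double x, double scale) {
  return static_cast<std::int64_t>(x * scale);
}

// Sum in the integer domain, then convert back to floating point once.
// On a GPU the `+=` would be an integer atomic add; here it is sequential.
inline double FixedPointSum(const std::vector<double>& values) {
  if (values.empty()) return 0.0;
  double max_abs = 0.0;
  for (double v : values) max_abs = std::fmax(max_abs, std::fabs(v));
  const double scale = FixedPointScale(max_abs, values.size());
  std::int64_t acc = 0;
  for (double v : values) acc += ToFixedPoint(v, scale);
  return static_cast<double>(acc) / scale;
}
```

Calling `FixedPointSum` on a vector and on the same vector reversed should return bit-identical doubles, at the cost of a small quantisation error per element.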
Andrew Ziem
3e7e426b36
Fix spelling in documents ( #6948 )
...
* Update roxygen2 doc.
Co-authored-by: fis <jm.yuan@outlook.com>
2021-05-11 20:44:36 +08:00
Jiaming Yuan
bed7ae4083
Loop over thrust::reduce. ( #6229 )
...
* Check input chunk size of dqdm.
* Add doc for current limitation.
2020-10-14 10:40:56 +13:00
Jiaming Yuan
80c8547147
Make binary bin search reusable. ( #6058 )
...
* Move binary search row to hist util.
* Remove dead code.
2020-08-26 05:05:11 +08:00
Jiaming Yuan
a4de2f68e4
Use cudaOccupancyMaxPotentialBlockSize to calculate the block size. ( #5926 )
2020-07-23 14:24:42 +08:00
Andy Adinets
ac3f0e78dc
Split Features into Groups to Compute Histograms in Shared Memory ( #5795 )
2020-07-07 15:04:35 +12:00
Philip Hyunsu Cho
1d22a9be1c
Revert "Reorder includes. ( #5749 )" ( #5771 )
...
This reverts commit d3a0efbf162f3dceaaf684109e1178c150b32de3.
2020-06-09 10:29:28 -07:00
Jiaming Yuan
d3a0efbf16
Reorder includes. ( #5749 )
...
* Reorder includes.
* R.
2020-06-03 17:30:47 +12:00
Andy Adinets
73142041b9
For histograms, opting into maximum shared memory available per block. ( #5491 )
2020-04-21 14:56:42 +12:00
Rory Mitchell
3ad4333b0e
Partial rewrite EllpackPage ( #5352 )
2020-03-11 10:15:53 +13:00
Jiaming Yuan
8d06878bf9
Deterministic GPU histogram. ( #5361 )
...
* Use a pre-rounding based method to obtain reproducible floating-point summation.
* GPU Hist for regression and classification is bit-by-bit reproducible.
* Add doc.
* Switch to thrust reduce for `node_sum_gradient`.
2020-03-04 15:13:28 +08:00
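A minimal sketch of a pre-rounding summation, as named in #5361, is given below. It is an illustration under stated assumptions, not the routine from the PR: the helper names (`MakeRoundingFactor`, `TruncateWithFactor`, `PreRoundedSum`) are hypothetical, and the factor is deliberately conservative (a power of two above 2 * n * max_abs) so that every partial sum of the truncated values is exactly representable. It also assumes strict IEEE-754 double arithmetic, so it should not be built with flags such as `-ffast-math`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: a power of two strictly greater than 2 * n * max_abs.
inline double MakeRoundingFactor(double max_abs, std::size_t n) {
  assert(max_abs >= 0.0 && n > 0);
  int exp = 0;
  // frexp sets exp so that 2 * n * max_abs < 2^exp (exp is 0 when the bound is 0).
  std::frexp(2.0 * max_abs * static_cast<double>(n), &exp);
  return std::ldexp(1.0, exp);
}

// Discard the low-order bits of x that do not survive being added to the factor.
// Relies on the compiler not simplifying (f + x) - f to x.
inline double TruncateWithFactor(double rounding_factor, double x) {
  return (rounding_factor + x) - rounding_factor;
}

// Truncate every value against a shared factor, then sum. After truncation the
// additions incur no rounding error, so any summation order gives the same bits.
inline double PreRoundedSum(const std::vector<double>& values) {
  if (values.empty()) return 0.0;
  double max_abs = 0.0;
  for (double v : values) max_abs = std::fmax(max_abs, std::fabs(v));
  const double factor = MakeRoundingFactor(max_abs, values.size());
  double acc = 0.0;
  for (double v : values) acc += TruncateWithFactor(factor, v);
  return acc;
}
```

The trade-off is a small loss of precision per element (bounded by half a unit in the last place of the rounding factor) in exchange for a result that is independent of summation order.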