Jiaming Yuan 17fd3f55e9
Optimize adapter element counting on GPU. (#9209)
- Implement a simple `IterSpan` for passing iterators with size.
- Use shared memory for column size counts.
- Use one thread for each sample in row count to reduce atomic operations.
2023-05-30 23:28:43 +08:00
..