* Launch all-reduce sequentially.
* Fix gpu_exact test memory leak.
* Use Span in GPU exact updater.
* Add a small test.