* Use pre-rounding based method to obtain reproducible floating point
summation.
* GPU Hist for regression and classification are bit-by-bit reproducible.
* Add doc.
* Switch to thrust reduce for `node_sum_gradient`.
This PR fixes tree weights in dart being ignored when computing contributions.
* Fix ellpack page source link.
* Add tree weights to compute contribution.