* Use pre-rounding based method to obtain reproducible floating point
summation.
* GPU Hist for regression and classification are bit-by-bit reproducible.
* Add doc.
* Switch to thrust reduce for `node_sum_gradient`.
* Remove `learning_rates`.
It's been deprecated since we have callback.
* Set `before_iteration` of `reset_learning_rate` to False to preserve
the initial learning rate, and comply to the term "reset".
Closes#4709.
* Tests for various `tree_method`.