* Move thread local entry into Learner.
This is an attempt to workaround CUDA context issue in static variable, where
the CUDA context can be released before device vector.
* Add PredictionEntry to thread local entry.
This eliminates one copy of prediction vector.
* Don't define CUDA C API in a namespace.
* - create a gpu metrics (internal) registry
- the objective is to separate the cpu and gpu implementations such that they evolve
indepedently. to that end, this approach will:
- preserve the same metrics configuration (from the end user perspective)
- internally delegate the responsibility to the gpu metrics builder when there is a
valid device present
- decouple the gpu metrics builder from the cpu ones to prevent misuse
- move away from including the cuda file from within the cc file and segregate the code
via ifdef's
* Use pre-rounding based method to obtain reproducible floating point
summation.
* GPU Hist for regression and classification are bit-by-bit reproducible.
* Add doc.
* Switch to thrust reduce for `node_sum_gradient`.
* Add release note for 1.0.0
* Fix a small bug in the Python script that compiles the list of contributors
* Clarify governance of CI infrastructure; now PMC is formally in charge
* Address reviewer comment
* Fix typo
- move segment sorter to common
- this is the first of a handful of pr's that splits the larger pr #5326
- it moves this facility to common (from ranking objective class), so that it can be
used for metric computation
- it also wraps all the bald device pointers into span.
* Remove f-string, since it's not supported by Python 3.5 (#5330)
* Remove f-string, since it's not supported by Python 3.5
* Add Python 3.5 to CI, to ensure compatibility
* Remove duplicated matplotlib
* Show deprecation notice for Python 3.5
* Fix lint
* Fix lint
* Fix a unit test that mistook MINOR ver for PATCH ver
* Enforce only major version in JSON model schema
* Bump version to 1.1.0-SNAPSHOT
* Added a check call macro in jvm package, prevents executing other functions
from jvm when error occurred in XGBoost. For example, when prediction fails jvm
should not try to allocate memory based on the output prediction size.
Move this function into gbtree, and uses only updater for doing so. As now the predictor knows exactly how many trees to predict, there's no need for it to update the prediction cache.