Refactor configuration [Part II]. (#4577)

* Refactor configuration [Part II].

* General changes:
** Remove `Init` methods to avoid ambiguity.
** Remove `Configure(std::map<>)` to avoid redundant copying and to prepare for
   parameter validation. (A `std::vector` of unknown parameters is returned
   from `InitAllowUnknown` instead.)
** Add name to tree updaters for easier debugging.
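The `InitAllowUnknown` idea mentioned above can be sketched as follows. This is a simplified illustration, not the actual `dmlc::Parameter` implementation; `ToySketchParam` and its single `gpu_id` field are hypothetical stand-ins:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Sketch: known parameters are consumed in place, and unknown key/value
// pairs are returned to the caller for validation or forwarding, instead
// of copying a whole std::map around.  Names here are illustrative only.
using Args = std::vector<std::pair<std::string, std::string>>;

struct ToySketchParam {
  int gpu_id{-1};

  Args InitAllowUnknown(Args const& kwargs) {
    Args unknown;
    for (auto const& kv : kwargs) {
      if (kv.first == "gpu_id") {
        gpu_id = std::stoi(kv.second);
      } else {
        unknown.push_back(kv);  // hand back what we don't recognise
      }
    }
    return unknown;
  }
};
```

Returning the leftovers is what makes later parameter validation possible: a non-empty vector can be turned into an "unknown parameter" warning or error.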

* Learner changes:
** Make `LearnerImpl` the only source of configuration.

    All configuration is stored in `LearnerImpl` and applied by
    `LearnerImpl::Configure()`.

** Remove booster in C API.

    Originally kept for a "compatibility reason", but the comment never stated
    what that reason was, so it is simply removed here.

** Add a `metric_names_` field in `LearnerImpl`.
** Remove `LazyInit`.  Configuration will always be lazy.
** Run `Configure` before every iteration.
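The lazy-configuration behaviour described in the two bullets above can be sketched as follows. This is a minimal illustration, not the real `LearnerImpl`; `LazyLearner`, `need_configuration_`, and `configure_calls()` are hypothetical names:

```cpp
#include <map>
#include <string>

// Sketch of always-lazy configuration: Configure() runs at the start of
// every iteration, but is a cheap no-op unless a parameter changed since
// the last call.  All names below are illustrative only.
class LazyLearner {
 public:
  void SetParam(std::string const& key, std::string const& value) {
    cfg_[key] = value;
    need_configuration_ = true;  // any change invalidates the configuration
  }
  void UpdateOneIter() {
    this->Configure();           // runs before every iteration
    ++iter_;
  }
  int configure_calls() const { return configure_calls_; }

 private:
  void Configure() {
    if (!need_configuration_) { return; }  // already up to date
    // ... apply cfg_ to objective, gbm, metrics ...
    need_configuration_ = false;
    ++configure_calls_;
  }
  std::map<std::string, std::string> cfg_;
  bool need_configuration_{true};
  int iter_{0};
  int configure_calls_{0};
};
```

Because the dirty flag gates the work, calling `Configure` unconditionally each iteration stays cheap while removing the need for an explicit `LazyInit` entry point.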

* Predictor changes:
** Allocate both the CPU and GPU predictors.
** Remove cpu_predictor from gpu_predictor.

    `GBTree` is now used to dispatch the predictor.

** Remove some GPU Predictor tests.
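The dispatch described above can be sketched as follows. This is a simplified illustration, assuming both predictors are always allocated and selection happens per call; `GBTreeSketch`, the toy `Predictor` type, and the `gpu_id >= 0` rule are illustrative assumptions, not the actual `GBTree` code:

```cpp
#include <memory>
#include <string>
#include <utility>

// Sketch: GBTree owns both predictors and dispatches between them, so
// gpu_predictor no longer needs to hold a cpu_predictor of its own.
struct GenericParameter {
  int gpu_id{-1};  // -1: no GPU configured
};

struct Predictor {
  explicit Predictor(std::string name) : name_(std::move(name)) {}
  std::string const& Name() const { return name_; }
 private:
  std::string name_;
};

class GBTreeSketch {
 public:
  explicit GBTreeSketch(GenericParameter const* p)
      : generic_param_{p},
        cpu_predictor_{std::make_unique<Predictor>("cpu_predictor")},
        gpu_predictor_{std::make_unique<Predictor>("gpu_predictor")} {}

  // Both predictors exist up front; dispatch picks one at call time.
  Predictor* GetPredictor() const {
    return generic_param_->gpu_id >= 0 ? gpu_predictor_.get()
                                       : cpu_predictor_.get();
  }

 private:
  GenericParameter const* generic_param_;
  std::unique_ptr<Predictor> cpu_predictor_;
  std::unique_ptr<Predictor> gpu_predictor_;
};
```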

* IO

No IO changes.  Binary model format stability is tested by comparing hash
values of saved models between two commits.
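The stability check described above amounts to hashing the serialized model bytes from each commit and comparing. A minimal sketch, assuming the model is available as an in-memory byte string and using `std::hash` as a stand-in for whatever digest the actual test uses:

```cpp
#include <functional>
#include <string>

// Sketch: two binary model dumps are format-stable iff their serialized
// bytes hash equal.  std::hash<std::string> is an illustrative stand-in.
size_t ModelHash(std::string const& model_bytes) {
  return std::hash<std::string>{}(model_bytes);
}
```

Identical serialized bytes from the two commits must yield equal hashes; any format drift shows up as a mismatch.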
Author: Jiaming Yuan
Date: 2019-07-20 08:34:56 -04:00 (committed by GitHub)
Parent: ad1192e8a3
Commit: f0064c07ab
69 changed files with 669 additions and 761 deletions


@@ -402,7 +402,7 @@ struct GPUSketcher {
   void SketchBatch(const SparsePage &batch, const MetaInfo &info) {
     GPUDistribution dist =
-        GPUDistribution::Block(GPUSet::All(learner_param_.gpu_id, learner_param_.n_gpus,
+        GPUDistribution::Block(GPUSet::All(generic_param_.gpu_id, generic_param_.n_gpus,
                                            batch.Size()));
     // create device shards
@@ -429,8 +429,8 @@ struct GPUSketcher {
     }
   }
-  GPUSketcher(const tree::TrainParam &param, const LearnerTrainParam &learner_param, int gpu_nrows)
-      : param_(param), learner_param_(learner_param), gpu_batch_nrows_(gpu_nrows), row_stride_(0) {
+  GPUSketcher(const tree::TrainParam &param, const GenericParameter &generic_param, int gpu_nrows)
+      : param_(param), generic_param_(generic_param), gpu_batch_nrows_(gpu_nrows), row_stride_(0) {
   }

   /* Builds the sketches on the GPU for the dmatrix and returns the row stride
@@ -452,14 +452,14 @@ struct GPUSketcher {
  private:
   std::vector<std::unique_ptr<DeviceShard>> shards_;
   const tree::TrainParam &param_;
-  const LearnerTrainParam &learner_param_;
+  const GenericParameter &generic_param_;
   int gpu_batch_nrows_;
   size_t row_stride_;
   std::unique_ptr<SketchContainer> sketch_container_;
 };

 size_t DeviceSketch
-    (const tree::TrainParam &param, const LearnerTrainParam &learner_param, int gpu_batch_nrows,
+    (const tree::TrainParam &param, const GenericParameter &learner_param, int gpu_batch_nrows,
      DMatrix *dmat, HistogramCuts *hmat) {
   GPUSketcher sketcher(param, learner_param, gpu_batch_nrows);
   // We only need to return the result in HistogramCuts container, so it is safe to


@@ -291,7 +291,7 @@ class DenseCuts : public CutsBuilder {
  * \return The row stride across the entire dataset.
  */
 size_t DeviceSketch
-    (const tree::TrainParam& param, const LearnerTrainParam &learner_param, int gpu_batch_nrows,
+    (const tree::TrainParam& param, const GenericParameter &learner_param, int gpu_batch_nrows,
      DMatrix* dmat, HistogramCuts* hmat);