Add prediction of feature contributions (#2003)

* Add prediction of feature contributions

This implements the idea described at http://blog.datadive.net/interpreting-random-forests/
which tries to give insight into how a prediction is composed of its feature contributions
and a bias.

* Support multi-class models

* Calculate learning_rate per-tree instead of using the one from the first tree

* Do not rely on node.base_weight * learning_rate having the same value as the node mean value (aka leaf value, if it were a leaf); instead calculate the node mean values lazily, on the fly

* Add simple test for contributions feature

* Check against param.num_nodes instead of checking for non-zero length

* Loop over all roots instead of only the first
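
For illustration, the decomposition described above (a prediction as the sum of per-feature contributions plus a bias) can be sketched for a single tree as follows. This is a minimal sketch, not the code from this commit: the Node struct, NodeMean and Contributions are hypothetical names, and weighting child means by cover is an assumption of the sketch. It also recomputes node means recursively rather than caching them.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical node layout for illustration only; not XGBoost's RegTree.
struct Node {
  int left = -1;            // index of the left child, -1 for a leaf
  int right = -1;           // index of the right child, -1 for a leaf
  int feature = -1;         // split feature index (unused for leaves)
  float threshold = 0.0f;   // go left when x[feature] < threshold
  float leaf_value = 0.0f;  // prediction stored at a leaf
  float cover = 0.0f;       // weight of training instances reaching the node
};

// Mean value of a node: the leaf value at a leaf, otherwise the
// cover-weighted average of the children's mean values (an assumption).
float NodeMean(const std::vector<Node>& tree, int nid) {
  const Node& n = tree[nid];
  if (n.left < 0) return n.leaf_value;
  const float lc = tree[n.left].cover;
  const float rc = tree[n.right].cover;
  assert(lc + rc > 0.0f);
  return (lc * NodeMean(tree, n.left) + rc * NodeMean(tree, n.right)) / (lc + rc);
}

// Returns x.size() + 1 values: one contribution per feature, bias last.
std::vector<float> Contributions(const std::vector<Node>& tree,
                                 const std::vector<float>& x) {
  std::vector<float> out(x.size() + 1, 0.0f);
  int nid = 0;
  float mean = NodeMean(tree, nid);
  out.back() = mean;  // bias: mean value at the root
  while (tree[nid].left >= 0) {
    const Node& n = tree[nid];
    const int next = x[n.feature] < n.threshold ? n.left : n.right;
    const float next_mean = NodeMean(tree, next);
    // The change in mean value along the path is credited to the split feature.
    out[n.feature] += next_mean - mean;
    nid = next;
    mean = next_mean;
  }
  return out;
}
```

Because the per-step differences telescope, the contributions and the bias add up to the value of the leaf the instance lands in.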
Maurus Cuelenaere
2017-05-14 07:58:10 +02:00
committed by Vadim Khotilovich
parent e62be19c70
commit 6bd1869026
10 changed files with 205 additions and 5 deletions


@@ -400,8 +400,11 @@ class LearnerImpl : public Learner {
                bool output_margin,
                std::vector<bst_float> *out_preds,
                unsigned ntree_limit,
-               bool pred_leaf) const override {
-    if (pred_leaf) {
+               bool pred_leaf,
+               bool pred_contribs) const override {
+    if (pred_contribs) {
+      gbm_->PredictContribution(data, out_preds, ntree_limit);
+    } else if (pred_leaf) {
       gbm_->PredictLeaf(data, out_preds, ntree_limit);
     } else {
       this->PredictRaw(data, out_preds, ntree_limit);