Move prediction cache to Learner. (#5220)

* Move prediction cache into Learner.

* Clean-ups

- Remove duplicated cache in Learner and GBM.
- Remove ad-hoc fix of invalid cache.
- Remove `PredictFromCache` in predictors.
- Remove prediction cache for linear altogether, as it's only moving the
  prediction into training process but doesn't provide any actual overall speed
  gain.
- The cache is now unique to Learner, which means the ownership is no longer
  shared by any other components.

* Changes

- Add version to prediction cache.
- Use weak ptr to check expired DMatrix.
- Pass shared pointer instead of raw pointer.
This commit is contained in:
Jiaming Yuan
2020-02-14 13:04:23 +08:00
committed by GitHub
parent 24ad9dec0b
commit c35cdecddd
19 changed files with 457 additions and 372 deletions

View File

@@ -19,11 +19,12 @@ rng = np.random.RandomState(1994)
@contextmanager
def captured_output():
"""
Reassign stdout temporarily in order to test printed statements
Taken from: https://stackoverflow.com/questions/4219717/how-to-assert-output-with-nosetest-unittest-in-python
"""Reassign stdout temporarily in order to test printed statements
Taken from:
https://stackoverflow.com/questions/4219717/how-to-assert-output-with-nosetest-unittest-in-python
Also works for pytest.
"""
new_out, new_err = StringIO(), StringIO()
old_out, old_err = sys.stdout, sys.stderr
@@ -42,10 +43,17 @@ class TestBasic(unittest.TestCase):
param = {'max_depth': 2, 'eta': 1,
'objective': 'binary:logistic'}
# specify validations set to watch performance
watchlist = [(dtest, 'eval'), (dtrain, 'train')]
watchlist = [(dtrain, 'train')]
num_round = 2
bst = xgb.train(param, dtrain, num_round, watchlist)
# this is prediction
bst = xgb.train(param, dtrain, num_round, watchlist, verbose_eval=True)
preds = bst.predict(dtrain)
labels = dtrain.get_label()
err = sum(1 for i in range(len(preds))
if int(preds[i] > 0.5) != labels[i]) / float(len(preds))
# error must be smaller than 10%
assert err < 0.1
preds = bst.predict(dtest)
labels = dtest.get_label()
err = sum(1 for i in range(len(preds))