Commit Graph

182 Commits

Author SHA1 Message Date
Jiaming Yuan
3a4f51f39f Avoid calling CUDA code on CPU for linear model. (#7154) 2021-09-01 10:45:31 +08:00
Jiaming Yuan
d080b5a953 Fix model slicing. (#7149)
* Use correct pointer.
* Remove best_iteration/best_score.
2021-08-03 11:51:56 +08:00
Jiaming Yuan
1c8fdf2218 Remove use of device_idx in dh::LaunchN. (#7063)
It's an unused parameter, removing it can make the CI log more readable.
2021-06-29 11:37:26 +08:00
Jiaming Yuan
663136aa08 Implement feature score for linear model. (#7048)
* Add feature score support for linear model.
* Port R interface to the new implementation.
* Add linear model support in Python.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2021-06-25 14:34:02 +08:00
Jiaming Yuan
7dd29ffd47 Implement feature score in GBTree. (#7041)
* Categorical data support.
* Eliminate text parsing during feature score computation.
2021-06-18 11:53:16 +08:00
Jiaming Yuan
5c2d7a18c9 Parallel model dump for trees. (#7040) 2021-06-15 14:08:26 +08:00
Jiaming Yuan
72f9daf9b6 Fix gpu_id with custom objective. (#7015) 2021-06-09 14:51:17 +08:00
Jiaming Yuan
816b789bf0 Add predictor to skl constructor. (#7000) 2021-05-29 04:52:56 +08:00
Jiaming Yuan
86e60e3ba8 Guard against index error in prediction. (#6982)
* Remove `best_ntree_limit` from documents.
2021-05-25 23:24:59 +08:00
Livius
a4886c404a Fix compilation error on x86 (#6964)
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2021-05-14 13:31:49 +08:00
Andrew Ziem
3e7e426b36 Fix spelling in documents (#6948)
* Update roxygen2 doc.

Co-authored-by: fis <jm.yuan@outlook.com>
2021-05-11 20:44:36 +08:00
Jiaming Yuan
556a83022d Implement unified update prediction cache for (gpu_)hist. (#6860)
* Implement utilites for linalg.
* Unify the update prediction cache functions.
* Implement update prediction cache for multi-class gpu hist.
2021-04-17 00:29:34 +08:00
Jiaming Yuan
79b8b560d2 Optimize dart inplace predict perf. (#6804) 2021-03-31 15:20:54 +08:00
Jiaming Yuan
a7083d3c13 Fix dart inplace prediction with GPU input. (#6777)
* Fix dart inplace predict with data on GPU, which might trigger a fatal check
for device access right.
* Avoid copying data whenever possible.
2021-03-25 12:00:32 +08:00
Louis Desreumaux
9b530e5697 Improve OpenMP exception handling (#6680) 2021-02-25 13:56:16 +08:00
ShvetsKS
9a0399e898 Removed unnecessary PredictBatch calls (#6700)
Co-authored-by: Shvets Kirill <kirill.shvets@intel.com>
2021-02-10 20:15:14 +08:00
Jiaming Yuan
e8c5c53e2f Use Predictor for dart. (#6693)
* Use normal predictor for dart booster.
* Implement `inplace_predict` for dart.
* Enable `dart` for dask interface now that it's thread-safe.
* categorical data should be working out of box for dart now.

The implementation is not very efficient as it has to pull back the data and
apply weight for each tree, but still a significant improvement over previous
implementation as now we no longer binary search for each sample.

* Fix output prediction shape on dataframe.
2021-02-09 23:30:19 +08:00
Jiaming Yuan
4656b09d5d [breaking] Add prediction fucntion for DMatrix and use inplace predict for dask. (#6668)
* Add a new API function for predicting on `DMatrix`.  This function aligns
with rest of the `XGBoosterPredictFrom*` functions on semantic of function
arguments.
* Purge `ntree_limit` from libxgboost, use iteration instead.
* [dask] Use `inplace_predict` by default for dask sklearn models.
* [dask] Run prediction shape inference on worker instead of client.

The breaking change is in the Python sklearn `apply` function, I made it to be
consistent with other prediction functions where `best_iteration` is used by
default.
2021-02-08 18:26:32 +08:00
Jiaming Yuan
411592a347 Enhance inplace prediction. (#6653)
* Accept array interface for csr and array.
* Accept an optional proxy dmatrix for metainfo.

This constructs an explicit `_ProxyDMatrix` type in Python.

* Remove unused doc.
* Add strict output.
2021-02-02 11:41:46 +08:00
Jiaming Yuan
d132933550 Remove type check for solaris. (#6610) 2021-01-16 02:58:19 +08:00
ShvetsKS
7f4d3a91b9 Multiclass prediction caching for CPU Hist (#6550)
Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>
2021-01-13 04:42:07 +08:00
Jiaming Yuan
f2f7dd87b8 Use view for SparsePage exclusively. (#6590) 2021-01-11 18:04:55 +08:00
Jiaming Yuan
ca3da55de4 Support early stopping with training continuation, correct num boosted rounds. (#6506)
* Implement early stopping with training continuation.

* Add new C API for obtaining boosted rounds.

* Fix off by 1 in `save_best`.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-12-17 19:59:19 +08:00
Jiaming Yuan
8a17610666 Implement GPU predict leaf. (#6187) 2020-11-11 17:33:47 +08:00
Jack Dunn
51e6531315 Fix missing space in warning message (#6340) 2020-11-04 06:03:16 -05:00
Jiaming Yuan
2cc9662005 Support slicing tree model (#6302)
This PR is meant the end the confusion around best_ntree_limit and unify model slicing. We have multi-class and random forests, asking users to understand how to set ntree_limit is difficult and error prone.

* Implement the save_best option in early stopping.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-11-02 23:27:39 -08:00
Igor Moura
5e1e972aea Clean up warnings (#6325) 2020-10-30 23:50:29 +08:00
Jiaming Yuan
3310e208fd Fix inplace prediction interval. (#6259)
* Add back the interval in call.
* Make the interval non-optional.
2020-10-28 13:13:59 +08:00
Jiaming Yuan
cc76724762 Reduce warning. (#6273) 2020-10-27 12:24:19 -07:00
Igor Moura
d1254808d5 Clean up C++ warnings (#6213) 2020-10-19 23:02:33 +08:00
Jiaming Yuan
5037abeb86 Fix linear gpu input (#6255) 2020-10-19 12:02:36 +08:00
Rory Mitchell
dda9e1e487 Update GPUTreeshap (#6163)
* Reduce shap test duration

* Test interoperability with shap package

* Add feature interactions

* Update GPUTreeShap
2020-09-28 09:43:47 +13:00
Rory Mitchell
9a4e8b1d81 GPUTreeShap (#6038) 2020-08-25 12:47:41 +12:00
Qi Zhang
989ddd036f Swap byte-order in binary serializer to support big-endian arch (#5813)
* fixed some endian issues

* Use dmlc::ByteSwap() to simplify code

* Fix lint check

* [CI] Add test for s390x

* Download latest CMake on s390x

* Fix a bug in my code

* Save magic number in dmatrix with byteswap on big-endian machine

* Save version in binary with byteswap on big-endian machine

* Load scalar with byteswap in MetaInfo

* Add a debugging message

* Handle arrays correctly when byteswapping

* EOF can also be 255

* Handle magic number in MetaInfo carefully

* Skip Tree.Load test for big-endian, since the test manually builds little-endian binary model

* Handle missing packages in Python tests

* Don't use boto3 in model compatibility tests

* Add s390 Docker file for local testing

* Add model compatibility tests

* Add R compatibility test

* Revert "Add R compatibility test"

This reverts commit c2d2bdcb7dbae133cbb927fcd20f7e83ee2b18a8.

Co-authored-by: Qi Zhang <q.zhang@ibm.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-18 14:47:17 -07:00
Vladislav Epifanov
388f975cf5 Introducing DPC++-based plugin (predictor, objective function) supporting oneAPI programming model (#5825)
* Added plugin with DPC++-based predictor and objective function

* Update CMakeLists.txt

* Update regression_obj_oneapi.cc

* Added README.md for OneAPI plugin

* Added OneAPI predictor support to gbtree

* Update README.md

* Merged kernels in gradient computation. Enabled multiple loss functions with DPC++ backend

* Aligned plugin CMake files with latest master changes. Fixed whitespace typos

* Removed debug output

* [CI] Make oneapi_plugin a CMake target

* Added tests for OneAPI plugin for predictor and obj. functions

* Temporarily switched to default selector for device dispacthing in OneAPI plugin to enable execution in environments without gpus

* Updated readme file.

* Fixed USM usage in predictor

* Removed workaround with explicit templated names for DPC++ kernels

* Fixed warnings in plugin tests

* Fix CMake build of gtest

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-08 18:40:40 -07:00
Jiaming Yuan
9c6e791e64 Enforce tree order in JSON. (#5974)
* Make JSON model IO more future proof by using tree id in model loading.
2020-08-05 16:44:52 +08:00
Philip Hyunsu Cho
1d22a9be1c Revert "Reorder includes. (#5749)" (#5771)
This reverts commit d3a0efbf16.
2020-06-09 10:29:28 -07:00
Jiaming Yuan
d3a0efbf16 Reorder includes. (#5749)
* Reorder includes.

* R.
2020-06-03 17:30:47 +12:00
Jiaming Yuan
7d93932423 Better message when no GPU is found. (#5594) 2020-04-26 10:00:57 +08:00
Jiaming Yuan
9c1103e06c [Breaking] Set output margin to True for custom objective. (#5564)
* Set output margin to True for custom objective in Python and R.

* Add a demo for writing multi-class custom objective function.

* Run tests on selected demos.
2020-04-20 20:44:12 +08:00
Rory Mitchell
093e2227e3 Serialise booster after training to reset state (#5484)
* Serialise booster after training to reset state

* Prevent process_type being set on load

* Check for correct updater sequence
2020-04-11 16:27:12 +12:00
Jiaming Yuan
bd653fad4c Remove distcol updater. (#5507)
Closes #5498.
2020-04-10 12:52:56 +08:00
Jiaming Yuan
6671b42dd4 Use ellpack for prediction only when sparsepage doesn't exist. (#5504) 2020-04-10 12:15:46 +08:00
Jiaming Yuan
0012f2ef93 Upgrade clang-tidy on CI. (#5469)
* Correct all clang-tidy errors.
* Upgrade clang-tidy to 10 on CI.

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-04-05 04:42:29 +08:00
Jiaming Yuan
6601a641d7 Thread safe, inplace prediction. (#5389)
Normal prediction with DMatrix is now thread safe with locks.  Added inplace prediction is lock free thread safe.

When data is on device (cupy, cudf), the returned data is also on device.

* Implementation for numpy, csr, cudf and cupy.

* Implementation for dask.

* Remove sync in simple dmatrix.
2020-03-30 15:35:28 +08:00
Rory Mitchell
13b10a6370 Device dmatrix (#5420) 2020-03-28 14:42:21 +13:00
Jiaming Yuan
fc88105620 Better error message for updating. (#5418) 2020-03-15 16:46:21 +08:00
Jiaming Yuan
ab7a46a1a4 Check whether current updater can modify a tree. (#5406)
* Check whether current updater can modify a tree.

* Fix tree model JSON IO for pruned trees.
2020-03-14 09:24:08 +08:00
Jiaming Yuan
0110754a76 Remove update prediction cache from predictors. (#5312)
Move this function into gbtree, and uses only updater for doing so. As now the predictor knows exactly how many trees to predict, there's no need for it to update the prediction cache.
2020-02-17 11:35:47 +08:00
Jiaming Yuan
c35cdecddd Move prediction cache to Learner. (#5220)
* Move prediction cache into Learner.

* Clean-ups

- Remove duplicated cache in Learner and GBM.
- Remove ad-hoc fix of invalid cache.
- Remove `PredictFromCache` in predictors.
- Remove prediction cache for linear altogether, as it's only moving the
  prediction into training process but doesn't provide any actual overall speed
  gain.
- The cache is now unique to Learner, which means the ownership is no longer
  shared by any other components.

* Changes

- Add version to prediction cache.
- Use weak ptr to check expired DMatrix.
- Pass shared pointer instead of raw pointer.
2020-02-14 13:04:23 +08:00