245 Commits

Author SHA1 Message Date
Honza Sterba
028ec5f028 Optionally fail when gpu_id is set to invalid value (#6342) 2020-12-03 21:27:58 -08:00
Jiaming Yuan
8a17610666 Implement GPU predict leaf. (#6187) 2020-11-11 17:33:47 +08:00
Jiaming Yuan
43efadea2e Deterministic data partitioning for external memory (#6317)
* Make external memory data partitioning deterministic.

* Change the meaning of `page_size` from bytes to number of rows.

* Design a data pool.

* Note for external memory.

* Enable unity build on Windows CI.

* Force garbage collect on test.
2020-11-11 06:11:06 +08:00
Jiaming Yuan
519cee115a Avoid resetting seed for every configuration. (#6349) 2020-11-06 10:28:35 +08:00
Jiaming Yuan
2cc9662005 Support slicing tree model (#6302)
This PR is meant to end the confusion around best_ntree_limit and unify model slicing. We have multi-class and random forests; asking users to understand how to set ntree_limit is difficult and error-prone.

* Implement the save_best option in early stopping.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-11-02 23:27:39 -08:00
Igor Moura
5e1e972aea Clean up warnings (#6325) 2020-10-30 23:50:29 +08:00
Sergio Gavilán
b181a88f9f Reduced some C++ compiler warnings (#6197)
* Removed some warnings

* Rebase with master

* Solved C++ Google Tests errors made by refactoring in order to remove warnings

* Undo renaming path -> path_

* Fix style check

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-10-29 12:36:00 -07:00
vcarpani
671971e12e Compiler warnings (#6286)
* Fix warnings for json.h

* Fix warnings for metric.h

* Fix warnings for updater_quantile_hist.cc.

* Fix warnings for updater_histmaker.cc.

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-10-28 13:46:15 -07:00
Jiaming Yuan
3310e208fd Fix inplace prediction interval. (#6259)
* Add back the interval in call.
* Make the interval non-optional.
2020-10-28 13:13:59 +08:00
Jiaming Yuan
bcfab4d726 Revert "Disable JSON full serialization for now. (#6248)" (#6266)
This reverts commit 6d293020fb.
2020-10-27 03:30:47 +08:00
Igor Moura
d1254808d5 Clean up C++ warnings (#6213) 2020-10-19 23:02:33 +08:00
Jiaming Yuan
6d293020fb Disable JSON full serialization for now. (#6248)
* Disable JSON serialization for now.

* Multi-class classification is checkpointing for each iteration.
This brings significant overhead.

Revert: 90355b4f00

* Set R tests to use binary.
2020-10-16 17:59:54 +08:00
vcarpani
6bc9747df5 Reduce compile warnings (#6198)
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-10-08 23:14:59 +08:00
ShvetsKS
a4ce0eae43 CPU predict performance improvement (#6127)
Co-authored-by: ShvetsKS <kirill.shvets@intel.com>
2020-10-08 15:50:21 +03:00
Rory Mitchell
dda9e1e487 Update GPUTreeshap (#6163)
* Reduce shap test duration

* Test interoperability with shap package

* Add feature interactions

* Update GPUTreeShap
2020-09-28 09:43:47 +13:00
Jiaming Yuan
7065779afa Improve JSON format for categorical features. (#6128)
* Gather categories for all nodes.
2020-09-21 15:35:05 +08:00
Jiaming Yuan
a069a21e03 Implement intrusive ptr (#6129)
* Use intrusive ptr for JSON.
2020-09-20 20:07:16 +08:00
Rory Mitchell
2e907abdb8 Updates to GPUTreeShap (#6087)
* Extract paths on device

* Update GPUTreeShap
2020-09-06 13:39:08 +12:00
Jiaming Yuan
2fcc4f2886 Unify evaluation functions. (#6037) 2020-08-26 14:23:27 +08:00
Jiaming Yuan
80c8547147 Make binary bin search reusable. (#6058)
* Move binary search row to hist util.
* Remove dead code.
2020-08-26 05:05:11 +08:00
Jiaming Yuan
81d8dd79ca Bump header version. (#6056) 2020-08-26 00:29:00 +08:00
Jiaming Yuan
20c95be625 Expand categorical node. (#6028)
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-25 18:53:57 +08:00
Jiaming Yuan
90355b4f00 Make JSON the default full serialization format. (#6027) 2020-08-19 09:57:43 +08:00
Qi Zhang
989ddd036f Swap byte-order in binary serializer to support big-endian arch (#5813)
* fixed some endian issues

* Use dmlc::ByteSwap() to simplify code

* Fix lint check

* [CI] Add test for s390x

* Download latest CMake on s390x

* Fix a bug in my code

* Save magic number in dmatrix with byteswap on big-endian machine

* Save version in binary with byteswap on big-endian machine

* Load scalar with byteswap in MetaInfo

* Add a debugging message

* Handle arrays correctly when byteswapping

* EOF can also be 255

* Handle magic number in MetaInfo carefully

* Skip Tree.Load test for big-endian, since the test manually builds little-endian binary model

* Handle missing packages in Python tests

* Don't use boto3 in model compatibility tests

* Add s390 Docker file for local testing

* Add model compatibility tests

* Add R compatibility test

* Revert "Add R compatibility test"

This reverts commit c2d2bdcb7dbae133cbb927fcd20f7e83ee2b18a8.

Co-authored-by: Qi Zhang <q.zhang@ibm.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-18 14:47:17 -07:00
Jiaming Yuan
4d99c58a5f Feature weights (#5962) 2020-08-18 19:55:41 +08:00
Jiaming Yuan
674c409e9d Remove rabit dependency on public headers. (#6005) 2020-08-13 08:26:20 +08:00
Jiaming Yuan
ee70a2380b Unify CPU hist sketching (#5880) 2020-08-12 01:33:06 +08:00
boxdot
d268a2a463 Thread-safe prediction by making the prediction cache thread-local. (#5853)
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2020-07-30 12:33:50 +08:00
Alexander Gugel
970b4b3fa2 Add XGBoosterGetNumFeature (#5856)
- add GetNumFeature to Learner
- add XGBoosterGetNumFeature to C API
- update c-api-demo accordingly
2020-07-13 23:25:17 -07:00
Jiaming Yuan
048d969be4 Implement GK sketching on GPU. (#5846)
* Implement GK sketching on GPU.
* Strong tests on quantile building.
* Handle sparse dataset by binary searching the column index.
* Hypothesis test on dask.
2020-07-07 12:16:21 +08:00
Jiaming Yuan
93c44a9a64 Move feature names and types of DMatrix from Python to C++. (#5858)
* Add thread local return entry for DMatrix.
* Save feature name and feature type in binary file.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-07-07 09:40:13 +08:00
Philip Hyunsu Cho
efe3e48ae2 Ensure that LoadSequentialFile() actually read the whole file (#5831) 2020-07-04 16:17:11 +08:00
Jiaming Yuan
1a0801238e Implement iterative DMatrix. (#5837) 2020-07-03 11:44:52 +08:00
Jiaming Yuan
90a9c68874 Implement a DMatrix Proxy. (#5803) 2020-06-29 15:03:10 +08:00
Jiaming Yuan
c4d721200a Implement extend method for meta info. (#5800)
* Implement extend for host device vector.
2020-06-20 03:32:03 +08:00
Jiaming Yuan
38ee514787 Implement fast number serialization routines. (#5772)
* Implement ryu algorithm.
* Implement integer printing.
* Full coverage roundtrip test.
2020-06-17 12:39:23 +08:00
Jiaming Yuan
1fa84b61c1 Implement Empty method for host device vector. (#5781)
* Fix accessing nullptr.
2020-06-13 19:02:26 +08:00
Philip Hyunsu Cho
1d22a9be1c Revert "Reorder includes. (#5749)" (#5771)
This reverts commit d3a0efbf16.
2020-06-09 10:29:28 -07:00
Jiaming Yuan
d3a0efbf16 Reorder includes. (#5749)
* Reorder includes.

* R.
2020-06-03 17:30:47 +12:00
ShvetsKS
cd3d14ad0e Add float32 histogram (#5624)
* new single_precision_histogram param was added.

Co-authored-by: SHVETS, KIRILL <kirill.shvets@intel.com>
Co-authored-by: fis <jm.yuan@outlook.com>
2020-06-03 11:24:53 +08:00
Jiaming Yuan
325156c7a9 Bump version in header. (#5742) 2020-06-01 18:21:18 +08:00
Jiaming Yuan
21ed1f0c6d Support 64bit seed. (#5643) 2020-05-07 14:52:38 +08:00
Jiaming Yuan
67d267f9da Move device dmatrix construction code into ellpack. (#5623) 2020-05-06 19:43:59 +08:00
Jiaming Yuan
33e052b1e5 Remove dead code. (#5635) 2020-05-06 17:03:48 +08:00
Philip Hyunsu Cho
8de7f1928e Fix build on big endian CPUs (#5617)
* Fix build on big endian CPUs

* Clang-tidy
2020-04-29 21:56:34 -07:00
Jason E. Aten, Ph.D
8dfe7b3686 Clarify meaning of training parameter in XGBoosterPredict() (#5604)
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2020-04-25 16:48:42 -07:00
Philip Hyunsu Cho
474cfddf91 [R] Address warnings to comply with CRAN submission policy (#5600)
* [R] Address warnings to comply with CRAN submission policy

* Include <xgboost/logging.h>
2020-04-25 13:34:36 -07:00
Jiaming Yuan
e726dd9902 Set device in device dmatrix. (#5596) 2020-04-25 13:42:53 +08:00
Philip Hyunsu Cho
ef26bc45bf Hide C++ symbols in libxgboost.so when building Python wheel (#5590)
* Hide C++ symbols in libxgboost.so when building Python wheel

* Update Jenkinsfile

* Add test

* Upgrade rabit

* Add setup.py option.

Co-authored-by: fis <jm.yuan@outlook.com>
2020-04-24 13:32:05 -07:00
Jiaming Yuan
29a4cfe400 Group aware GPU sketching. (#5551)
* Group aware GPU weighted sketching.

* Distribute group weights to each data point.
* Relax the test.
* Validate input meta info.
* Fix metainfo copy ctor.
2020-04-20 17:18:52 +08:00