Qi Zhang
989ddd036f
Swap byte-order in binary serializer to support big-endian arch ( #5813 )
...
* fixed some endian issues
* Use dmlc::ByteSwap() to simplify code
* Fix lint check
* [CI] Add test for s390x
* Download latest CMake on s390x
* Fix a bug in my code
* Save magic number in dmatrix with byteswap on big-endian machine
* Save version in binary with byteswap on big-endian machine
* Load scalar with byteswap in MetaInfo
* Add a debugging message
* Handle arrays correctly when byteswapping
* EOF can also be 255
* Handle magic number in MetaInfo carefully
* Skip Tree.Load test for big-endian, since the test manually builds little-endian binary model
* Handle missing packages in Python tests
* Don't use boto3 in model compatibility tests
* Add s390 Docker file for local testing
* Add model compatibility tests
* Add R compatibility test
* Revert "Add R compatibility test"
This reverts commit c2d2bdcb7dbae133cbb927fcd20f7e83ee2b18a8.
Co-authored-by: Qi Zhang <q.zhang@ibm.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-18 14:47:17 -07:00
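The byte-order problem this commit addresses can be shown with a small sketch. It is not the XGBoost code, which is C++ and uses dmlc::ByteSwap() according to the message above. The function names here are hypothetical, and the assumption is that the on-disk format is little-endian, so a big-endian host has to swap bytes when saving and loading.

```python
import struct
import sys


def byteswap32(value: int) -> int:
    """Reverse the byte order of an unsigned 32-bit integer."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")


def save_u32_little_endian(value: int) -> bytes:
    """Serialize a 32-bit scalar so the bytes are little-endian on any host."""
    raw = struct.pack("=I", value)  # "=" means native byte order, standard size
    if sys.byteorder == "big":
        raw = raw[::-1]  # swap only on big-endian machines
    return raw


def load_u32_little_endian(buf: bytes) -> int:
    """Read a little-endian 32-bit scalar back into a native integer."""
    raw = buf[:4]
    if sys.byteorder == "big":
        raw = raw[::-1]  # undo the on-disk order before a native unpack
    return struct.unpack("=I", raw)[0]
```

With this layout a file written on s390x and a file written on x86-64 contain the same bytes, which is the property a cross-architecture model compatibility test would check.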
Jiaming Yuan
4d99c58a5f
Feature weights ( #5962 )
2020-08-18 19:55:41 +08:00
Jiaming Yuan
0b2a26fa74
Remove skmaker. ( #5971 )
2020-08-09 15:23:31 +08:00
boxdot
d268a2a463
Thread-safe prediction by making the prediction cache thread-local. ( #5853 )
...
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2020-07-30 12:33:50 +08:00
Jiaming Yuan
e4a273e1da
Fix evaluate root split. ( #5948 )
2020-07-29 19:33:29 +08:00
Philip Hyunsu Cho
ace7fd328b
[R] Add a compatibility layer to load Booster object from an old RDS file ( #5940 )
...
* [R] Add a compatibility layer to load Booster from an old RDS
* Modify QuantileHistMaker::LoadConfig() to be backward compatible with 1.1.x
* Add a big warning about compatibility in QuantileHistMaker::LoadConfig()
* Add testing suite
* Discourage use of saveRDS() in CRAN doc
2020-07-26 00:06:49 -07:00
Jiaming Yuan
a4de2f68e4
Use cudaOccupancyMaxPotentialBlockSize to calculate the block size. ( #5926 )
2020-07-23 14:24:42 +08:00
Philip Hyunsu Cho
4af857f95d
Add explicit template specialization for portability ( #5921 )
...
* Add explicit template specializations
* Adding Specialization for FileAdapterBatch
2020-07-22 12:31:17 -07:00
Andy Adinets
ac3f0e78dc
Split Features into Groups to Compute Histograms in Shared Memory ( #5795 )
2020-07-07 15:04:35 +12:00
Philip Hyunsu Cho
1d22a9be1c
Revert "Reorder includes. ( #5749 )" ( #5771 )
...
This reverts commit d3a0efbf162f3dceaaf684109e1178c150b32de3.
2020-06-09 10:29:28 -07:00
Jiaming Yuan
d3a0efbf16
Reorder includes. ( #5749 )
...
* Reorder includes.
* R.
2020-06-03 17:30:47 +12:00
ShvetsKS
cd3d14ad0e
Add float32 histogram ( #5624 )
...
* new single_precision_histogram param was added.
Co-authored-by: SHVETS, KIRILL <kirill.shvets@intel.com>
Co-authored-by: fis <jm.yuan@outlook.com>
2020-06-03 11:24:53 +08:00
Rory Mitchell
f779980f7e
gpu_hist performance tweaks ( #5707 )
...
* Remove device vectors
* Remove allreduce synchronize
* Remove double buffer
2020-05-29 16:48:53 +12:00
Andy Adinets
646def51e0
C++14 for xgboost ( #5664 )
2020-05-21 12:26:40 +12:00
ShvetsKS
dd01e4ba8d
Distributed optimizations for 'hist' method with CPUs ( #5557 )
...
Co-authored-by: SHVETS, KIRILL <kirill.shvets@intel.com>
2020-05-20 06:03:03 +03:00
Jiaming Yuan
535479e69f
Add JSON schema to model dump. ( #5660 )
2020-05-15 10:18:43 +08:00
Oleksandr Kuvshynov
4e64e2ef8e
skip missing lookup if nothing is missing in CPU hist partition kernel. ( #5644 )
...
* [xgboost] skip missing lookup if nothing is missing
2020-05-12 05:50:08 +03:00
Rory Mitchell
fcf57823b6
Reduce device synchronisation ( #5631 )
...
* Reduce device synchronisation
* Initialise pinned memory
2020-05-07 21:19:46 +12:00
Jiaming Yuan
eaf2a00b5c
Enhance nvtx support. ( #5636 )
2020-05-06 22:54:24 +08:00
Rory Mitchell
b9649e7b8e
Refactor gpu_hist split evaluation ( #5610 )
...
* Refactor
* Rewrite evaluate splits
* Add more tests
2020-04-30 08:58:12 +12:00
Jiaming Yuan
c90457f489
Refactor the CLI. ( #5574 )
...
* Enable parameter validation.
* Enable JSON.
* Catch `dmlc::Error`.
* Show help message.
2020-04-26 10:56:33 +08:00
Andy Adinets
73142041b9
For histograms, opting into maximum shared memory available per block. ( #5491 )
2020-04-21 14:56:42 +12:00
Rory Mitchell
b2827a80e1
Use non-synchronising scan ( #5560 )
2020-04-20 15:51:34 +12:00
Rory Mitchell
d6d1035950
gpu_hist performance fixes ( #5558 )
...
* Remove unnecessary cuda API calls
* Fix histogram memory growth
2020-04-19 12:21:13 +12:00
Jiaming Yuan
c245eb8755
Fix r interaction constraints ( #5543 )
...
* Unify the parsing code.
* Cleanup.
2020-04-18 06:53:51 +08:00
ShvetsKS
a2d86b8e4b
Optimizations for RNG in InitData kernel ( #5522 )
...
* optimizations for subsampling in InitData
Co-authored-by: SHVETS, KIRILL <kirill.shvets@intel.com>
2020-04-16 18:24:32 +03:00
Rory Mitchell
e268fb0093
Use thrust functions instead of custom functions ( #5544 )
2020-04-16 21:41:16 +12:00
Rory Mitchell
ca4e05660e
Purge device_helpers.cuh ( #5534 )
...
* Simplifications with caching_device_vector
* Purge device helpers
2020-04-15 21:51:56 +12:00
Jiaming Yuan
866a477319
Unify max nodes. ( #5497 )
2020-04-10 19:26:35 +08:00
Jiaming Yuan
bd653fad4c
Remove distcol updater. ( #5507 )
...
Closes #5498 .
2020-04-10 12:52:56 +08:00
Jiaming Yuan
7d52c0b8c2
Requires setting leaf stat when expanding tree. ( #5501 )
...
* Fix GPU Hist feature importance.
2020-04-10 12:27:03 +08:00
Jiaming Yuan
0012f2ef93
Upgrade clang-tidy on CI. ( #5469 )
...
* Correct all clang-tidy errors.
* Upgrade clang-tidy to 10 on CI.
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-04-05 04:42:29 +08:00
Jiaming Yuan
939973630d
Accept other gradient types for split entry. ( #5467 )
2020-04-03 10:38:44 +08:00
ShvetsKS
27a8e36fc3
Reducing memory consumption for 'hist' method on CPU ( #5334 )
2020-03-28 14:45:52 +13:00
Jiaming Yuan
ab7a46a1a4
Check whether current updater can modify a tree. ( #5406 )
...
* Check whether current updater can modify a tree.
* Fix tree model JSON IO for pruned trees.
2020-03-14 09:24:08 +08:00
Rory Mitchell
b745b7acce
Fix memory usage of device sketching ( #5407 )
2020-03-14 13:43:24 +13:00
Rory Mitchell
3ad4333b0e
Partial rewrite EllpackPage ( #5352 )
2020-03-11 10:15:53 +13:00
Jiaming Yuan
8d06878bf9
Deterministic GPU histogram. ( #5361 )
...
* Use pre-rounding based method to obtain reproducible floating point summation.
* GPU Hist for regression and classification are bit-by-bit reproducible.
* Add doc.
* Switch to thrust reduce for `node_sum_gradient`.
2020-03-04 15:13:28 +08:00
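The general idea behind pre-rounding summation can be sketched as follows. This is an illustration under stated assumptions, not the CUDA code from this commit, and the helper names are hypothetical. Each value is rounded onto a common grid derived from the largest magnitude and the element count, so every later addition is exact and the total no longer depends on summation order. The sketch assumes double precision and a modest element count (well below 2**50).

```python
import math


def rounding_factor(max_abs: float, n: int) -> float:
    """Return a power of two larger than 2 * n * max_abs (hypothetical helper)."""
    if max_abs == 0.0 or n == 0:
        return 1.0
    _, exp = math.frexp(n * max_abs)  # n * max_abs = m * 2**exp with 0.5 <= m < 1
    return math.ldexp(1.0, exp + 1)


def pre_round(x: float, factor: float) -> float:
    """Drop the low-order bits of x that would not survive in a sum of size `factor`."""
    return (factor + x) - factor


def reproducible_sum(values) -> float:
    """Sum floats so that the result is identical for any ordering of `values`."""
    values = list(values)
    max_abs = max((abs(v) for v in values), default=0.0)
    factor = rounding_factor(max_abs, len(values))
    total = 0.0
    for v in values:
        total += pre_round(v, factor)  # exact: all terms share one grid
    return total
```

The price is a small, bounded loss of precision in each term, traded for a result that does not change between runs or between summation orders.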
Egor Smirnov
1b97eaf7a7
Optimized ApplySplit, BuildHist and UpdatePredictCache functions on CPU ( #5244 )
...
* Split up sparse and dense build hist kernels.
* Add `PartitionBuilder`.
2020-02-29 16:11:42 +08:00
Jiaming Yuan
e0509b3307
Fix pruner. ( #5335 )
...
* Honor the tree depth.
* Prevent pruning pruned node.
2020-02-25 08:32:46 +08:00
Rory Mitchell
b0ed3f0a66
Remove unnecessary DMatrix methods ( #5324 )
2020-02-25 12:40:39 +13:00
Jiaming Yuan
655cf17b60
Predict on Ellpack. ( #5327 )
...
* Unify GPU prediction node.
* Add `PageExists`.
* Dispatch prediction on input data for GPU Predictor.
2020-02-23 06:27:03 +08:00
Rong Ou
e4b74c4d22
Gradient based sampling for GPU Hist ( #5093 )
...
* Implement gradient based sampling for GPU Hist tree method.
* Add samplers and handle compacted page in GPU Hist.
2020-02-04 10:31:27 +08:00
Egor Smirnov
c67163250e
Optimized BuildHist function ( #5156 )
2020-01-29 23:32:57 -08:00
Jiaming Yuan
7b65698187
Enforce correct data shape. ( #5191 )
...
* Fix syncing DMatrix columns.
* notes for tree method.
* Enable feature validation for all interfaces except for jvm.
* Better tests for boosting from predictions.
* Disable validation on JVM.
2020-01-13 15:48:17 +08:00
Egor Smirnov
7b17e76c5b
Optimized EvaluateSplit function ( #5138 )
...
* Add block based threading utilities.
2019-12-31 18:18:42 +08:00
Jiaming Yuan
04db125699
Quick fix for memory leak in CPU Hist. ( #5153 )
...
Closes https://github.com/dmlc/xgboost/issues/3579 .
* Don't use map.
2019-12-31 14:05:53 +08:00
Jiaming Yuan
139ccc9902
Fix num_roots to be 1. ( #5165 )
2019-12-30 02:18:45 +08:00
Jiaming Yuan
f3d7877802
Parameter validation ( #5157 )
...
* Unused code.
* Split up old colmaker parameters from train param.
* Fix dart.
* Better name.
2019-12-26 11:59:05 +08:00
Jiaming Yuan
3136185bc5
JSON configuration IO. ( #5111 )
...
* Add saving/loading JSON configuration.
* Implement Python pickle interface with new IO routines.
* Basic tests for training continuation.
2019-12-15 17:31:53 +08:00