The functions featureValueOfSparseVector and featureValueOfDenseVector could return Float.NaN if the input vector contained any missing values. This made the partition key computation fail, so most of the vectors ended up in the same partition. We fix this by avoiding the NaN and simply using the row's hashCode in that case (see the sketch below).
We added a test to ensure that the repartitioning is now uniform on an input dataset containing missing values, by checking that the variance of the partition sizes stays below a certain threshold.
Signed-off-by: Anthony D'Amato <anthony.damato@hotmail.fr>
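A minimal sketch of the fallback idea, written in Python purely for illustration (the real fix lives in the JVM package; the function name and arguments below are hypothetical):

```python
import math

def partition_key(feature_value, row_hash, num_partitions):
    # A NaN feature value would otherwise poison the key and funnel most
    # rows into a single partition, so fall back to the row's own hash.
    if feature_value is None or math.isnan(feature_value):
        key = row_hash
    else:
        key = hash(feature_value)
    return key % num_partitions

# Rows whose feature is missing now spread across partitions, which is what
# the added test checks by bounding the variance of the partition sizes.
keys = [partition_key(float("nan"), row_hash, 8) for row_hash in range(1000)]
assert len(set(keys)) == 8
```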
* add SHAP summary plot using ggplot2
* Update xgb.plot.shap
* Update example in xgb.plot.shap documentation
* update logic, add tests
* whitespace fixes
* whitespace fixes for test_helpers
* namespace for sd function
* explicitly declare variables that are automatically evaluated by data.table
* Fix R lint
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
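The summary plot itself is built in R with ggplot2; only to illustrate the data it aggregates, per-row SHAP contributions can be obtained from the Python package as below (the toy dataset and parameters are made up):

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"objective": "binary:logistic"}, dtrain, num_boost_round=20)

# pred_contribs=True yields one SHAP value per feature plus a bias column,
# i.e. shape (n_rows, n_features + 1); a summary plot aggregates these.
contribs = booster.predict(dtrain, pred_contribs=True)
print(contribs.shape)  # (200, 5)
```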
* fixed some endian issues
* Use dmlc::ByteSwap() to simplify code
* Fix lint check
* [CI] Add test for s390x
* Download latest CMake on s390x
* Fix a bug in my code
* Save magic number in dmatrix with byteswap on big-endian machine
* Save version in binary with byteswap on big-endian machine
* Load scalar with byteswap in MetaInfo
* Add a debugging message
* Handle arrays correctly when byteswapping
* EOF can also be 255
* Handle magic number in MetaInfo carefully
* Skip Tree.Load test for big-endian, since the test manually builds little-endian binary model
* Handle missing packages in Python tests
* Don't use boto3 in model compatibility tests
* Add s390 Docker file for local testing
* Add model compatibility tests
* Add R compatibility test
* Revert "Add R compatibility test"
This reverts commit c2d2bdcb7dbae133cbb927fcd20f7e83ee2b18a8.
Co-authored-by: Qi Zhang <q.zhang@ibm.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
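The real code relies on dmlc::ByteSwap() in C++; the Python sketch below only illustrates the underlying idea of pinning the on-disk byte order so that little-endian and big-endian (e.g. s390x) machines read the same bytes. The magic constant is a placeholder, not XGBoost's actual value.

```python
import struct

MAGIC = 0xABCD1234  # placeholder, not the library's real magic number

def save_magic(fileobj):
    # Always write little-endian ("<I"); on a big-endian host this is
    # exactly the byte swap the commits above perform when saving.
    fileobj.write(struct.pack("<I", MAGIC))

def load_magic(fileobj):
    # Symmetric read: interpret the four bytes as little-endian regardless
    # of the host's endianness.
    (value,) = struct.unpack("<I", fileobj.read(4))
    if value != MAGIC:
        raise IOError("bad magic number: file written with another layout?")
    return value
```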
* [CI] Add RMM as an optional dependency
* Replace caching allocator with pool allocator from RMM
* Revert "Replace caching allocator with pool allocator from RMM"
This reverts commit e15845d4e72e890c2babe31a988b26503a7d9038.
* Use rmm::mr::get_default_resource()
* Try setting default resource (doesn't work yet)
* Allocate pool_mr in the heap
* Prevent leaking pool_mr handle
* Move EXPECT_DEATH() tests into a separate test suite suffixed DeathTest
* Turn off death tests for RMM
* Address reviewer's feedback
* Prevent leaking of cuda_mr
* Fix Jenkinsfile syntax
* Remove unnecessary function in Jenkinsfile
* [CI] Install NCCL into RMM container
* Run Python tests
* Try building with RMM, CUDA 10.0
* Do not use RMM for CUDA 10.0 target
* Actually test for test_rmm flag
* Fix TestPythonGPU
* Use CNMeM allocator, since pool allocator doesn't yet support multi-GPU
* Use 10.0 container to build RMM-enabled XGBoost
* Revert "Use 10.0 container to build RMM-enabled XGBoost"
This reverts commit 789021fa31112e25b683aef39fff375403060141.
* Fix Jenkinsfile
* [CI] Assign larger /dev/shm to NCCL
* Use 10.2 artifact to run multi-GPU Python tests
* Add CUDA 10.0 -> 11.0 cross-version test; remove CUDA 10.0 target
* Rename Conda env rmm_test -> gpu_test
* Use env var to opt into CNMeM pool for C++ tests
* Use identical CUDA version for RMM builds and tests
* Use Pytest fixtures to enable RMM pool in Python tests
* Move RMM to plugin/CMakeLists.txt; use PLUGIN_RMM
* Use per-device MR; use command arg in gtest
* Set CMake prefix path to use Conda env
* Use 0.15 nightly version of RMM
* Remove unnecessary header
* Fix a unit test when cudf is missing
* Add RMM demos
* Remove print()
* Use HostDeviceVector in GPU predictor
* Simplify pytest setup; use LocalCUDACluster fixture
* Address reviewers' comments
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
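A minimal sketch of opting into an RMM pool from Python, assuming a PLUGIN_RMM-enabled build; rmm.reinitialize() is RMM's own API and the pool size here is arbitrary:

```python
import rmm

# Swap the default device memory resource for a pool allocator before any
# GPU allocations happen (arbitrary 1 GiB initial pool).
rmm.reinitialize(pool_allocator=True, initial_pool_size=1 << 30)

import xgboost as xgb
# ... build DMatrix objects and train with tree_method="gpu_hist" as usual;
# an RMM-enabled build takes its device memory from the pool set up above.
```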
* Added plugin with DPC++-based predictor and objective function
* Update CMakeLists.txt
* Update regression_obj_oneapi.cc
* Added README.md for OneAPI plugin
* Added OneAPI predictor support to gbtree
* Update README.md
* Merged kernels in gradient computation. Enabled multiple loss functions with DPC++ backend
* Aligned plugin CMake files with latest master changes. Fixed whitespace typos
* Removed debug output
* [CI] Make oneapi_plugin a CMake target
* Added tests for OneAPI plugin for predictor and obj. functions
* Temporarily switched to default selector for device dispatching in OneAPI plugin to enable execution in environments without GPUs
* Updated README file.
* Fixed USM usage in predictor
* Removed workaround with explicit templated names for DPC++ kernels
* Fixed warnings in plugin tests
* Fix CMake build of gtest
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
* Allow a non-zero missing value when training.
* Fix wrong method names.
* Add a unit test
* Move the getter/setter unit test to MissingValueHandlingSuite
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
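The commit adds the getter/setter on the JVM side; for illustration only, the analogous knob in the Python package, where DMatrix takes a custom missing sentinel:

```python
import numpy as np
import xgboost as xgb

# Treat -999.0 (rather than 0 or NaN) as the placeholder for missing values.
X = np.array([[1.0, -999.0],
              [2.0,    3.0]], dtype=np.float32)
y = np.array([0.0, 1.0], dtype=np.float32)

dtrain = xgb.DMatrix(X, label=y, missing=-999.0)
booster = xgb.train({"objective": "reg:squarederror"}, dtrain, num_boost_round=5)
```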
* [CI] Assign larger /dev/shm to NCCL
* Use 10.2 artifact to run multi-GPU Python tests
* Add CUDA 10.0 -> 11.0 cross-version test; remove CUDA 10.0 target
* [CI] Move lint to a separate script
* [CI] Improved lintr launcher
* Add lintr as a separate action
* Add custom parsing logic to print out logs
* Fix lintr issues in demos
* Run R demos
* Fix CRAN checks
* Install XGBoost into R env before running lintr
* Install devtools (needed to run demos)
* [jvm-packages] add gpu_hist tree method
* change updater hist to grow_quantile_histmaker
* add gpu scheduling
* pass correct parameters to xgboost library
* remove debug info
* add use.cuda for pom
* add CI for gpu_hist for jvm
* add gpu unit tests
* use gpu node to build jvm
* use nvidia-docker
* Add CLI interface to create_jni.py using argparse
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
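The commits above expose the gpu_hist tree method through xgboost4j-spark; for reference only, the equivalent parameters in the Python package (the parameter names are the library's, everything else is illustrative):

```python
import xgboost as xgb

params = {
    "tree_method": "gpu_hist",      # GPU histogram algorithm
    "objective": "binary:logistic",
    "max_depth": 6,
}
# dtrain = xgb.DMatrix(...)  # training data
# booster = xgb.train(params, dtrain, num_boost_round=100)
```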