xgboost

Author	SHA1	Message	Date
Philip Hyunsu Cho	c8ec62103a	Deprecate LabelEncoder in XGBClassifier; Enable cuDF/cuPy inputs in XGBClassifier (#6269 ) * Deprecate LabelEncoder in XGBClassifier; skip LabelEncoder for cuDF/cuPy inputs * Add unit tests for cuDF and cuPy inputs with XGBClassifier * Fix lint * Clarify warning * Move use_label_encoder option to XGBClassifier constructor * Add a test for cudf.Series * Add use_label_encoder to XGBRFClassifier doc * Address reviewer feedback	2020-10-26 13:20:51 -07:00
Jiaming Yuan	5037abeb86	Fix linear gpu input (#6255 )	2020-10-19 12:02:36 +08:00
Philip Hyunsu Cho	65ea42bd42	[CI] Reduce testing load with RMM (#6249 ) * [CI] Reduce testing load with RMM * Address reviewer's comment	2020-10-18 19:16:46 -07:00
Jiaming Yuan	ab5b35134f	Rework Python callback functions. (#6199 ) * Define a new callback interface for Python. * Deprecate the old callbacks. * Enable early stopping on dask.	2020-10-10 17:52:36 +08:00
Jiaming Yuan	b5b24354b8	More categorical tests and disable shap sparse test. (#6219 ) * Fix tree load with 32 category.	2020-10-10 16:12:37 +08:00
Jiaming Yuan	70ce5216b5	Add high level tests for categorical data. (#6179 ) * Fix unique.	2020-10-09 09:27:23 +08:00
Rory Mitchell	dda9e1e487	Update GPUTreeshap (#6163 ) * Reduce shap test duration * Test interoperability with shap package * Add feature interactions * Update GPUTreeShap	2020-09-28 09:43:47 +13:00
Jiaming Yuan	78d72ef936	Add DaskDeviceQuantileDMatrix demo. (#6156 )	2020-09-24 14:08:28 +08:00
Jiaming Yuan	2fcc4f2886	Unify evaluation functions. (#6037 )	2020-08-26 14:23:27 +08:00
Rory Mitchell	9a4e8b1d81	GPUTreeShap (#6038 )	2020-08-25 12:47:41 +12:00
Jiaming Yuan	4d99c58a5f	Feature weights (#5962 )	2020-08-18 19:55:41 +08:00
Philip Hyunsu Cho	14d5ce712c	[CI] Fix Dask Pytest fixture (#6024 )	2020-08-17 16:45:22 -07:00
Philip Hyunsu Cho	9adb812a0a	RMM integration plugin (#5873 ) * [CI] Add RMM as an optional dependency * Replace caching allocator with pool allocator from RMM * Revert "Replace caching allocator with pool allocator from RMM" This reverts commit e15845d4e72e890c2babe31a988b26503a7d9038. * Use rmm::mr::get_default_resource() * Try setting default resource (doesn't work yet) * Allocate pool_mr in the heap * Prevent leaking pool_mr handle * Separate EXPECT_DEATH() in separate test suite suffixed DeathTest * Turn off death tests for RMM * Address reviewer's feedback * Prevent leaking of cuda_mr * Fix Jenkinsfile syntax * Remove unnecessary function in Jenkinsfile * [CI] Install NCCL into RMM container * Run Python tests * Try building with RMM, CUDA 10.0 * Do not use RMM for CUDA 10.0 target * Actually test for test_rmm flag * Fix TestPythonGPU * Use CNMeM allocator, since pool allocator doesn't yet support multiGPU * Use 10.0 container to build RMM-enabled XGBoost * Revert "Use 10.0 container to build RMM-enabled XGBoost" This reverts commit 789021fa31112e25b683aef39fff375403060141. * Fix Jenkinsfile * [CI] Assign larger /dev/shm to NCCL * Use 10.2 artifact to run multi-GPU Python tests * Add CUDA 10.0 -> 11.0 cross-version test; remove CUDA 10.0 target * Rename Conda env rmm_test -> gpu_test * Use env var to opt into CNMeM pool for C++ tests * Use identical CUDA version for RMM builds and tests * Use Pytest fixtures to enable RMM pool in Python tests * Move RMM to plugin/CMakeLists.txt; use PLUGIN_RMM * Use per-device MR; use command arg in gtest * Set CMake prefix path to use Conda env * Use 0.15 nightly version of RMM * Remove unnecessary header * Fix a unit test when cudf is missing * Add RMM demos * Remove print() * Use HostDeviceVector in GPU predictor * Simplify pytest setup; use LocalCUDACluster fixture * Address reviewers' comments Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-08-12 01:26:02 -07:00
Jiaming Yuan	ee70a2380b	Unify CPU hist sketching (#5880 )	2020-08-12 01:33:06 +08:00
Philip Hyunsu Cho	bf2990e773	Add missing Pytest marks to AsyncIO unit test (#5968 )	2020-08-01 10:56:24 +08:00
Jiaming Yuan	fa3715f584	[Dask] Asyncio support. (#5862 )	2020-07-30 06:23:58 +08:00
Philip Hyunsu Cho	ac9136ee49	Further improvements and savings in Jenkins pipeline (#5904 ) * Publish artifacts only on the master and release branches * Build CUDA only for Compute Capability 7.5 when building PRs * Run all Windows jobs in a single worker image * Build nightly XGBoost4J SNAPSHOT JARs with Scala 2.12 only * Show skipped Python tests on Windows * Make Graphviz optional for Python tests * Add back C++ tests * Unstash xgboost_cpp_tests * Fix label to CUDA 10.1 * Install cuPy for CUDA 10.1 * Install jsonschema * Address reviewer's feedback	2020-07-18 03:30:40 -07:00
Jiaming Yuan	7c2686146e	Dask device dmatrix (#5901 ) * Fix softprob with empty dmatrix.	2020-07-17 13:17:43 +08:00
Jiaming Yuan	029a8b533f	Simplify the data backends. (#5893 )	2020-07-16 15:17:31 +08:00
Jiaming Yuan	a3ec964346	Accept iterator in device dmatrix. (#5783 ) * Remove Device DMatrix.	2020-07-07 21:44:48 +08:00
Jiaming Yuan	048d969be4	Implement GK sketching on GPU. (#5846 ) * Implement GK sketching on GPU. * Strong tests on quantile building. * Handle sparse dataset by binary searching the column index. * Hypothesis test on dask.	2020-07-07 12:16:21 +08:00
Rory Mitchell	abdf894fcf	Add cupy to Windows CI (#5797 ) * Add cupy to Windows CI * Update Jenkinsfile-win64 Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu> * Update Jenkinsfile-win64 Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu> * Update tests/python-gpu/test_gpu_prediction.py Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu> Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-06-17 21:55:09 -07:00
Rory Mitchell	b47b5ac771	Use hypothesis (#5759 ) * Use hypothesis * Allow int64 array interface for groups * Add packages to Windows CI * Add to travis * Make sure device index is set correctly * Fix dask-cudf test * appveyor	2020-06-16 12:45:59 +12:00
Rory Mitchell	359023c0fa	Speed up python test (#5752 ) * Speed up tests * Prevent DeviceQuantileDMatrix initialisation with numpy * Use joblib.memory * Use RandomState	2020-06-05 11:39:24 +12:00
Jiaming Yuan	35e2205256	[dask] Return GPU Series when input is from cuDF. (#5710 ) * Refactor predict function.	2020-05-28 17:51:20 +08:00
Jiaming Yuan	5af8161a1a	Implement Python data handler. (#5689 ) * Define data handlers for DMatrix. * Throw ValueError in scikit learn interface.	2020-05-22 11:53:55 +08:00
Jiaming Yuan	9ad40901a8	Upgrade to CUDA 10.0 (#5649 ) (#5652 ) Co-authored-by: fis <jm.yuan@outlook.com> Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-05-11 22:27:36 +08:00
Rory Mitchell	b9649e7b8e	Refactor gpu_hist split evaluation (#5610 ) * Refactor * Rewrite evaluate splits * Add more tests	2020-04-30 08:58:12 +12:00
Jiaming Yuan	7d93932423	Better message when no GPU is found. (#5594 )	2020-04-26 10:00:57 +08:00
Jiaming Yuan	e726dd9902	Set device in device dmatrix. (#5596 )	2020-04-25 13:42:53 +08:00
Jiaming Yuan	29a4cfe400	Group aware GPU sketching. (#5551 ) * Group aware GPU weighted sketching. * Distribute group weights to each data point. * Relax the test. * Validate input meta info. * Fix metainfo copy ctor.	2020-04-20 17:18:52 +08:00
Jiaming Yuan	8b04736b81	[dask] dask cudf inplace prediction. (#5512 ) * Add inplace prediction for dask-cudf. * Remove Dockerfile.release, since it's not used anywhere * Use Conda exclusively in CUDF and GPU containers * Improve cupy memory copying. * Add skip marks to tests. * Add mgpu-cudf category on the CI to run all distributed tests. Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-04-15 18:15:51 +08:00
Rory Mitchell	15f40e51e9	Add support for dlpack, expose python docs for DeviceQuantileDMatrix (#5465 )	2020-04-01 23:34:32 +13:00
Jiaming Yuan	6601a641d7	Thread safe, inplace prediction. (#5389 ) Normal prediction with DMatrix is now thread safe with locks. Added inplace prediction is lock free thread safe. When data is on device (cupy, cudf), the returned data is also on device. * Implementation for numpy, csr, cudf and cupy. * Implementation for dask. * Remove sync in simple dmatrix.	2020-03-30 15:35:28 +08:00
Rory Mitchell	13b10a6370	Device dmatrix (#5420 )	2020-03-28 14:42:21 +13:00
Jiaming Yuan	0dd97c206b	Move thread local entry into Learner. (#5396 ) * Move thread local entry into Learner. This is an attempt to workaround CUDA context issue in static variable, where the CUDA context can be released before device vector. * Add PredictionEntry to thread local entry. This eliminates one copy of prediction vector. * Don't define CUDA C API in a namespace.	2020-03-07 15:37:39 +08:00
Jiaming Yuan	8d06878bf9	Deterministic GPU histogram. (#5361 ) * Use pre-rounding based method to obtain reproducible floating point summation. * GPU Hist for regression and classification are bit-by-bit reproducible. * Add doc. * Switch to thrust reduce for `node_sum_gradient`.	2020-03-04 15:13:28 +08:00
Rory Mitchell	24ad9dec0b	Testing hist_util (#5251 ) * Rank tests * Remove categorical split specialisation * Extend tests to multiple features, switch to WQSketch * Add tests for SparseCuts * Add external memory quantile tests, fix some existing tests	2020-02-14 14:36:43 +13:00
Rory Mitchell	1b3947d929	Make some GPU tests deterministic (#5229 )	2020-01-26 11:53:07 +13:00
Rory Mitchell	aa9a68010b	uint not supported in cudf (#5225 )	2020-01-23 16:59:18 +13:00
Rory Mitchell	9c56480c61	Support dmatrix construction from cupy array (#5206 )	2020-01-22 13:15:27 +13:00
Jiaming Yuan	7b65698187	Enforce correct data shape. (#5191 ) * Fix syncing DMatrix columns. * notes for tree method. * Enable feature validation for all interfaces except for jvm. * Better tests for boosting from predictions. * Disable validation on JVM.	2020-01-13 15:48:17 +08:00
Jiaming Yuan	ebc86a3afa	Disable parameter validation for Scikit-Learn interface. (#5167 ) * Disable parameter validation for now. Scikit-Learn passes all parameters down to XGBoost, whether they are used or not. * Add option `validate_parameters`.	2020-01-07 11:17:31 +08:00
Jiaming Yuan	61286c6e8f	Fix wrapping GPU ID and prevent data copying. (#5160 ) * Removed some data copying. * Make sure gpu_id is valid before any configuration is carried out.	2019-12-27 16:51:08 +08:00
sriramch	ee81ba8e1f	implementation of map ranking algorithm on gpu (#5129 ) * - implementation of map ranking algorithm - also effected necessary suggestions mentioned in the earlier ranking pr's - made some performance improvements to the ndcg algo as well	2019-12-27 12:05:37 +13:00
Jiaming Yuan	ced3660f60	Tests for empty dmatrix. (#5159 )	2019-12-26 11:51:54 +08:00
Jiaming Yuan	298ebe68ac	[Breaking] Remove `learning_rates` in Python. (#5155 ) * Remove `learning_rates`. It's been deprecated since we have callback. * Set `before_iteration` of `reset_learning_rate` to False to preserve the initial learning rate, and comply to the term "reset". Closes #4709. * Tests for various `tree_method`.	2019-12-24 14:25:48 +08:00
Jiaming Yuan	b915788708	Remove benchmark code in GPU test. (#5141 ) * Update Jenkins script.	2019-12-21 11:00:21 +08:00
Jiaming Yuan	3136185bc5	JSON configuration IO. (#5111 ) * Add saving/loading JSON configuration. * Implement Python pickle interface with new IO routines. * Basic tests for training continuation.	2019-12-15 17:31:53 +08:00
Jiaming Yuan	608ebbe444	Fix GPU ID and prediction cache from pickle (#5086 ) * Hack for saving GPU ID. * Declare prediction cache on GBTree. * Add a simple test. * Add `auto` option for GPU Predictor.	2019-12-07 16:02:06 +08:00

242 Commits