xgboost

Author	SHA1	Message	Date
Jiaming Yuan	f6fe15d11f	Improve parameter validation (#6769 ) * Add quotes to unused parameters. * Check for whitespace.	2021-03-20 01:56:55 +08:00
Jiaming Yuan	23b4165a6b	Fix gamma deviance (#6761 )	2021-03-20 01:56:17 +08:00
Philip Hyunsu Cho	4230dcb614	Re-introduce double buffer in UpdatePosition, to fix perf regression in gpu_hist (#6757 ) * Revert "gpu_hist performance tweaks (#5707)" This reverts commit `f779980f7e`. * Address reviewer's comment * Fix build error	2021-03-18 13:56:10 -07:00
Jiaming Yuan	4f75f514ce	Fix GPU RF (#6755 ) * Fix sampling.	2021-03-17 06:23:35 +08:00
Jiaming Yuan	1a73a28511	Add device argsort. (#6749 ) This is part of https://github.com/dmlc/xgboost/pull/6747 .	2021-03-16 16:05:22 +08:00
Igor Rukhovich	19a2c54265	Prediction by indices (subsample < 1) (#6683 ) * Another implementation of predicting by indices * Fixed omp parallel_for variable type * Removed SparsePageView from Updater	2021-03-16 15:08:20 +13:00
Philip Hyunsu Cho	366f3cb9d8	Add use_rmm flag to global configuration (#6656 ) * Ensure RMM is 0.18 or later * Add use_rmm flag to global configuration * Modify XGBCachingDeviceAllocatorImpl to skip CUB when use_rmm=True * Update the demo * [CI] Pin NumPy to 1.19.4, since NumPy 1.19.5 doesn't work with latest Shap	2021-03-09 14:53:05 -08:00
Jiaming Yuan	f20074e826	Check for invalid data. (#6742 )	2021-03-04 14:37:20 +08:00
Jiaming Yuan	9da2287ab8	[breaking] Save booster feature info in JSON, remove feature name generation. (#6605 ) * Save feature info in booster in JSON model. * [breaking] Remove automatic feature name generation in `DMatrix`. This PR is to enable reliable feature validation in Python package.	2021-02-25 18:54:16 +08:00
Louis Desreumaux	9b530e5697	Improve OpenMP exception handling (#6680 )	2021-02-25 13:56:16 +08:00
ShvetsKS	9f15b9e322	Optimize CPU prediction (#6696 ) Co-authored-by: Shvets Kirill <kirill.shvets@intel.com>	2021-02-16 14:41:22 +08:00
ShvetsKS	9a0399e898	Removed unnecessary PredictBatch calls (#6700 ) Co-authored-by: Shvets Kirill <kirill.shvets@intel.com>	2021-02-10 20:15:14 +08:00
Jiaming Yuan	e8c5c53e2f	Use `Predictor` for `dart`. (#6693 ) * Use normal predictor for dart booster. * Implement `inplace_predict` for dart. * Enable `dart` for dask interface now that it's thread-safe. * categorical data should be working out of box for dart now. The implementation is not very efficient as it has to pull back the data and apply weight for each tree, but still a significant improvement over previous implementation as now we no longer binary search for each sample. * Fix output prediction shape on dataframe.	2021-02-09 23:30:19 +08:00
Jiaming Yuan	5d48d40d9a	Fix DMatrix slice with feature types. (#6689 )	2021-02-09 08:13:51 +08:00
Jiaming Yuan	4656b09d5d	[breaking] Add prediction function for DMatrix and use inplace predict for dask. (#6668 ) * Add a new API function for predicting on `DMatrix`. This function aligns with rest of the `XGBoosterPredictFrom` functions on semantic of function arguments. Purge `ntree_limit` from libxgboost, use iteration instead. * [dask] Use `inplace_predict` by default for dask sklearn models. * [dask] Run prediction shape inference on worker instead of client. The breaking change is in the Python sklearn `apply` function, I made it to be consistent with other prediction functions where `best_iteration` is used by default.	2021-02-08 18:26:32 +08:00
Jiaming Yuan	dbb5208a0a	Use __array_interface__ for creating DMatrix from CSR. (#6675 ) * Use __array_interface__ for creating DMatrix from CSR. * Add configuration.	2021-02-05 21:09:47 +08:00
Jiaming Yuan	1e949110da	Use generic dispatching routine for array interface. (#6672 )	2021-02-05 09:23:38 +08:00
Jiaming Yuan	411592a347	Enhance inplace prediction. (#6653 ) * Accept array interface for csr and array. * Accept an optional proxy dmatrix for metainfo. This constructs an explicit `_ProxyDMatrix` type in Python. * Remove unused doc. * Add strict output.	2021-02-02 11:41:46 +08:00
Jiaming Yuan	a9ec0ea6da	Align device id in predict transform with predictor. (#6662 )	2021-02-02 08:33:29 +08:00
Jiaming Yuan	c3c8e66fc9	Make prediction functions thread safe. (#6648 )	2021-01-28 23:29:43 +08:00
Philip Hyunsu Cho	0f2ed21a9d	[Breaking] Change default evaluation metric for binary:logitraw objective to logloss (#6647 )	2021-01-29 00:12:12 +09:00
Jiaming Yuan	1b70a323a7	Improve string view to reduce string allocation. (#6644 )	2021-01-27 19:08:52 +08:00
Jiaming Yuan	bc08e0c9d1	Remove `experimental_json_serialization` from tests. (#6640 )	2021-01-27 17:44:49 +08:00
Jiaming Yuan	d132933550	Remove type check for solaris. (#6610 )	2021-01-16 02:58:19 +08:00
ShvetsKS	7f4d3a91b9	Multiclass prediction caching for CPU Hist (#6550 ) Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>	2021-01-13 04:42:07 +08:00
Jiaming Yuan	f2f7dd87b8	Use view for `SparsePage` exclusively. (#6590 )	2021-01-11 18:04:55 +08:00
Jiaming Yuan	80065d571e	[dask] Add DaskXGBRanker (#6576 ) * Initial support for distributed LTR using dask. * Support `qid` in libxgboost. * Refactor `predict` and `n_features_in_`, `best_[score/iteration/ntree_limit]` to avoid duplicated code. * Define `DaskXGBRanker`. The dask ranker doesn't support group structure, instead it uses query id and convert to group ptr internally.	2021-01-08 18:35:09 +08:00
Gorkem Ozkaya	2231940d1d	Clip small positive values in gamma-nloglik (#6537 ) For the `gamma-nloglik` eval metric, small positive values in the labels are causing `NaN`'s in the outputs, as reported here: https://github.com/dmlc/xgboost/issues/5349. This will add clipping on them, similar to what is done in other metrics like `poisson-nloglik` and `logloss`.	2020-12-22 03:11:40 +08:00
Jiaming Yuan	ca3da55de4	Support early stopping with training continuation, correct num boosted rounds. (#6506 ) * Implement early stopping with training continuation. * Add new C API for obtaining boosted rounds. * Fix off by 1 in `save_best`. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-12-17 19:59:19 +08:00
Philip Hyunsu Cho	ad1a527709	Enable loading model from <1.0.0 trained with objective='binary:logitraw' (#6517 ) * Enable loading model from <1.0.0 trained with objective='binary:logitraw' * Add binary:logitraw in model compatibility testing suite * Feedback from @trivialfis: Override ProbToMargin() for LogisticRaw Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-12-16 16:53:46 -08:00
Philip Hyunsu Cho	bf6cfe3b99	[Breaking] Upgrade cuDF and RMM to 0.18 nightlies; require RMM 0.18+ for RMM plugin (#6510 ) * [CI] Upgrade cuDF and RMM to 0.18 nightlies * Modify RMM plugin to be compatible with RMM 0.18 * Update src/common/device_helpers.cuh Co-authored-by: Mark Harris <mharris@nvidia.com>	2020-12-16 10:07:52 -08:00
Jiaming Yuan	c5876277a8	Drop saving binary format for memory snapshot. (#6513 )	2020-12-17 00:14:57 +08:00
Jiaming Yuan	347f593169	Accept numpy array for DMatrix slice index. (#6368 )	2020-12-16 14:42:52 +08:00
Jiaming Yuan	886486a519	Support categorical data in GPU weighted sketching. (#6508 )	2020-12-16 14:23:28 +08:00
Igor Rukhovich	5c8ccf4455	Improved InitSampling function speed by 2.12 times (#6410 ) * Improved InitSampling function speed by 2.12 times * Added explicit conversion	2020-12-15 20:59:24 -08:00
Philip Hyunsu Cho	c31e3efa7c	Pass correct split_type to GPU predictor (#6491 ) * Pass correct split_type to GPU predictor * Add a test	2020-12-11 19:30:00 -08:00
Philip Hyunsu Cho	fb56da5e8b	Add global configuration (#6414 ) * Add management functions for global configuration: XGBSetGlobalConfig(), XGBGetGlobalConfig(). * Add Python interface: set_config(), get_config(), and config_context(). * Add unit tests for Python * Add R interface: xgb.set.config(), xgb.get.config() * Add unit tests for R Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-12-03 00:05:18 -08:00
Jiaming Yuan	f4ff1c53fd	Fix CLI ranking demo. (#6439 ) Save model at final round.	2020-11-29 03:12:06 +08:00
Honza Sterba	b0036b339b	Optionally fail when gpu_id is set to invalid value (#6342 )	2020-11-28 15:14:12 +08:00
ShvetsKS	956beead70	Thread local memory allocation for BuildHist (#6358 ) * thread mem locality * fix apply * cleanup * fix lint * fix tests * simple try * fix * fix * apply comments * fix comments * fix * apply simple comment Co-authored-by: ShvetsKS <kirill.shvets@intel.com>	2020-11-25 17:50:12 +03:00
Jiaming Yuan	42d31d9dcb	Fix MPI build. (#6403 )	2020-11-21 13:38:21 +08:00
Jiaming Yuan	44a9d69efb	Small cleanup to evaluator. (#6400 )	2020-11-20 09:33:51 +08:00
ShvetsKS	512b464cfa	Disable HT for DMatrix creation (#6386 ) Co-authored-by: SHVETS, KIRILL <kirill.shvets@intel.com>	2020-11-14 22:18:33 +08:00
Philip Hyunsu Cho	e5193c21a1	[dask] Allow empty data matrix in AFT survival (#6379 ) * [dask] Allow empty data matrix in AFT survival * Add unit test	2020-11-12 17:49:58 -08:00
Jiaming Yuan	d711d648cb	Fix label errors in graph visualization (#6369 )	2020-11-11 17:44:59 -08:00
Jiaming Yuan	8a17610666	Implement GPU predict leaf. (#6187 )	2020-11-11 17:33:47 +08:00
Jiaming Yuan	43efadea2e	Deterministic data partitioning for external memory (#6317 ) * Make external memory data partitioning deterministic. * Change the meaning of `page_size` from bytes to number of rows. * Design a data pool. * Note for external memory. * Enable unity build on Windows CI. * Force garbage collect on test.	2020-11-11 06:11:06 +08:00
ShvetsKS	d411f98d26	simple fix for static schedule in predict (#6357 ) Co-authored-by: ShvetsKS <kirill.shvets@intel.com>	2020-11-09 17:01:30 +08:00
Jiaming Yuan	519cee115a	Avoid resetting seed for every configuration. (#6349 )	2020-11-06 10:28:35 +08:00
Jack Dunn	51e6531315	Fix missing space in warning message (#6340 )	2020-11-04 06:03:16 -05:00

1310 Commits