xgboost

Author	SHA1	Message	Date
Jiaming Yuan	689eb8f620	Check external memory support for exact tree method. (#7088 )	2021-07-08 02:12:57 +08:00
Jiaming Yuan	615ab2b03e	Extract evaluate splits from CPU hist. (#7079 ) Other than modularizing the split evaluation function, this PR also removes some more functions including `InitNewNodes` and `BuildNodeStats` among some other unused variables. Also, scattered code like setting leaf weights is grouped into the split evaluator and `NodeEntry` is simplified and made private. Another subtle difference with the original implementation is that the modified code doesn't call `tree[nidx].Parent()` to traverse upward.	2021-07-07 15:16:25 +08:00
Jiaming Yuan	116d711815	Make `SimpleDMatrix` ctor reusable. (#7075 )	2021-07-06 13:38:24 +08:00
Jiaming Yuan	d7e1fa7664	Fix feature names and types in output model slice. (#7078 )	2021-07-06 11:47:49 +08:00
Jiaming Yuan	1cd20efe68	Move `GHistIndex` into `DMatrix`. (#7064 )	2021-07-01 00:44:49 +08:00
Jiaming Yuan	1c8fdf2218	Remove use of `device_idx` in `dh::LaunchN`. (#7063 ) It's an unused parameter, removing it can make the CI log more readable.	2021-06-29 11:37:26 +08:00
Jiaming Yuan	8fa32fdda2	Implement categorical data support for SHAP. (#7053 ) * Add CPU implementation. * Update GPUTreeSHAP. * Add GPU implementation by defining custom split condition.	2021-06-25 19:02:46 +08:00
Jiaming Yuan	663136aa08	Implement feature score for linear model. (#7048 ) * Add feature score support for linear model. * Port R interface to the new implementation. * Add linear model support in Python. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2021-06-25 14:34:02 +08:00
Jiaming Yuan	bbfffb444d	Fix race condition in CPU shap. (#7050 )	2021-06-21 10:03:15 +08:00
Jiaming Yuan	29f8fd6fee	Support categorical split in tree model dump. (#7036 )	2021-06-18 16:46:20 +08:00
Jiaming Yuan	7968c0d051	Test on s390x. (#7038 ) * Fix && remove unused parameter.	2021-06-18 14:55:08 +08:00
Jiaming Yuan	86715e4cd4	Support categorical data for dask functional interface and DQM. (#7043 ) * Support categorical data for dask functional interface and DQM. * Implement categorical data support for GPU GK-merge. * Add support for dask functional interface. * Add support for DQM. * Get newer cupy.	2021-06-18 13:06:52 +08:00
Jiaming Yuan	7dd29ffd47	Implement feature score in GBTree. (#7041 ) * Categorical data support. * Eliminate text parsing during feature score computation.	2021-06-18 11:53:16 +08:00
Jiaming Yuan	5c2d7a18c9	Parallel model dump for trees. (#7040 )	2021-06-15 14:08:26 +08:00
ShvetsKS	2567404ab6	Simplify sparse and dense CPU hist kernels (#7029 ) * Simplify sparse and dense kernels * Extract row partitioner. Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>	2021-06-11 18:26:30 +08:00
Jiaming Yuan	b56614e9b8	[R] Use new predict function. (#6819 ) * Call new C prediction API. * Add `strict_shape`. * Add `iterationrange`. * Update document.	2021-06-11 13:03:29 +08:00
Jiaming Yuan	f79cc4a7a4	Implement categorical prediction for CPU and GPU predict leaf. (#7001 ) * Categorical prediction with CPU predictor and GPU predict leaf. * Implement categorical prediction for CPU prediction. * Implement categorical prediction for GPU predict leaf. * Refactor the prediction functions to have a unified get next node function. Co-authored-by: Shvets Kirill <kirill.shvets@intel.com>	2021-06-11 10:11:45 +08:00
Jiaming Yuan	72f9daf9b6	Fix `gpu_id` with custom objective. (#7015 )	2021-06-09 14:51:17 +08:00
TP Boudreau	bd2ca543c4	Fix BinarySearchBin() argument types (#7026 )	2021-06-08 19:05:46 +08:00
ShvetsKS	5cdaac00c1	Remove feature grouping (#7018 ) Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>	2021-06-03 04:35:26 +08:00
ShvetsKS	57c732655e	Merge lossguide and depthwise strategies for CPU hist (#7007 ) * fix java/scala test: max depth is also a valid parameter for lossguide Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>	2021-06-03 01:49:43 +08:00
Jiaming Yuan	ee4f51a631	Support for all primitive types from array. (#7003 ) * Change C API name. * Test for all primitive types from array. * Add native support for CPU 128 float. * Convert boolean and float16 in Python. * Fix dask version for now.	2021-06-01 08:34:48 +08:00
Jiaming Yuan	816b789bf0	Add predictor to skl constructor. (#7000 )	2021-05-29 04:52:56 +08:00
ShvetsKS	55b823b27d	Reduce 'InitSampling' complexity and set gradients to zero (#6922 ) Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>	2021-05-29 04:52:23 +08:00
Jiaming Yuan	4cf95a6041	Support numpy array interface (#6998 )	2021-05-27 16:08:22 +08:00
Jiaming Yuan	86e60e3ba8	Guard against index error in prediction. (#6982 ) * Remove `best_ntree_limit` from documents.	2021-05-25 23:24:59 +08:00
Jiaming Yuan	6e52aefb37	Revert OMP guard. (#6987 ) The guard protects the global variable from being changed by XGBoost. But this leads to a bug where the `n_threads` parameter is no longer used after the first iteration, because `omp_set_num_threads` is only called once in `Learner::Configure` at the beginning of the training process. The guard is still useful for `gpu_id`, since that is set throughout our codebase regardless of which iteration is currently running.	2021-05-25 08:56:28 +08:00
Livius	a4886c404a	Fix compilation error on x86 (#6964 ) Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2021-05-14 13:31:49 +08:00
Jiaming Yuan	44cc9c04ea	Fix multiclass auc with empty dataset. (#6947 )	2021-05-12 15:01:14 +08:00
Andrew Ziem	3e7e426b36	Fix spelling in documents (#6948 ) * Update roxygen2 doc. Co-authored-by: fis <jm.yuan@outlook.com>	2021-05-11 20:44:36 +08:00
Jiaming Yuan	37ad60fe25	Enforce input data is not `object`. (#6927 ) * Check for object data type. * Allow strided arrays with greater underlying buffer size.	2021-05-02 00:09:01 +08:00
Jiaming Yuan	8760ec4827	Ensure predict leaf outputs a 1-dim vector when there's only 1 tree. (#6889 )	2021-04-23 15:07:48 +08:00
Jiaming Yuan	a2ecbdaa31	Add an API guard to prevent global variables being changed. (#6891 )	2021-04-23 10:27:57 +08:00
Jiaming Yuan	556a83022d	Implement unified update prediction cache for (gpu_)hist. (#6860 ) * Implement utilities for linalg. * Unify the update prediction cache functions. * Implement update prediction cache for multi-class gpu hist.	2021-04-17 00:29:34 +08:00
Jiaming Yuan	1b26a2a561	Copy output data for argsort. (#6866 ) Fix GPU AUC.	2021-04-16 21:05:01 +08:00
Jiaming Yuan	f294c4e023	Use constexpr in `dh::CopyIf`. (#6828 )	2021-04-08 07:37:47 +08:00
Jiaming Yuan	7bcc8b3e5c	Use batched copy if. (#6826 )	2021-04-06 10:34:04 +08:00
Jiaming Yuan	7e06c81894	Fix approximated predict contribution. (#6811 )	2021-04-03 02:15:03 +08:00
Jiaming Yuan	b1fdb220f4	Remove deprecated `n_gpus` parameter. (#6821 )	2021-04-02 03:02:32 +08:00
Jiaming Yuan	905fdd3e08	Fix typos in AUC. (#6795 )	2021-03-31 16:35:42 +08:00
Jiaming Yuan	3039dd194b	Don't estimate sketch batch size when rmm is used. (#6807 )	2021-03-31 15:29:56 +08:00
Jiaming Yuan	138fe8516a	Remove unnecessary calls to iota. (#6797 )	2021-03-31 15:27:23 +08:00
Jiaming Yuan	79b8b560d2	Optimize dart inplace predict perf. (#6804 )	2021-03-31 15:20:54 +08:00
Jiaming Yuan	a59c7323b4	Fix inplace predict missing value. (#6787 )	2021-03-27 05:36:10 +08:00
ShvetsKS	8825670c9c	Memory consumption fix for row-major adapters (#6779 ) Co-authored-by: Kirill Shvets <kirill.shvets@intel.com> Co-authored-by: fis <jm.yuan@outlook.com>	2021-03-26 08:44:30 +08:00
Jiaming Yuan	a7083d3c13	Fix dart inplace prediction with GPU input. (#6777 ) * Fix dart inplace predict with data on GPU, which might trigger a fatal check for device access right. * Avoid copying data whenever possible.	2021-03-25 12:00:32 +08:00
Jiaming Yuan	1d90577800	Verify strictly positive labels for gamma regression. (#6778 ) Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2021-03-25 11:46:52 +08:00
Jiaming Yuan	794fd6a46b	Support v3 cuda array interface. (#6776 )	2021-03-25 09:58:09 +08:00
Jiaming Yuan	bcc0277338	Re-implement ROC-AUC. (#6747 ) * Re-implement ROC-AUC. * Binary * MultiClass * LTR * Add documents. This PR resolves a few issues: - Define a value when the dataset is invalid, which can happen if there's an empty dataset, or when the dataset contains only positive or negative values. - Define ROC-AUC for multi-class classification. - Define weighted average value for distributed setting. - A correct implementation for learning to rank task. Previous implementation is just binary classification with averaging across groups, which doesn't measure ordered learning to rank.	2021-03-20 16:52:40 +08:00
Jiaming Yuan	4ee8340e79	Support column major array. (#6765 )	2021-03-20 05:19:46 +08:00
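
As a rough illustration of the ROC-AUC commit above (bcc0277338), which defines a value for invalid datasets (empty, or containing only positive or only negative labels), here is a minimal Python sketch of weighted binary ROC-AUC. It is not XGBoost's implementation: the function name `binary_roc_auc`, the trapezoidal formulation, and the choice to return NaN when a class is missing are all assumptions made for this example.

```python
import math


def binary_roc_auc(labels, scores, weights=None):
    """Weighted binary ROC-AUC computed with the trapezoidal rule.

    Hypothetical helper for illustration only. Returns NaN when the
    dataset is empty or contains a single class (an assumed convention).
    """
    if weights is None:
        weights = [1.0] * len(labels)

    total_pos = sum(w for y, w in zip(labels, weights) if y == 1)
    total_neg = sum(w for y, w in zip(labels, weights) if y != 1)
    if total_pos == 0 or total_neg == 0:
        # Invalid dataset: AUC is undefined, so report NaN.
        return math.nan

    # Visit samples from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = fp = 0.0            # running weighted true/false positives
    prev_tp = prev_fp = 0.0  # values at the previous threshold
    area = 0.0
    i = 0
    while i < len(order):
        # Samples with tied scores share one threshold, so consume them together.
        j = i
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            k = order[j]
            if labels[k] == 1:
                tp += weights[k]
            else:
                fp += weights[k]
            j += 1
        # Trapezoid between the previous and the current ROC point.
        area += (fp - prev_fp) * (tp + prev_tp) / 2.0
        prev_tp, prev_fp = tp, fp
        i = j

    # Normalize the unnormalized area to the unit square.
    return area / (total_pos * total_neg)
```

Calling `binary_roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` exercises the valid path, while an empty input or a single-class input takes the NaN branch.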

1310 Commits