xgboost

Author	SHA1	Message	Date
Philip Hyunsu Cho	f1a4a1ac95	[CI] Upgrade build image to CentOS 7 + GCC 8; require CUDA 10.1 and later (#7141 )	2021-07-29 10:54:33 -07:00
graue70	dfdf0b08fc	Fix typo and grammatical mistake in error message (#7134 )	2021-07-28 17:17:05 +08:00
Gil Forsyth	92ae3abc97	[dask] Disallow importing non-dask estimators from xgboost.dask (#7133) * Disallow importing non-dask estimators from xgboost.dask This is mostly a style change, but also avoids a user error (that I have committed on a few occasions). Since `XGBRegressor` and `XGBClassifier` are imported as parent classes for the `dask` estimators, without defining an `__all__`, autocomplete (or muscle memory) will produce the following with little prompting: `from xgboost.dask import XGBClassifier` There's nothing inherently wrong with that, but given that `XGBClassifier` is not `dask` enabled, it can lead to confusing behavior until you figure out you should've typed `from xgboost.dask import DaskXGBClassifier` Another option is to alias import the existing non-dask estimators. * Remove base/iter class, add train predict funcs	2021-07-28 02:07:23 +08:00
Robert Maynard	1a75f43304	Allow compilation with nvcc 11.4 (#7131 ) * Use type aliases for discard iterators * update to include host_vector as thrust 1.12 doesn't bring it in as a side-effect * cub::DispatchRadixSort requires signed offset types	2021-07-27 20:05:33 +08:00
Jiaming Yuan	7017dd5a26	[JVM-Packages] Use Python tracker in XGBoost for JVM package. (#7132 )	2021-07-27 16:20:42 +08:00
Jiaming Yuan	48d5de80a2	[R] Fix softprob reshape. (#7126 )	2021-07-27 15:25:17 +08:00
Jiaming Yuan	7ee7a95b84	Use upstream URI in distributed quantile tests. (#7129 ) * Use upstream URI in distributed quantile tests. * Fix test cv `PytestAssertRewriteWarning`.	2021-07-27 14:09:49 +08:00
Jiaming Yuan	e88ac9cc54	[dask] Extend tree stats tests. (#7128 ) * Add tests to GPU. * Assert cover in children sums up to the parent.	2021-07-27 12:22:13 +08:00
Jiaming Yuan	778135f657	Fix parameter loading with training continuation. (#7121 ) * Add a demo for training continuation.	2021-07-23 10:51:47 +08:00
Taewoo Kim	41e882f80b	Check input value is duplicated when quantile queue is full (#7091 ) Co-authored-by: Taewoo Kim <taewoo@layer6.com>	2021-07-23 03:07:01 +08:00
ShvetsKS	caa9e527dd	Remove extra sync for dense data (#7120 ) Co-authored-by: SHVETS, KIRILL <kirill.shvets@intel.com>	2021-07-22 19:02:31 +08:00
Jiaming Yuan	e6088366df	Export Python Interface for external memory. (#7070 ) * Add Python iterator interface. * Add tests. * Add demo. * Add documents. * Handle empty dataset.	2021-07-22 15:15:53 +08:00
farfarawayzyt	e64ee6592f	fix typo in src/common/hist.cc BuildHistKernel (#7116 )	2021-07-21 19:53:05 +08:00
naveenkb	9f7f8b976d	[XGBoost4J-Spark] bestIteration and bestScore for early stopping (#7095 )	2021-07-19 18:46:49 +08:00
farfarawayzyt	d7c14496d2	fix typo in arguments of PartitionBuilder::Init (#7113 ) Co-authored-by: Yuntian Zhang <zhangyt@lamda.nju.edu.cn>	2021-07-16 15:46:22 +08:00
Jiaming Yuan	bd1f3a38f0	Rewrite sparse dmatrix using callbacks. (#7092 ) - Reduce dependency on dmlc parsers and provide an interface for users to load data by themselves. - Remove use of threaded iterator and IO queue. - Remove `page_size`. - Make sure the number of pages in memory is bounded. - Make sure the cache can not be violated. - Provide an interface for internal algorithms to process data asynchronously.	2021-07-16 12:33:31 +08:00
Jiaming Yuan	2f524e9f41	[dask] Work around segfault in prediction. (#7112 )	2021-07-16 04:27:05 +08:00
Jiaming Yuan	abec3dbf6d	Fix thread safety of softmax prediction. (#7104 )	2021-07-16 02:06:55 +08:00
Philip Hyunsu Cho	2801d69fb7	[CI] Pin libomp to 11.1.0 (#7107 )	2021-07-15 11:16:51 +08:00
Jiaming Yuan	8e8232fb4c	[CI] Update R cache. (#7102 )	2021-07-14 03:15:35 +08:00
Jiaming Yuan	345796825f	Optional find dependency in installed cmake config. (#7099 ) * Find dependency only when xgboost is built as static library. * Resolve msvc warning. * Add test for linking shared library.	2021-07-11 17:20:55 +08:00
ZabelTech	1d91f71119	fix typo in `XGDMatrixSetFloatInfo` example (#7097 )	2021-07-10 21:40:25 +08:00
Jiaming Yuan	77f6cf2d13	Support hessian in host sketch container. (#7081 ) Prepare for migrating approx onto hist's codebase.	2021-07-08 16:33:58 +08:00
Jiaming Yuan	84d359efb8	Support host data in proxy DMatrix. (#7087 )	2021-07-08 11:35:48 +08:00
Jiaming Yuan	5d7cdf2e36	[Breaking] Rename Quantile DMatrix C API. (#7082) The role of ProxyDMatrix is going beyond what it was designed for. Now it's used by both QuantileDeviceDMatrix and inplace prediction. After the refactoring of sparse DMatrix it will also be used for external memory. Renaming the C API to extract it from QuantileDeviceDMatrix.	2021-07-08 11:34:14 +08:00
Jiaming Yuan	c766f143ab	Refactor external memory formats. (#7089 ) * Save base_rowid. * Return write size. * Remove unused function.	2021-07-08 04:04:51 +08:00
Jiaming Yuan	689eb8f620	Check external memory support for exact tree method. (#7088 )	2021-07-08 02:12:57 +08:00
Jiaming Yuan	615ab2b03e	Extract evaluate splits from CPU hist. (#7079) Other than modularizing the split evaluation function, this PR also removes some more functions including `InitNewNodes` and `BuildNodeStats` among some other unused variables. Also, scattered code like setting leaf weights is grouped into the split evaluator and `NodeEntry` is simplified and made private. Another subtle difference with the original implementation is that the modified code doesn't call `tree[nidx].Parent()` to traverse upward.	2021-07-07 15:16:25 +08:00
Jeff H	d22b293f2f	Update reference to treelite website (#7084 ) treelite.io is no longer a valid site and re-directs users to a parked domain. Re-directing to the documentation is safer at this point.	2021-07-06 22:15:07 -07:00
Jiaming Yuan	f937f514aa	Remove lz4 compression with external memory. (#7076 )	2021-07-06 14:46:43 +08:00
Jiaming Yuan	116d711815	Make `SimpleDMatrix` ctor reusable. (#7075 )	2021-07-06 13:38:24 +08:00
Jiaming Yuan	d7e1fa7664	Fix feature names and types in output model slice. (#7078 )	2021-07-06 11:47:49 +08:00
Jiaming Yuan	ffa66aace0	Persist data in dask test. (#7077 )	2021-07-06 11:47:17 +08:00
Jiaming Yuan	b56d3d5d5c	Fix with latest pandas range index. (#7074)	2021-07-03 16:43:52 +08:00
Jiaming Yuan	93f3acdef9	Fix with latest pylint. (#7071 )	2021-07-02 21:26:00 +08:00
Jiaming Yuan	a5d222fcdb	Handle categorical split in model histogram and dataframe. (#7065 ) * Error on get_split_value_histogram when feature is categorical * Add a category column to output dataframe	2021-07-02 13:10:36 +08:00
Jiaming Yuan	1cd20efe68	Move `GHistIndex` into `DMatrix`. (#7064 )	2021-07-01 00:44:49 +08:00
Jiaming Yuan	1c8fdf2218	Remove use of `device_idx` in `dh::LaunchN`. (#7063 ) It's an unused parameter, removing it can make the CI log more readable.	2021-06-29 11:37:26 +08:00
Philip Hyunsu Cho	dd4db347f3	Fix early stopping behavior with MAPE metric (#7061 )	2021-06-26 03:02:33 +08:00
Jiaming Yuan	8fa32fdda2	Implement categorical data support for SHAP. (#7053 ) * Add CPU implementation. * Update GPUTreeSHAP. * Add GPU implementation by defining custom split condition.	2021-06-25 19:02:46 +08:00
Jiaming Yuan	663136aa08	Implement feature score for linear model. (#7048 ) * Add feature score support for linear model. * Port R interface to the new implementation. * Add linear model support in Python. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2021-06-25 14:34:02 +08:00
Philip Hyunsu Cho	b2d300e727	[CI] Upgrade to CMake 3.14 (#7060 ) * [CI] Upgrade to CMake 3.14 * Add FATAL_ERROR directive, for users with CMake 2.x	2021-06-24 18:07:24 -07:00
Jiaming Yuan	1d4d345634	Tests for dask skl categorical data support. (#7054 )	2021-06-24 16:33:57 +08:00
Jiaming Yuan	da1ad798ca	Convert numpy float to Python float in feat score. (#7047 )	2021-06-21 20:58:43 +08:00
Jiaming Yuan	bbfffb444d	Fix race condition in CPU shap. (#7050 )	2021-06-21 10:03:15 +08:00
Jiaming Yuan	29f8fd6fee	Support categorical split in tree model dump. (#7036 )	2021-06-18 16:46:20 +08:00
Jiaming Yuan	7968c0d051	Test on s390x. (#7038 ) * Fix && remove unused parameter.	2021-06-18 14:55:08 +08:00
Jiaming Yuan	86715e4cd4	Support categorical data for dask functional interface and DQM. (#7043 ) * Support categorical data for dask functional interface and DQM. * Implement categorical data support for GPU GK-merge. * Add support for dask functional interface. * Add support for DQM. * Get newer cupy.	2021-06-18 13:06:52 +08:00
Jiaming Yuan	7dd29ffd47	Implement feature score in GBTree. (#7041 ) * Categorical data support. * Eliminate text parsing during feature score computation.	2021-06-18 11:53:16 +08:00
Jiaming Yuan	dcd84b3979	[CI] Configure RAPIDS, dask, modin (#7033 ) Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2021-06-18 10:27:51 +08:00


5414 Commits