xgboost

Author	SHA1	Message	Date
Jiaming Yuan	972730cde0	Use matrix for gradient. (#9508 ) - Use the `linalg::Matrix` for storing gradients. - New API for the custom objective. - Custom objective for multi-class/multi-target is now required to return the correct shape. - Custom objective for Python can accept arrays with any strides. (row-major, column-major)	2023-08-24 05:29:52 +08:00
Jiaming Yuan	acc110c251	[MT-TREE] Support prediction cache and model slicing. (#8968 ) - Fix prediction range. - Support prediction cache in mt-hist. - Support model slicing. - Make the booster a Python iterable by defining `__iter__`. - Cleanup removed/deprecated parameters. - A new field in the output model `iteration_indptr` for pointing to the ranges of trees for each iteration.	2023-03-27 23:10:54 +08:00
Jiaming Yuan	6deaec8027	Pass obj info by reference instead of by value. (#8889 ) - Pass obj info into tree updater as const pointer. This way we don't have to initialize the learner model param before configuring gbm, hence breaking up the dependency of configurations.	2023-03-11 01:38:28 +08:00
Jiaming Yuan	228a46e8ad	Support learning rate for zero-hessian objectives. (#8866 )	2023-03-06 20:33:28 +08:00
Jiaming Yuan	3e26107a9c	Rename and extract `Context`. (#8528 ) * Rename `GenericParameter` to `Context`. * Rename header file to reflect the change. * Rename all references.	2022-12-07 04:58:54 +08:00
Rory Mitchell	90cce38236	Remove single_precision_histogram for gpu_hist (#7828 )	2022-05-03 14:53:19 +02:00
Jiaming Yuan	fdf533f2b9	[POC] Experimental support for l1 error. (#7812 ) Support adaptive tree, a feature supported by both sklearn and lightgbm. The tree leaf is recomputed based on residue of labels and predictions after construction. For l1 error, the optimal value is the median (50 percentile). This is marked as experimental support for the following reasons: - The value is not well defined for distributed training, where we might have empty leaves for local workers. Right now I just use the original leaf value for computing the average with other workers, which might cause significant errors. - Some follow-ups are required, for exact, pruner, and optimization for quantile function. Also, we need to calculate the initial estimation.	2022-04-26 21:41:55 +08:00
Haoming Chen	55463b76c1	Initialize TreeUpdater ctx_ with nullptr (#7722 )	2022-03-10 22:33:32 +08:00
Jiaming Yuan	5d7818e75d	Remove `omp_get_max_threads` in tree updaters. (#7590 )	2022-01-26 19:55:47 +08:00
Jiaming Yuan	b06040b6d0	Implement a general array view. (#7365 ) * Replace existing matrix and vector view. This is to prepare for handling higher dimension data and prediction when we support multi-target models.	2021-11-05 04:16:11 +08:00
Jiaming Yuan	4100827971	Pass infomation about objective to tree methods. (#7385 ) * Define the `ObjInfo` and pass it down to every tree updater.	2021-11-04 01:52:44 +08:00
Jiaming Yuan	556a83022d	Implement unified update prediction cache for (gpu_)hist. (#6860 ) * Implement utilites for linalg. * Unify the update prediction cache functions. * Implement update prediction cache for multi-class gpu hist.	2021-04-17 00:29:34 +08:00
ShvetsKS	7f4d3a91b9	Multiclass prediction caching for CPU Hist (#6550 ) Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>	2021-01-13 04:42:07 +08:00
vcarpani	6bc9747df5	Reduce compile warnings (#6198 ) Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-10-08 23:14:59 +08:00
Philip Hyunsu Cho	1d22a9be1c	Revert "Reorder includes. (#5749 )" (#5771 ) This reverts commit d3a0efbf162f3dceaaf684109e1178c150b32de3.	2020-06-09 10:29:28 -07:00
Jiaming Yuan	d3a0efbf16	Reorder includes. (#5749 ) * Reorder includes. * R.	2020-06-03 17:30:47 +12:00
Jiaming Yuan	0012f2ef93	Upgrade clang-tidy on CI. (#5469 ) * Correct all clang-tidy errors. * Upgrade clang-tidy to 10 on CI. Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-04-05 04:42:29 +08:00
Jiaming Yuan	ab7a46a1a4	Check whether current updater can modify a tree. (#5406 ) * Check whether current updater can modify a tree. * Fix tree model JSON IO for pruned trees.	2020-03-14 09:24:08 +08:00
Jiaming Yuan	0110754a76	Remove update prediction cache from predictors. (#5312 ) Move this function into gbtree, and uses only updater for doing so. As now the predictor knows exactly how many trees to predict, there's no need for it to update the prediction cache.	2020-02-17 11:35:47 +08:00
Jiaming Yuan	e089e16e3d	Pass pointer to model parameters. (#5101 ) * Pass pointer to model parameters. This PR de-duplicates most of the model parameters except the one in `tree_model.h`. One difficulty is `base_score` is a model property but can be changed at runtime by objective function. Hence when performing model IO, we need to save the one provided by users, instead of the one transformed by objective. Here we created an immutable version of `LearnerModelParam` that represents the value of model parameter after configuration.	2019-12-10 12:11:22 +08:00
Jiaming Yuan	7ef5b78003	Implement JSON IO for updaters (#5094 ) * Implement JSON IO for updaters. * Remove parameters in split evaluator.	2019-12-07 00:24:00 +08:00
Jiaming Yuan	095de3bf5f	Export c++ headers in CMake installation. (#4897 ) * Move get transpose into cc. * Clean up headers in host device vector, remove thrust dependency. * Move span and host device vector into public. * Install c++ headers. * Short notes for c and c++. Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2019-10-06 23:53:09 -04:00
Jiaming Yuan	f0064c07ab	Refactor configuration [Part II]. (#4577 ) * Refactor configuration [Part II]. * General changes: Remove `Init` methods to avoid ambiguity. Remove `Configure(std::map<>)` to avoid redundant copying and prepare for parameter validation. (`std::vector` is returned from `InitAllowUnknown`). ** Add name to tree updaters for easier debugging. * Learner changes: Make `LearnerImpl` the only source of configuration. All configurations are stored and carried out by `LearnerImpl::Configure()`. Remove booster in C API. Originally kept for "compatibility reason", but did not state why. So here we just remove it. Add a `metric_names_` field in `LearnerImpl`. Remove `LazyInit`. Configuration will always be lazy. ** Run `Configure` before every iteration. * Predictor changes: Allocate both cpu and gpu predictor. Remove cpu_predictor from gpu_predictor. `GBTree` is now used to dispatch the predictor. ** Remove some GPU Predictor tests. * IO No IO changes. The binary model format stability is tested by comparing hashing value of save models between two commits	2019-07-20 08:34:56 -04:00
Jiaming Yuan	c589eff941	De-duplicate GPU parameters. (#4454 ) * Only define `gpu_id` and `n_gpus` in `LearnerTrainParam` * Pass LearnerTrainParam through XGBoost vid factory method. * Disable all GPU usage when GPU related parameters are not specified (fixes XGBoost choosing GPU over aggressively). * Test learner train param io. * Fix gpu pickling.	2019-05-29 11:55:57 +08:00
Rory Mitchell	ccf80703ef	Clang-tidy static analysis (#3222 ) * Clang-tidy static analysis * Modernise checks * Google coding standard checks * Identifier renaming according to Google style	2018-04-19 18:57:13 +12:00
Andrew V. Adinetz	d5992dd881	Replaced std::vector-based interfaces with HostDeviceVector-based interfaces. (#3116 ) * Replaced std::vector-based interfaces with HostDeviceVector-based interfaces. - replacement was performed in the learner, boosters, predictors, updaters, and objective functions - only interfaces used in training were replaced; interfaces like PredictInstance() still use std::vector - refactoring necessary for replacement of interfaces was also performed, such as using HostDeviceVector in prediction cache * HostDeviceVector-based interfaces for custom objective function example plugin.	2018-02-28 13:00:04 +13:00
Thejaswi	84ab74f3a5	Objective function evaluation on GPU with minimal PCIe transfers (#2935 ) * Added GPU objective function and no-copy interface. - xgboost::HostDeviceVector<T> syncs automatically between host and device - no-copy interfaces have been added - default implementations just sync the data to host and call the implementations with std::vector - GPU objective function, predictor, histogram updater process data directly on GPU	2018-01-12 21:33:39 +13:00
Sergei Lebedev	e5e721722e	Fix compilation on OS X with GCC 7 (#2256 ) * Fix compilation on OS X with GCC 7 Compilation failed with In file included from src/tree/tree_updater.cc:6:0: include/xgboost/tree_updater.h:75:46: error: 'function' is not a member of 'std' std::function<TreeUpdater* ()> > { caused by a missing <functional> include. * Fixed another occurence of that issue spotted by @ClimberPG	2017-05-19 22:04:07 -07:00
Rory Mitchell	6bf968efe6	[GPU Plugin] Fast histogram speed improvements. Updated benchmarks. (#2258 )	2017-05-08 09:21:38 -07:00
Philip Cho	14fba01b5a	Improve multi-threaded performance (#2104 ) * Add UpdatePredictionCache() option to updaters Some updaters (e.g. fast_hist) has enough information to quickly compute prediction cache for the training data. Each updater may override UpdaterPredictionCache() method to update the prediction cache. Note: this trick does not apply to validation data. * Respond to code review * Disable some debug messages by default * Document UpdatePredictionCache() interface * Remove base_margin logic from UpdatePredictionCache() implementation * Do not take pointer to cfg, as reference may get stale * Improve multi-threaded performance * Use columnwise accessor to accelerate ApplySplit() step, with support for a compressed representation * Parallel sort for evaluation step * Inline BuildHist() function * Cache gradient pairs when building histograms in BuildHist() * Add missing #if macro * Respond to code review * Use wrapper to enable parallel sort on Linux * Fix C++ compatibility issues * MSVC doesn't support unsigned in OpenMP loops * gcc 4.6 doesn't support using keyword * Fix lint issues * Respond to code review * Fix bug in ApplySplitSparseData() * Attempting to read beyond the end of a sparse column * Mishandling the case where an entire range of rows have missing values * Fix training continuation bug Disable UpdatePredictionCache() in the first iteration. This way, we can accomodate the scenario where we build off of an existing (nonempty) ensemble. * Add regression test for fast_hist * Respond to code review * Add back old version of ApplySplitSparseData	2017-03-25 10:35:01 -07:00
Tianqi Chen	df38f251be	Fix warnings from g++5 or higher (#1510 )	2016-08-26 16:14:10 -07:00
RAMitchell	93196eb811	cmake build system (#1314 ) * Changed c api to compile under MSVC * Include functional.h header for MSVC * Add cmake build	2016-07-02 19:07:35 -07:00
tqchen	634db18a0f	[TRAVIS] cleanup travis script	2016-01-16 10:25:12 -08:00
tqchen	d75e3ed05d	[LIBXGBOOST] pass demo running.	2016-01-16 10:24:01 -08:00
tqchen	0d95e863c9	[LEARNER] refactor learner	2016-01-16 10:24:01 -08:00
tqchen	e4567bbc47	[REFACTOR] Add alias, allow missing variables, init gbm interface	2016-01-16 10:24:01 -08:00
tqchen	d4677b6561	[TREE] finish move of updater	2016-01-16 10:24:01 -08:00
tqchen	c8ccb61b9e	[TREE] Enable updater registry	2016-01-16 10:24:01 -08:00

38 Commits