xgboost

Author	SHA1	Message	Date
Jiaming Yuan	81210420c6	Remove `omp_get_max_threads` (#7608 ) This is the one last PR for removing omp global variable. * Add context object to the `DMatrix`. This bridges `DMatrix` with https://github.com/dmlc/xgboost/issues/7308 . * Require context to be available at the construction time of booster. * Add `n_threads` support for R csc DMatrix constructor. * Remove `omp_get_max_threads` in R glue code. * Remove threading utilities that rely on omp global variable.	2022-01-28 16:09:22 +08:00
Jiaming Yuan	58a6723eb1	Initial support for multioutput regression. (#7514 ) * Add num target model parameter, which is configured from input labels. * Change elementwise metric and indexing for weights. * Add demo. * Add tests.	2021-12-18 09:28:38 +08:00
Jiaming Yuan	5b1161bb64	Convert labels into tensor. (#7456 ) * Add a new ctor to tensor for `initializer_list`. * Change labels from host device vector to tensor. * Rename the field from `labels_` to `labels` since it's a public member.	2021-12-17 00:58:35 +08:00
Jiaming Yuan	bd1f3a38f0	Rewrite sparse dmatrix using callbacks. (#7092 ) - Reduce dependency on dmlc parsers and provide an interface for users to load data by themselves. - Remove use of threaded iterator and IO queue. - Remove `page_size`. - Make sure the number of pages in memory is bounded. - Make sure the cache can not be violated. - Provide an interface for internal algorithms to process data asynchronously.	2021-07-16 12:33:31 +08:00
Jiaming Yuan	556a83022d	Implement unified update prediction cache for (gpu_)hist. (#6860 ) * Implement utilities for linalg. * Unify the update prediction cache functions. * Implement update prediction cache for multi-class gpu hist.	2021-04-17 00:29:34 +08:00
Jiaming Yuan	f6fe15d11f	Improve parameter validation (#6769 ) * Add quotes to unused parameters. * Check for whitespace.	2021-03-20 01:56:55 +08:00
Jiaming Yuan	9da2287ab8	[breaking] Save booster feature info in JSON, remove feature name generation. (#6605 ) * Save feature info in booster in JSON model. * [breaking] Remove automatic feature name generation in `DMatrix`. This PR is to enable reliable feature validation in Python package.	2021-02-25 18:54:16 +08:00
Jiaming Yuan	4656b09d5d	[breaking] Add prediction function for DMatrix and use inplace predict for dask. (#6668 ) * Add a new API function for predicting on `DMatrix`. This function aligns with the rest of the `XGBoosterPredictFrom` functions on the semantics of function arguments. Purge `ntree_limit` from libxgboost, use iteration instead. * [dask] Use `inplace_predict` by default for dask sklearn models. * [dask] Run prediction shape inference on worker instead of client. The breaking change is in the Python sklearn `apply` function, which is now consistent with the other prediction functions, where `best_iteration` is used by default.	2021-02-08 18:26:32 +08:00
Jiaming Yuan	c3c8e66fc9	Make prediction functions thread safe. (#6648 )	2021-01-28 23:29:43 +08:00
Jiaming Yuan	519cee115a	Avoid resetting seed for every configuration. (#6349 )	2020-11-06 10:28:35 +08:00
Jiaming Yuan	2cc9662005	Support slicing tree model (#6302 ) This PR is meant to end the confusion around best_ntree_limit and unify model slicing. We have multi-class and random forests; asking users to understand how to set ntree_limit is difficult and error-prone. * Implement the save_best option in early stopping. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-11-02 23:27:39 -08:00
Jiaming Yuan	9c6e791e64	Enforce tree order in JSON. (#5974 ) * Make JSON model IO more future proof by using tree id in model loading.	2020-08-05 16:44:52 +08:00
boxdot	d268a2a463	Thread-safe prediction by making the prediction cache thread-local. (#5853 ) Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-07-30 12:33:50 +08:00
Jiaming Yuan	21ed1f0c6d	Support 64bit seed. (#5643 )	2020-05-07 14:52:38 +08:00
Jiaming Yuan	6671b42dd4	Use ellpack for prediction only when sparsepage doesn't exist. (#5504 )	2020-04-10 12:15:46 +08:00
Jiaming Yuan	0012f2ef93	Upgrade clang-tidy on CI. (#5469 ) * Correct all clang-tidy errors. * Upgrade clang-tidy to 10 on CI. Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-04-05 04:42:29 +08:00
Jiaming Yuan	4942da64ae	Refactor tests with data generator. (#5439 )	2020-03-27 06:44:44 +08:00
Rory Mitchell	b0ed3f0a66	Remove unnecessary DMatrix methods (#5324 )	2020-02-25 12:40:39 +13:00
Jiaming Yuan	911a902835	Merge model compatibility fixes from 1.0rc branch. (#5305 ) * Port test model compatibility. * Port logit model fix. https://github.com/dmlc/xgboost/pull/5248 https://github.com/dmlc/xgboost/pull/5281	2020-02-13 20:41:58 +08:00
Jiaming Yuan	29eeea709a	Pass shared pointer instead of raw pointer to Learner. (#5302 ) Extracted from https://github.com/dmlc/xgboost/pull/5220 .	2020-02-11 14:16:38 +08:00
Jiaming Yuan	3eb1279bbf	Config for linear updaters. (#5222 )	2020-01-25 11:26:46 +08:00
Kodi Arfer	f100b8d878	[Breaking] Don't drop trees during DART prediction by default (#5115 ) * Simplify DropTrees calling logic * Add `training` parameter for prediction method. * [Breaking]: Add `training` to C API. * Change for R and Python custom objective. * Correct comment. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu> Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-01-13 21:48:30 +08:00
Jiaming Yuan	7b65698187	Enforce correct data shape. (#5191 ) * Fix syncing DMatrix columns. * notes for tree method. * Enable feature validation for all interfaces except for jvm. * Better tests for boosting from predictions. * Disable validation on JVM.	2020-01-13 15:48:17 +08:00
Jiaming Yuan	ebc86a3afa	Disable parameter validation for Scikit-Learn interface. (#5167 ) * Disable parameter validation for now. Scikit-Learn passes all parameters down to XGBoost, whether they are used or not. * Add option `validate_parameters`.	2020-01-07 11:17:31 +08:00
Jiaming Yuan	f3d7877802	Parameter validation (#5157 ) * Unused code. * Split up old colmaker parameters from train param. * Fix dart. * Better name.	2019-12-26 11:59:05 +08:00
Jiaming Yuan	ad4a1c732c	Small refinements for JSON model. (#5112 ) * Naming consistency. * Remove duplicated test.	2019-12-11 19:49:01 +08:00
Jiaming Yuan	208ab3b1ff	Model IO in JSON. (#5110 )	2019-12-11 11:20:40 +08:00
Jiaming Yuan	f24be2efb4	Use configure_file() to configure version only (#4974 ) * Avoid writing build_config.h * Remove build_config.h all together. * Lint.	2019-10-22 23:47:00 -07:00
Jiaming Yuan	a5f232feb8	Fix calling GPU predictor (#4836 ) * Fix calling GPU predictor	2019-09-05 19:09:38 -04:00
Rong Ou	38ab79f889	Make HostDeviceVector single gpu only (#4773 ) * Make HostDeviceVector single gpu only	2019-08-26 09:51:13 +12:00
Rong Ou	c5b229632d	[BREAKING] prevent multi-gpu usage (#4749 ) * prevent multi-gpu usage * fix distributed test * combine gpu predictor tests * set upper bound on n_gpus	2019-08-13 09:11:35 +12:00
Bobby	3e2c472944	Fix model parameter recovery (#4738 )	2019-08-07 02:32:10 -04:00
Rong Ou	851b5b3808	Remove gpu_exact tree method (#4742 )	2019-08-07 11:43:20 +12:00
Jiaming Yuan	4fe0d8203e	Specify version macro in CMake. (#4730 ) * Specify version macro in CMake. * Use `XGBOOST_DEFINITIONS` instead.	2019-08-04 06:04:04 -04:00
Jiaming Yuan	f0064c07ab	Refactor configuration [Part II]. (#4577 ) * Refactor configuration [Part II]. * General changes: Remove `Init` methods to avoid ambiguity. Remove `Configure(std::map<>)` to avoid redundant copying and prepare for parameter validation. (`std::vector` is returned from `InitAllowUnknown`). ** Add name to tree updaters for easier debugging. * Learner changes: Make `LearnerImpl` the only source of configuration. All configurations are stored and carried out by `LearnerImpl::Configure()`. Remove booster in C API. Originally kept for "compatibility reason", but did not state why. So here we just remove it. Add a `metric_names_` field in `LearnerImpl`. Remove `LazyInit`. Configuration will always be lazy. ** Run `Configure` before every iteration. * Predictor changes: Allocate both cpu and gpu predictor. Remove cpu_predictor from gpu_predictor. `GBTree` is now used to dispatch the predictor. ** Remove some GPU Predictor tests. * IO No IO changes. The binary model format stability is tested by comparing hash values of saved models between two commits.	2019-07-20 08:34:56 -04:00
Jiaming Yuan	c5719cc457	Offload some configurations into GBM. (#4553 ) This is part 1 of refactoring configuration. * Move tree heuristic configurations. * Split up declarations and definitions for GBTree. * Implement UseGPU in gbm.	2019-06-14 09:18:51 +08:00
Jiaming Yuan	c589eff941	De-duplicate GPU parameters. (#4454 ) * Only define `gpu_id` and `n_gpus` in `LearnerTrainParam` * Pass LearnerTrainParam through XGBoost via factory method. * Disable all GPU usage when GPU related parameters are not specified (fixes XGBoost choosing GPU over aggressively). * Test learner train param io. * Fix gpu pickling.	2019-05-29 11:55:57 +08:00
Jiaming Yuan	7b9043cf71	Fix clang-tidy warnings. (#4149 ) * Upgrade gtest for clang-tidy. * Use CMake to install GTest instead of mv. * Don't enforce clang-tidy to return 0 due to errors in thrust. * Add a small test for tidy itself. * Reformat.	2019-03-13 02:25:51 +08:00
Jiaming Yuan	7ea5675679	Add PushCSC for SparsePage. (#4193 ) * Add PushCSC for SparsePage. * Move Push* definitions into cc file. * Add std:: prefix to `size_t` to make clang++ happy. * Address monitor count == 0.	2019-03-02 01:58:08 +08:00
Jiaming Yuan	754fe8142b	Make `HistCutMatrix::Init` be aware of groups. (#4115 ) * Add checks for group size. * Simple docs. * Search group index during hist cut matrix initialization. Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com> Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2019-02-16 04:39:41 +08:00
Jiaming Yuan	be948df23f	Fix ignoring dart in updater configuration. (#4024 ) * Fix ignoring dart in updater configuration.	2018-12-26 18:24:45 +08:00
Jiaming Yuan	19ee0a3579	Refactor fast-hist, add tests for some updaters. (#3836 ) Add unittest for prune. Add unittest for refresh. Refactor fast_hist. * Remove fast_hist_param. * Rename to quantile_hist. Add unittests for QuantileHist. * Refactor QuantileHist into .h and .cc file. * Remove sync.h. * Remove MGPU_mock test. Rename fast hist method to quantile hist.	2018-11-07 21:15:07 +13:00
Philip Hyunsu Cho	91537e7353	Fix #3342 and h2oai/h2o4gpu#625 : Save predictor parameters in model file (#3856 ) * Fix #3342 and h2oai/h2o4gpu#625: Save predictor parameters in model file This allows pickled models to retain predictor attributes, such as 'predictor' (whether to use CPU or GPU) and 'n_gpu' (number of GPUs to use). Related: h2oai/h2o4gpu#625 Closes #3342. TODO. Write a test. * Fix lint * Do not load GPU predictor into CPU-only XGBoost * Add a test for pickling GPU predictors * Make sample data big enough to pass multi GPU test * Update test_gpu_predictor.cu	2018-11-03 21:45:38 -07:00
Philip Hyunsu Cho	ad68865d6b	[Blocking] Fix #3840 : Clean up logic for parsing tree_method parameter (#3849 ) * Clean up logic for converting tree_method to updater sequence * Use C++11 enum class for extra safety Compiler will give warnings if switch statements don't handle all possible values of C++11 enum class. Also allow enum class to be used as DMLC parameter. * Fix compiler error + lint * Address reviewer comment * Better docstring for DECLARE_FIELD_ENUM_CLASS * Fix lint * Add C++ test to see if tree_method is recognized * Fix clang-tidy error * Add test_learner.h to R package * Update comments * Fix lint error	2018-11-01 19:33:35 -07:00
trivialfis	cf2d86a4f6	Add travis sanitizers tests. (#3557 ) * Add travis sanitizers tests. * Add gcc-7 in Travis. * Add SANITIZER_PATH for CMake. * Enable sanitizer tests in Travis. * Fix memory leaks in tests. * Fix all memory leaks reported by Address Sanitizer. * tests/cpp/helpers.h/CreateDMatrix now returns raw pointer.	2018-08-19 16:40:30 +12:00
Rory Mitchell	ef23e424f1	[GPU-Plugin] Add GPU accelerated prediction (#2593 ) * [GPU-Plugin] Add GPU accelerated prediction * Improve allocation message * Update documentation * Resolve linker error for predictor * Add unit tests	2017-08-16 12:31:59 +12:00

46 Commits