xgboost

Author	SHA1	Message	Date
Jiaming Yuan	e0509b3307	Fix pruner. (#5335 ) * Honor the tree depth. * Prevent pruning pruned node.	2020-02-25 08:32:46 +08:00
Rory Mitchell	b0ed3f0a66	Remove unnecessary DMatrix methods (#5324 )	2020-02-25 12:40:39 +13:00
Jiaming Yuan	655cf17b60	Predict on Ellpack. (#5327 ) * Unify GPU prediction node. * Add `PageExists`. * Dispatch prediction on input data for GPU Predictor.	2020-02-23 06:27:03 +08:00
Philip Hyunsu Cho	cfae247231	Fix a small typo in sklearn.py that broke multiple eval metrics (#5341 )	2020-02-22 19:02:37 +08:00
Philip Hyunsu Cho	7ac7e8778f	Port patches from 1.0.0 branch (#5336 ) * Remove f-string, since it's not supported by Python 3.5 (#5330) * Remove f-string, since it's not supported by Python 3.5 * Add Python 3.5 to CI, to ensure compatibility * Remove duplicated matplotlib * Show deprecation notice for Python 3.5 * Fix lint * Fix lint * Fix a unit test that mistook MINOR ver for PATCH ver * Enforce only major version in JSON model schema * Bump version to 1.1.0-SNAPSHOT	2020-02-21 13:13:21 -08:00
Rory Mitchell	bc96ceb8b2	Refactor SparsePageSource, delete cache files after use (#5321 ) * Refactor sparse page source * Delete temporary cache files * Log fatal if cache exists * Log fatal if multiple threads used with prefetcher	2020-02-19 16:43:41 +13:00
Rory Mitchell	b2b2c4e231	Remove SimpleCSRSource (#5315 )	2020-02-18 16:49:17 +13:00
Jiaming Yuan	0110754a76	Remove update prediction cache from predictors. (#5312 ) Move this function into gbtree, and uses only updater for doing so. As now the predictor knows exactly how many trees to predict, there's no need for it to update the prediction cache.	2020-02-17 11:35:47 +08:00
Jiaming Yuan	e433a379e4	Fix changing locale. (#5314 ) * Fix changing locale. * Don't use locale guard. As number parsing is implemented in house, we don't need locale. * Update doc.	2020-02-17 11:31:13 +08:00
Jiaming Yuan	c35cdecddd	Move prediction cache to Learner. (#5220 ) * Move prediction cache into Learner. * Clean-ups - Remove duplicated cache in Learner and GBM. - Remove ad-hoc fix of invalid cache. - Remove `PredictFromCache` in predictors. - Remove prediction cache for linear altogether, as it's only moving the prediction into training process but doesn't provide any actual overall speed gain. - The cache is now unique to Learner, which means the ownership is no longer shared by any other components. * Changes - Add version to prediction cache. - Use weak ptr to check expired DMatrix. - Pass shared pointer instead of raw pointer.	2020-02-14 13:04:23 +08:00
Rory Mitchell	24ad9dec0b	Testing hist_util (#5251 ) * Rank tests * Remove categorical split specialisation * Extend tests to multiple features, switch to WQSketch * Add tests for SparseCuts * Add external memory quantile tests, fix some existing tests	2020-02-14 14:36:43 +13:00
Jiaming Yuan	911a902835	Merge model compatibility fixes from 1.0rc branch. (#5305 ) * Port test model compatibility. * Port logit model fix. https://github.com/dmlc/xgboost/pull/5248 https://github.com/dmlc/xgboost/pull/5281	2020-02-13 20:41:58 +08:00
Jiaming Yuan	29eeea709a	Pass shared pointer instead of raw pointer to Learner. (#5302 ) Extracted from https://github.com/dmlc/xgboost/pull/5220 .	2020-02-11 14:16:38 +08:00
Jiaming Yuan	595a00466d	Rewrite setup.py. (#5271 ) The setup.py is rewritten. This new script uses only Python code and provide customized implementation of setuptools commands. This way users can run most of setuptools commands just like any other Python libraries. * Remove setup_pip.py * Remove soft links. * Define customized commands. * Remove shell script. * Remove makefile script. * Update the doc for building from source.	2020-02-04 13:35:42 +08:00
Rong Ou	e4b74c4d22	Gradient based sampling for GPU Hist (#5093 ) * Implement gradient based sampling for GPU Hist tree method. * Add samplers and handle compacted page in GPU Hist.	2020-02-04 10:31:27 +08:00
Jiaming Yuan	a5cc112eea	Export JSON config in `get_params`. (#5256 )	2020-02-03 12:46:51 +08:00
Jiaming Yuan	ed0216642f	Avoid dask test fixtures. (#5270 ) * Fix Travis OSX timeout. * Fix classifier.	2020-02-03 12:39:20 +08:00
Jiaming Yuan	fe8d72b50b	Cleanup warnings. (#5247 ) From clang-tidy-9 and gcc-7: Invalid case style, narrowing definition, wrong initialization order, unused variables.	2020-01-31 14:52:15 +08:00
Jiaming Yuan	472ded549d	Save Scikit-Learn attributes into learner attributes. (#5245 ) * Remove the recommendation for pickle. * Save skl attributes in booster.attr * Test loading scikit-learn model with native booster.	2020-01-30 16:00:18 +08:00
Egor Smirnov	c67163250e	Optimized BuildHist function (#5156 )	2020-01-29 23:32:57 -08:00
Philip Hyunsu Cho	4240daed4e	Make `pip install xgboost.tar.gz` work by fixing build-python.sh (#5241 ) Make pip install xgboost.tar.gz work by fixing build-python.sh Simplify install doc * Add test * Install Miniconda for Linux target too * Build XGBoost only once in sdist * Try importing xgboost after installation * Don't set PYTHONPATH env var for sdist test	2020-01-28 23:18:23 -08:00
Jiaming Yuan	ef19480eda	Add dart to JSON schema. (#5218 ) * Add dart to JSON schema. * Use spaces instead of tab.	2020-01-28 13:29:09 +08:00
Rory Mitchell	1b3947d929	Make some GPU tests deterministic (#5229 )	2020-01-26 11:53:07 +13:00
Jiaming Yuan	3eb1279bbf	Config for linear updaters. (#5222 )	2020-01-25 11:26:46 +08:00
Jiaming Yuan	40680368cf	Add constraint parameters to Scikit-Learn interface. (#5227 ) * Add document for constraints. * Fix a format error in doc for objective function.	2020-01-25 11:12:02 +08:00
Philip Hyunsu Cho	44469a0ca9	Extensible binary serialization format for DMatrix::MetaInfo (#5187 ) * Turn xgboost::DataType into C++11 enum class * New binary serialization format for DMatrix::MetaInfo * Fix clang-tidy * Fix c++ test * Implement new format proposal * Move helper functions to anonymous namespace; remove unneeded field * Fix lint * Add shape. * Keep only roundtrip test. * Fix test. * various fixes * Update data.cc Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-01-23 11:33:17 -08:00
OrdoAbChao	b4f952bd22	[Breaking] Remove Scikit-Learn default parameters (#5130 ) * Simplify Scikit-Learn parameter management. * Copy base class for removing duplicated parameter signatures. * Set all parameters to None. * Handle None in set_param. * Extract the doc. Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-01-23 20:25:20 +08:00
Rory Mitchell	aa9a68010b	uint not supported in cudf (#5225 )	2020-01-23 16:59:18 +13:00
Jiaming Yuan	1891cc766d	Fix metainfo from DataFrame. (#5216 ) * Fix metainfo from DataFrame. * Unify helper functions for data and meta.	2020-01-22 16:29:44 +08:00
Rory Mitchell	9c56480c61	Support dmatrix construction from cupy array (#5206 )	2020-01-22 13:15:27 +13:00
Philip Hyunsu Cho	0184f2e9f7	Explicitly use UTF-8 codepage when using MSVC (#5197 ) * Explicitly use UTF-8 codepage when using MSVC * Fix build with CUDA enabled	2020-01-14 13:30:34 -08:00
Rory Mitchell	a73e25e15f	Implement slice via adapters (#5198 )	2020-01-14 12:55:41 +13:00
Kodi Arfer	f100b8d878	[Breaking] Don't drop trees during DART prediction by default (#5115 ) * Simplify DropTrees calling logic * Add `training` parameter for prediction method. * [Breaking]: Add `training` to C API. * Change for R and Python custom objective. * Correct comment. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu> Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-01-13 21:48:30 +08:00
Jiaming Yuan	7b65698187	Enforce correct data shape. (#5191 ) * Fix syncing DMatrix columns. * notes for tree method. * Enable feature validation for all interfaces except for jvm. * Better tests for boosting from predictions. * Disable validation on JVM.	2020-01-13 15:48:17 +08:00
Rory Mitchell	8cbcc53ccb	Remove old cudf constructor code (#5194 )	2020-01-10 16:35:23 +13:00
Rory Mitchell	87ebfc1315	Implement cudf construction with adapters. (#5189 )	2020-01-09 20:23:06 +13:00
Jiaming Yuan	ee287808fb	Lazy initialization of device vector. (#5173 ) * Lazy initialization of device vector. * Fix #5162. * Disable copy constructor of HostDeviceVector. Prevents implicit copying. * Fix CPU build. * Bring back move assignment operator.	2020-01-07 11:23:05 +08:00
Jiaming Yuan	ebc86a3afa	Disable parameter validation for Scikit-Learn interface. (#5167 ) * Disable parameter validation for now. Scikit-Learn passes all parameters down to XGBoost, whether they are used or not. * Add option `validate_parameters`.	2020-01-07 11:17:31 +08:00
Egor Smirnov	7b17e76c5b	Optimized EvaluateSplit function (#5138 ) * Add block based threading utilities.	2019-12-31 18:18:42 +08:00
Jiaming Yuan	04db125699	Quick fix for memory leak in CPU Hist. (#5153 ) Closes https://github.com/dmlc/xgboost/issues/3579 . * Don't use map.	2019-12-31 14:05:53 +08:00
K.O	018df6004e	Fix feature_name created from int64index dataframe. (#5081 )	2019-12-30 12:26:22 +08:00
Jiaming Yuan	6848d0426f	Clean up Python 2 compatibility code. (#5161 )	2019-12-27 18:34:53 +08:00
Jiaming Yuan	61286c6e8f	Fix wrapping GPU ID and prevent data copying. (#5160 ) * Removed some data copying. * Make sure gpu_id is valid before any configuration is carried out.	2019-12-27 16:51:08 +08:00
sriramch	ee81ba8e1f	implementation of map ranking algorithm on gpu (#5129 ) * - implementation of map ranking algorithm - also effected necessary suggestions mentioned in the earlier ranking pr's - made some performance improvements to the ndcg algo as well	2019-12-27 12:05:37 +13:00
Philip Hyunsu Cho	9b0af6e882	Enable OpenMP with Apple Clang (Mac default compiler) (#5146 ) * Add OpenMP as CMake target * Require CMake 3.12, to allow linking OpenMP target to objxgboost * Specify OpenMP compiler flag for CUDA host compiler * Require CMake 3.16+ if the OS is Mac OSX * Use AppleClang in Mac tests. * Update dmlc-core	2019-12-26 16:53:12 +08:00
Jiaming Yuan	f3d7877802	Parameter validation (#5157 ) * Unused code. * Split up old colmaker parameters from train param. * Fix dart. * Better name.	2019-12-26 11:59:05 +08:00
Jiaming Yuan	ced3660f60	Tests for empty dmatrix. (#5159 )	2019-12-26 11:51:54 +08:00
Jiaming Yuan	298ebe68ac	[Breaking] Remove `learning_rates` in Python. (#5155 ) * Remove `learning_rates`. It's been deprecated since we have callback. * Set `before_iteration` of `reset_learning_rate` to False to preserve the initial learning rate, and comply to the term "reset". Closes #4709. * Tests for various `tree_method`.	2019-12-24 14:25:48 +08:00
Jiaming Yuan	0202e04a8e	Add base margin to sklearn interface. (#5151 )	2019-12-24 09:43:41 +08:00
Jiaming Yuan	1d0ca49761	Example JSON model parser and Schema. (#5137 )	2019-12-23 19:47:35 +08:00

669 Commits