xgboost

Author	SHA1	Message	Date
Jiaming Yuan	8ca9744b07	Use `scikit-learn` in extra dependencies. (#5310 )	2020-02-15 07:12:51 +08:00
Jiaming Yuan	911a902835	Merge model compatibility fixes from 1.0rc branch. (#5305 ) * Port test model compatibility. * Port logit model fix. https://github.com/dmlc/xgboost/pull/5248 https://github.com/dmlc/xgboost/pull/5281	2020-02-13 20:41:58 +08:00
Jiaming Yuan	84e395d91e	Fix CMake build on Windows with setuptools. (#5280 )	2020-02-05 10:47:39 +08:00
Jiaming Yuan	595a00466d	Rewrite setup.py. (#5271 ) The setup.py is rewritten. This new script uses only Python code and provide customized implementation of setuptools commands. This way users can run most of setuptools commands just like any other Python libraries. * Remove setup_pip.py * Remove soft links. * Define customized commands. * Remove shell script. * Remove makefile script. * Update the doc for building from source.	2020-02-04 13:35:42 +08:00
Philip Hyunsu Cho	c74216f22c	Declare Python 3.8 support in setup.py (#5274 )	2020-02-03 10:38:52 -08:00
David Díaz Vico	71e7e3b96f	Improved sklearn compatibility (#5255 )	2020-02-03 13:30:45 +08:00
Jiaming Yuan	a5cc112eea	Export JSON config in `get_params`. (#5256 )	2020-02-03 12:46:51 +08:00
Jiaming Yuan	472ded549d	Save Scikit-Learn attributes into learner attributes. (#5245 ) * Remove the recommendation for pickle. * Save skl attributes in booster.attr * Test loading scikit-learn model with native booster.	2020-01-30 16:00:18 +08:00
Philip Hyunsu Cho	4240daed4e	Make `pip install xgboost.tar.gz` work by fixing build-python.sh (#5241 ) Make pip install xgboost.tar.gz work by fixing build-python.sh Simplify install doc * Add test * Install Miniconda for Linux target too * Build XGBoost only once in sdist * Try importing xgboost after installation * Don't set PYTHONPATH env var for sdist test	2020-01-28 23:18:23 -08:00
Jiaming Yuan	40680368cf	Add constraint parameters to Scikit-Learn interface. (#5227 ) * Add document for constraints. * Fix a format error in doc for objective function.	2020-01-25 11:12:02 +08:00
OrdoAbChao	b4f952bd22	[Breaking] Remove Scikit-Learn default parameters (#5130 ) * Simplify Scikit-Learn parameter management. * Copy base class for removing duplicated parameter signatures. * Set all parameters to None. * Handle None in set_param. * Extract the doc. Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-01-23 20:25:20 +08:00
Jiaming Yuan	1891cc766d	Fix metainfo from DataFrame. (#5216 ) * Fix metainfo from DataFrame. * Unify helper functions for data and meta.	2020-01-22 16:29:44 +08:00
Rory Mitchell	5d4c24a1fc	Fix cupy without cudf import (#5219 )	2020-01-22 18:02:39 +13:00
Rory Mitchell	9c56480c61	Support dmatrix construction from cupy array (#5206 )	2020-01-22 13:15:27 +13:00
Kodi Arfer	f100b8d878	[Breaking] Don't drop trees during DART prediction by default (#5115 ) * Simplify DropTrees calling logic * Add `training` parameter for prediction method. * [Breaking]: Add `training` to C API. * Change for R and Python custom objective. * Correct comment. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu> Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-01-13 21:48:30 +08:00
Jiaming Yuan	ebc86a3afa	Disable parameter validation for Scikit-Learn interface. (#5167 ) * Disable parameter validation for now. Scikit-Learn passes all parameters down to XGBoost, whether they are used or not. * Add option `validate_parameters`.	2020-01-07 11:17:31 +08:00
K.O	018df6004e	Fix feature_name crated from int64index dataframe. (#5081 )	2019-12-30 12:26:22 +08:00
Jiaming Yuan	6848d0426f	Clean up Python 2 compatibility code. (#5161 )	2019-12-27 18:34:53 +08:00
Jiaming Yuan	298ebe68ac	[Breaking] Remove `learning_rates` in Python. (#5155 ) * Remove `learning_rates`. It's been deprecated since we have callback. * Set `before_iteration` of `reset_learning_rate` to False to preserve the initial learning rate, and comply to the term "reset". Closes #4709. * Tests for various `tree_method`.	2019-12-24 14:25:48 +08:00
Jiaming Yuan	0202e04a8e	Add base margin to sklearn interface. (#5151 )	2019-12-24 09:43:41 +08:00
Jiaming Yuan	a4b929385e	Note for `DaskDMatrix`. (#5144 ) * Brief introduction to `DaskDMatrix`. * Add xgboost.dask.train to API doc	2019-12-23 18:55:32 +08:00
Jiaming Yuan	3136185bc5	JSON configuration IO. (#5111 ) * Add saving/loading JSON configuration. * Implement Python pickle interface with new IO routines. * Basic tests for training continuation.	2019-12-15 17:31:53 +08:00
yage	dcde433402	Fix MacOS build error. (#5080 ) * Update build script. * Update build doc.	2019-12-04 19:34:35 +08:00
Jiaming Yuan	761e938dbe	Support dask dataframe as y for classifier. (#5077 ) * Support dask dataframe as y for classifier. * Lint.	2019-12-02 11:53:30 +08:00
Jiaming Yuan	a4f5c86276	Allow using RandomState object from Numpy in sklearn interface. (#5049 )	2019-11-19 10:56:39 +08:00
Philip Hyunsu Cho	4d2779663e	Require Python 3.5+ in setup.py (#5021 )	2019-11-18 18:55:58 -08:00
Jiaming Yuan	98b051269b	Assert dask client at early stage. (#5048 )	2019-11-19 10:55:26 +08:00
mitama	374648c21a	Add better error message for invalid feature names (#5024 )	2019-11-10 01:58:14 +08:00
Jiaming Yuan	7663de956c	Run training with empty DMatrix. (#4990 ) This makes GPU Hist robust in distributed environment as some workers might not be associated with any data in either training or evaluation. * Disable rabit mock test for now: See #5012 . * Disable dask-cudf test at prediction for now: See #5003 * Launch dask job for all workers despite they might not have any data. * Check 0 rows in elementwise evaluation metrics. Using AUC and AUC-PR still throws an error. See #4663 for a robust fix. * Add tests for edge cases. * Add `LaunchKernel` wrapper handling zero sized grid. * Move some parts of allreducer into a cu file. * Don't validate feature names when the booster is empty. * Sync number of columns in DMatrix. As num_feature is required to be the same across all workers in data split mode. * Filtering in dask interface now by default syncs all booster that's not empty, instead of using rank 0. * Fix Jenkins' GPU tests. * Install dask-cuda from source in Jenkins' test. Now all tests are actually running. * Restore GPU Hist tree synchronization test. * Check UUID of running devices. The check is only performed on CUDA version >= 10.x, as 9.x doesn't have UUID field. * Fix CMake policy and project variables. Use xgboost_SOURCE_DIR uniformly, add policy for CMake >= 3.13. * Fix copying data to CPU * Fix race condition in cpu predictor. * Fix duplicated DMatrix construction. * Don't download extra nccl in CI script.	2019-11-06 16:13:13 +08:00
Jiaming Yuan	ac457c56a2	Use `UpdateAllowUnknown' for non-model related parameter. (#4961 ) * Use `UpdateAllowUnknown' for non-model related parameter. Model parameter can not pack an additional boolean value due to binary IO format. This commit deals only with non-model related parameter configuration. * Add tidy command line arg for use-dmlc-gtest.	2019-10-23 05:50:12 -04:00
Philip Hyunsu Cho	741fbf47c4	[CI] Update lint configuration to support latest pylint convention (#4971 ) * Update lint configuration * Use gcc 8 consistently in build instruction	2019-10-21 16:40:57 -07:00
Jacob Kim	a78d4e7aa8	Follow PEP 257 -- Docstring Conventions (#4959 )	2019-10-17 23:45:25 -04:00
Jiaming Yuan	185e3f1916	Update GPU doc. (#4953 )	2019-10-16 05:54:09 -04:00
Jiaming Yuan	7e72a12871	Don't `set_params` at the end of `set_state`. (#4947 ) * Don't set_params at the end of set_state. * Also fix another issue found in dask prediction. * Add note about prediction. Don't support other prediction modes at the moment.	2019-10-15 10:08:26 -04:00
Jiaming Yuan	2ebdec8aa6	Fix dask prediction. (#4941 ) * Fix dask prediction. * Add better error messages for wrong partition.	2019-10-14 23:19:34 -04:00
Peter Badida	a9053aff83	Fix incorrectly displayed Note in the doc (#4943 )	2019-10-14 03:45:23 -04:00
Jiaming Yuan	4bbf062ed3	[Breaking] Update sklearn interface. (#4929 ) * Remove nthread, seed, silent. Add tree_method, gpu_id, num_parallel_tree. Fix #4909. * Check data shape. Fix #4896. * Check element of eval_set is tuple. Fix #4875 * Add doc for random_state with hogwild. Fixes #4919	2019-10-12 02:50:09 -04:00
Jiaming Yuan	6c9b6f11da	Use `cudf.concat` explicitly. (#4918 ) * Use `cudf.concat` explicitly. * Add test.	2019-10-10 16:02:10 +13:00
Rory Mitchell	aefb1e5c2f	Resolve dask performance issues (#4914 ) * Set dask client.map as impure function * Remove nrows * Remove slow check in verbose mode	2019-10-10 16:01:30 +13:00
Jiaming Yuan	d30e63a0a5	Support feature names/types for cudf. (#4902 ) * Implement most of the pandas procedure for cudf except for type conversion. * Requires an array of interfaces in metainfo.	2019-09-29 15:07:51 -04:00
Vibhu Jawa	2fa8b359e0	Add support for cudf.Series (#4891 )	2019-09-25 23:52:28 -04:00
Jiaming Yuan	b8433c455a	Rewrite Dask interface. (#4819 )	2019-09-25 01:30:14 -04:00
Jiaming Yuan	c7416002e9	Fix DMatrix doc. (#4884 )	2019-09-23 01:55:04 -04:00
Jiaming Yuan	d669ea1eaa	Deprecate set group (#4864 ) * Convert jvm package and R package. * Restore for compatibility.	2019-09-17 21:26:54 -04:00
Jiaming Yuan	5374f52531	Complete cudf support. (#4850 ) * Handles missing value. * Accept all floating point and integer types. * Move to cudf 9.0 API. * Remove requirement on `null_count`. * Arbitrary column types support.	2019-09-16 23:52:00 -04:00
Cyprien Ricque	830e73901d	eval_metrics print fixed (#4803 )	2019-08-28 23:52:18 -04:00
Jiaming Yuan	9700776597	Cudf support. (#4745 ) * Initial support for cudf integration. * Add two C APIs for consuming data and metainfo. * Add CopyFrom for SimpleCSRSource as a generic function to consume the data. * Add FromDeviceColumnar for consuming device data. * Add new MetaInfo::SetInfo for consuming label, weight etc.	2019-08-19 16:51:40 +12:00
Evan Kepner	53d4272c2a	add os.PathLike support for file paths to DMatrix and Booster Python classes (#4757 )	2019-08-15 04:46:25 -04:00
Xu Xiao	97eece6ea0	[python package] include dmlc-tracker into xgb python pkg (#4731 )	2019-08-05 12:21:07 -04:00
Jiaming Yuan	e930a8e54f	Remove old Python trouble shooting doc. [skip ci] (#4729 )	2019-08-03 12:51:29 -04:00

1 2 3 4 5 ...

372 Commits