xgboost

Author	SHA1	Message	Date
Jiaming Yuan	38dd91f491	Save model in ubj as the default. (#9947 )	2024-01-05 17:53:36 +08:00
Jiaming Yuan	9f73127a23	Cleanup Python GPU tests. (#9934 ) * Cleanup Python GPU tests. - Remove the use of `gpu_hist` and `gpu_id` in cudf/cupy tests. - Move base margin test into the testing directory.	2024-01-04 13:15:18 +08:00
Jiaming Yuan	faf0f2df10	Support dataframe data format in native XGBoost. (#9828 ) - Implement a columnar adapter. - Refactor Python pandas handling code to avoid converting into a single numpy array. - Add support in R for transforming columns. - Support R data.frame and factor type.	2023-12-12 09:56:31 +08:00
Rong Ou	6fbe6248f4	More in-memory input support for column split (#9685 )	2023-10-20 16:02:36 +08:00
Jiaming Yuan	9027686cac	Support pandas 2.1.0. (#9557 )	2023-09-11 17:44:51 +08:00
Jiaming Yuan	1f9a57d17b	[Breaking] Require format to be specified in input URI. (#9077 ) Previously, we use `libsvm` as default when format is not specified. However, the dmlc data parser is not particularly robust against errors, and the most common type of error is undefined format. Along with which, we will recommend users to use other data loader instead. We will continue the maintenance of the parsers as it's currently used for many internal tests including federated learning.	2023-04-28 19:45:15 +08:00
Jiaming Yuan	2c8d735cb3	Fix tests with pandas 2.0. (#9014 ) * Fix tests with pandas 2.0. - `is_categorical` is replaced by `is_categorical_dtype`. - one hot encoding returns boolean type instead of integer type.	2023-04-11 00:17:34 +08:00
Jiaming Yuan	6a892ce281	Specify src path for isort. (#8867 )	2023-03-06 17:30:27 +08:00
Jiaming Yuan	1325ba9251	Support primitive types of pyarrow-backed pandas dataframe. (#8653 ) Categorical data (dictionary) is not supported at the moment.	2023-01-30 17:53:29 +08:00
James Lamb	96e6b6beba	[ci] remove unused imports in tests (#8707 )	2023-01-25 14:10:29 +08:00
Jiaming Yuan	e68a152d9e	Do not return internal value for `get_params`. (#8634 )	2023-01-05 17:48:26 +08:00
Jiaming Yuan	f6effa1734	Support `Series` and Python primitives in `inplace_predict` and QDM (#8547 )	2022-12-17 00:15:15 +08:00
Jiaming Yuan	d666ba775e	Support all pandas nullable integer types. (#8480 ) - Enumerate all pandas integer types. - Tests for `None`, `nan`, and `pd.NA`	2022-11-28 22:38:16 +08:00
Jiaming Yuan	cf70864fa3	Move Python testing utilities into xgboost module. (#8379 ) - Add typehints. - Fixes for pylint. Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>	2022-10-26 16:56:11 +08:00
Jiaming Yuan	fcab51aa82	Support more pandas nullable types (#8262 ) - Float32/64 - Category.	2022-09-27 01:59:50 +08:00
Jiaming Yuan	332380479b	Avoid warning in np primitive type tests. (#7833 )	2022-04-23 02:07:01 +08:00
Jiaming Yuan	9150fdbd4d	Support pandas nullable types. (#7760 )	2022-03-30 08:51:52 +08:00
Jiaming Yuan	24789429fd	Support latest pandas Index type. (#7595 )	2022-01-26 18:20:10 +08:00
Jiaming Yuan	5b1161bb64	Convert labels into tensor. (#7456 ) * Add a new ctor to tensor for `initializer_list`. * Change labels from host device vector to tensor. * Rename the field from `labels_` to `labels` since it's a public member.	2021-12-17 00:58:35 +08:00
Jiaming Yuan	a13321148a	Support multi-class with base margin. (#7381 ) This is already partially supported but never properly tested. So the only possible way to use it is calling `numpy.ndarray.flatten` with `base_margin` before passing it into XGBoost. This PR adds proper support for most of the data types along with tests.	2021-11-02 13:38:00 +08:00
Jiaming Yuan	ac9bfaa4f2	Handle missing values in dataframe with category dtype. (#7331 ) * Replace -1 in pandas initializer. * Unify `IsValid` functor. * Mimic pandas data handling in cuDF glue code. * Check invalid categories. * Fix DDM sketching.	2021-10-28 03:33:54 +08:00
Jiaming Yuan	22d56cebf1	Encode pandas categorical data automatically. (#7231 )	2021-09-17 11:09:55 +08:00
Jiaming Yuan	0ed979b096	Support more input types for categorical data. (#7220 ) * Support more input types for categorical data. * Shorten the type name from "categorical" to "c". * Tests for np/cp array and scipy csr/csc/coo. * Specify the type for feature info.	2021-09-16 20:39:30 +08:00
Jiaming Yuan	5d48d40d9a	Fix DMatrix slice with feature types. (#6689 )	2021-02-09 08:13:51 +08:00
Philip Hyunsu Cho	9c9070aea2	Use pytest conventions consistently (#6337 ) * Do not derive from unittest.TestCase (not needed for pytest) * assertRaises -> pytest.raises * Simplify test_empty_dmatrix with test parametrization * setUpClass -> setup_class, tearDownClass -> teardown_class * Don't import unittest; import pytest * Use plain assert * Use parametrized tests in more places * Fix test_gpu_with_sklearn.py * Put back run_empty_dmatrix_reg / run_empty_dmatrix_cls * Fix test_eta_decay_gpu_hist * Add parametrized tests for monotone constraints * Fix test names * Remove test parametrization * Revise test_slice to be not flaky	2020-11-19 17:00:15 -08:00
Christian Lorentzen	cf4f019ed6	[Breaking] Change default evaluation metric for classification to logloss / mlogloss (#6183 ) * Change DefaultEvalMetric of classification from error to logloss * Change default binary metric in plugin/example/custom_obj.cc * Set old error metric in python tests * Set old error metric in R tests * Fix missed eval metrics and typos in R tests * Fix setting eval_metric twice in R tests * Add warning for empty eval_metric for classification * Fix Dask tests Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-10-02 12:06:47 -07:00
Jiaming Yuan	7622b8cdb8	Enable categorical data support on Python DMatrix. (#6166 ) * Only pandas is recognized.	2020-09-29 11:22:56 +08:00
Jiaming Yuan	b5f52f0b1b	Validate weights are positive values. (#6115 )	2020-09-15 09:03:55 +08:00
Jiaming Yuan	029a8b533f	Simplify the data backends. (#5893 )	2020-07-16 15:17:31 +08:00
Jiaming Yuan	5af8161a1a	Implement Python data handler. (#5689 ) * Define data handlers for DMatrix. * Throw ValueError in scikit learn interface.	2020-05-22 11:53:55 +08:00
Jiaming Yuan	abca9908ba	Support pandas SparseArray. (#5431 )	2020-03-20 21:40:22 +08:00
Jiaming Yuan	1891cc766d	Fix metainfo from DataFrame. (#5216 ) * Fix metainfo from DataFrame. * Unify helper functions for data and meta.	2020-01-22 16:29:44 +08:00
K.O	018df6004e	Fix feature_name created from int64index dataframe. (#5081 )	2019-12-30 12:26:22 +08:00
Rong Ou	2c61f02add	fix broken python test (#4395 )	2019-04-23 16:01:23 -07:00
Jiaming Yuan	29a1356669	Deprecate `reg:linear' in favor of `reg:squarederror'. (#4267 ) * Deprecate `reg:linear' in favor of `reg:squarederror'. * Replace the use of `reg:linear'. * Replace the use of `silent`.	2019-03-17 17:55:04 +08:00
Jiaming Yuan	2ea0f887c1	Refactor Python tests. (#3897 ) * Deprecate nose tests. * Format python tests.	2018-11-15 13:56:33 +13:00
Icyblade Dai	0e85b30fdd	Fix issue 2670 (#2671 ) * fix issue 2670 * add python<3.6 compatibility * fix Index * fix Index/MultiIndex * fix lint * fix W0622 really nonsense * fix lambda * Trigger Travis * add test for MultiIndex * remove tailing whitespace	2017-09-19 15:49:41 -04:00
Yuan (Terry) Tang	a64fd74421	Fix wrong expected feature types (#1646 )	2016-10-08 21:16:29 -07:00
tqchen	149589c583	[PYTHON] Refactor training API to use callback	2016-05-19 21:31:23 -07:00
sinhrks	9da2f3e613	DOC/TST: Fix Python sklearn dep	2016-05-01 17:27:43 +09:00
sinhrks	8fc2456c87	Enable flake8	2016-04-24 17:32:31 +09:00
terrytangyuan	803a6fe474	Separate dependencies and lightweight test env for Python	2016-02-28 20:11:10 -06:00

42 Commits