xgboost

Author	SHA1	Message	Date
Jiaming Yuan	80065d571e	[dask] Add DaskXGBRanker (#6576 ) * Initial support for distributed LTR using dask. * Support `qid` in libxgboost. * Refactor `predict` and `n_features_in_`, `best_[score/iteration/ntree_limit]` to avoid duplicated code. * Define `DaskXGBRanker`. The dask ranker doesn't support group structure; instead it uses query id and converts it to group ptr internally.	2021-01-08 18:35:09 +08:00
Jiaming Yuan	7c9dcbedbc	Fix `best_ntree_limit` for dart and gblinear. (#6579 )	2021-01-08 10:05:39 +08:00
Jiaming Yuan	f5ff90cd87	Support `_estimator_type`. (#6582 ) * Use `_estimator_type`. For more info, see: https://scikit-learn.org/stable/developers/develop.html#estimator-types * Model trained from dask can be loaded by single node skl interface.	2021-01-08 10:01:16 +08:00
Jiaming Yuan	60cfd14349	[dask, sklearn] Fix predict proba. (#6566 ) * For sklearn: - Handles user defined objective function. - Handles `softmax`. * For dask: - Use the implementation from sklearn, the previous implementation doesn't perform any extra handling.	2021-01-05 08:29:06 +08:00
Jiaming Yuan	516a93d25c	Fix `best_ntree_limit`. (#6569 )	2021-01-03 05:58:54 +08:00
James Lamb	195a41cef1	[python-package] remove unnecessary files to reduce sdist size (fixes #6560 ) (#6565 )	2021-01-02 15:56:39 +08:00
Philip Hyunsu Cho	fa13992264	Calling XGBModel.fit() should clear the Booster by default (#6562 ) * Calling XGBModel.fit() should clear the Booster by default * Document the behavior of fit() * Allow sklearn object to be passed in directly via xgb_model argument * Fix lint	2020-12-31 11:02:08 -08:00
Jiaming Yuan	de8fd852a5	[dask] Add type hints. (#6519 ) * Add validate_features. * Show type hints in doc. Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-12-29 19:41:02 +08:00
Jiaming Yuan	610ee632cc	[Breaking] Rename `data` to `X` in `predict_proba`. (#6555 ) New Scikit-Learn version uses keyword argument, and `X` is the predefined keyword. * Use pip to install latest Python graphviz on Windows CI.	2020-12-28 21:36:03 +08:00
Philip Hyunsu Cho	fbb980d9d3	Expand `~` into the home directory on Linux and MacOS (#6531 )	2020-12-19 23:35:13 -08:00
Philip Hyunsu Cho	380f6f4ab8	Remove cupy.array_equal, since it's not compatible with cuPy 7.8 (#6528 )	2020-12-18 09:16:52 -08:00
Jiaming Yuan	ca3da55de4	Support early stopping with training continuation, correct num boosted rounds. (#6506 ) * Implement early stopping with training continuation. * Add new C API for obtaining boosted rounds. * Fix off by 1 in `save_best`. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-12-17 19:59:19 +08:00
Philip Hyunsu Cho	125b3c0f2d	Lazy import cuDF and Dask (#6522 ) * Lazy import cuDF * Lazy import Dask Co-authored-by: PSEUDOTENSOR / Jonathan McKinney <pseudotensor@gmail.com> * Fix lint Co-authored-by: PSEUDOTENSOR / Jonathan McKinney <pseudotensor@gmail.com>	2020-12-17 01:51:35 -08:00
Jiaming Yuan	d8d684538c	[CI] Split up main.yml, add mypy. (#6515 )	2020-12-17 00:15:44 +08:00
Jiaming Yuan	0e97d97d50	Fix merge conflict. (#6512 )	2020-12-16 18:02:25 +08:00
Jiaming Yuan	347f593169	Accept numpy array for DMatrix slice index. (#6368 )	2020-12-16 14:42:52 +08:00
Jiaming Yuan	ef4a0e0aac	Fix DMatrix feature names/types IO. (#6507 ) * Fix feature names/types IO Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-12-16 14:24:27 +08:00
Jiaming Yuan	3c3f026ec1	Move metric configuration into booster. (#6504 )	2020-12-16 05:35:04 +08:00
Jiaming Yuan	d45c0d843b	Show partition status in dask error. (#6366 )	2020-12-16 02:58:21 +08:00
ShvetsKS	8139849ab6	Fix handling of print period in EvaluationMonitor (#6499 ) Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>	2020-12-15 19:20:19 +08:00
Jiaming Yuan	a30461cf87	[dask] Support all parameters in regressor and classifier. (#6471 ) * Add eval_metric. * Add callback. * Add feature weights. * Add custom objective.	2020-12-14 07:35:56 +08:00
Philip Hyunsu Cho	0d483cb7c1	Bump version to 1.4.0 snapshot in master (#6486 )	2020-12-10 07:38:08 -08:00
Jiaming Yuan	0ffaf0f5be	Fix dask ip resolution. (#6475 ) This adopts the solution used in dask/dask-xgboost#40 which employs the get_host_ip from dmlc-core tracker.	2020-12-07 16:36:23 -08:00
Jiaming Yuan	47b86180f6	Don't validate feature when number of rows is 0. (#6472 )	2020-12-07 18:08:51 +08:00
Jiaming Yuan	703c2d06aa	Fix global config default value. (#6470 )	2020-12-06 06:15:33 +08:00
Jiaming Yuan	d6386e45e8	Fix filtering callable objects in skl xgb param. (#6466 ) Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-12-05 17:20:36 +08:00
Philip Hyunsu Cho	c103ec51d8	Enforce row-major order in cuPy array (#6459 )	2020-12-03 18:29:10 -08:00
Philip Hyunsu Cho	4f70e14031	Fix docstring of config.py to use correct versionadded (#6458 )	2020-12-03 10:41:53 -08:00
Philip Hyunsu Cho	fb56da5e8b	Add global configuration (#6414 ) * Add management functions for global configuration: XGBSetGlobalConfig(), XGBGetGlobalConfig(). * Add Python interface: set_config(), get_config(), and config_context(). * Add unit tests for Python * Add R interface: xgb.set.config(), xgb.get.config() * Add unit tests for R Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-12-03 00:05:18 -08:00
Jiaming Yuan	927c316aeb	Fix period in evaluation monitor. (#6441 )	2020-11-29 03:18:33 +08:00
Jiaming Yuan	2ce2a1a4d8	[SKL] Propagate parameters to booster during set_param. (#6416 )	2020-11-20 20:37:35 +08:00
Jiaming Yuan	a7b42adb74	Fix dask predict (#6412 )	2020-11-20 10:10:52 +08:00
Jiaming Yuan	3ac173fc8b	Fix typo. (#6399 )	2020-11-16 16:59:12 -08:00
Nikhil Choudhary	ae1662028a	Fixed few grammatical mistakes in doc (#6393 )	2020-11-15 13:48:08 +08:00
Jiaming Yuan	fcd6fad822	[dask] Small cleanup. (#6391 )	2020-11-14 22:15:05 +08:00
Jiaming Yuan	4ccf92ea34	[dask] Fix union of workers. (#6375 )	2020-11-13 16:55:05 +08:00
Jiaming Yuan	fcfeb4959c	Deprecate positional arguments. (#6365 ) Deprecate positional arguments in following functions: - `__init__` for all classes in sklearn module. - `fit` method for all classes in sklearn module. - dask interface. - `set_info` for `DMatrix` class. Refactor the evaluation matrices handling.	2020-11-13 11:10:30 +08:00
Jiaming Yuan	c90f968d92	Update Python documents. (#6376 )	2020-11-12 17:51:32 +08:00
Jiaming Yuan	6e12c2a6f8	[dask] Support running on GKE. (#6343 ) * Avoid accessing `scheduler_info()['workers']`. * Avoid calling `client.gather` inside task. * Avoid using `client.scheduler_address`.	2020-11-11 18:04:34 +08:00
Jiaming Yuan	e65e3cf36e	Support shared library in system path. (#6362 )	2020-11-10 16:04:25 +08:00
Jiaming Yuan	184e2eac7d	Add period to evaluation monitor. (#6348 )	2020-11-10 07:47:48 +08:00
Jiaming Yuan	2cc9662005	Support slicing tree model (#6302 ) This PR is meant to end the confusion around best_ntree_limit and unify model slicing. We have multi-class and random forests, and asking users to understand how to set ntree_limit is difficult and error-prone. * Implement the save_best option in early stopping. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-11-02 23:27:39 -08:00
Rory Mitchell	29745c6df2	Fix inclusive scan for large sizes (#6234 )	2020-11-03 17:01:43 +13:00
Jiaming Yuan	7756192906	[dask] Fix prediction on `DaskDMatrix` with multiple meta data. (#6333 ) * Unify the meta handling methods.	2020-11-02 19:18:44 -05:00
Jiaming Yuan	6ff331b705	Fix Python callback. (#6320 )	2020-10-30 05:03:44 +08:00
Jiaming Yuan	74ea82209b	Lazy import dask libraries. (#6309 ) * Lazy import dask libraries. * Lint && fix. * Use short name.	2020-10-28 15:50:11 -07:00
Jiaming Yuan	e8884c4637	Document tree method for feature weights. (#6312 )	2020-10-28 13:42:13 -07:00
Jiaming Yuan	b180223d18	Cleanup RABIT. (#6290 ) * Remove recovery and MPI speed tests. * Remove readme. * Remove Python binding. * Add checks in C API.	2020-10-27 08:48:22 +08:00
Philip Hyunsu Cho	c8ec62103a	Deprecate LabelEncoder in XGBClassifier; Enable cuDF/cuPy inputs in XGBClassifier (#6269 ) * Deprecate LabelEncoder in XGBClassifier; skip LabelEncoder for cuDF/cuPy inputs * Add unit tests for cuDF and cuPy inputs with XGBClassifier * Fix lint * Clarify warning * Move use_label_encoder option to XGBClassifier constructor * Add a test for cudf.Series * Add use_label_encoder to XGBRFClassifier doc * Address reviewer feedback	2020-10-26 13:20:51 -07:00
Jiaming Yuan	d61b628bf5	Remove RABIT CMake targets. (#6275 ) * Now it's built as part of libxgboost. * Set correct C API error in RABIT initialization and finalization. * Remove redundant message. * Guard the tracker print C API.	2020-10-27 01:30:20 +08:00

505 Commits