xgboost

Author	SHA1	Message	Date
Jiaming Yuan	7bc56fa0ed	Use simple print in tracker print function. (#6609 )	2021-01-21 21:15:43 +08:00
Jiaming Yuan	26982f9fce	Skip unused CMake argument in setup.py (#6611 )	2021-01-21 17:25:33 +08:00
Jiaming Yuan	f0fd7629ae	Add helper script and doc for releasing pip package. (#6613 ) * Fix `long_description_content_type`.	2021-01-21 14:46:52 +08:00
Bobby Wang	9d2832a3a3	Fix potential case where TaskFailedListener's callback won't be called (#6612 ) There is a possibility that onJobStart of TaskFailedListener won't be called if the job is submitted before the other thread calls addSparkListener. Details can be found at https://github.com/dmlc/xgboost/pull/6019#issuecomment-760937628	2021-01-21 14:20:32 +08:00
Jiaming Yuan	f8bb678c67	Exclude dmlc test on github action. (#6625 )	2021-01-20 18:50:20 +08:00
Jiaming Yuan	d6d72de339	Revert ntree limit fix (#6616 ) The old (pre-fix) best_ntree_limit ignores the num_class parameter, which is incorrect. Previously we worked around it in the C++ layer to avoid possible breaking changes in other language bindings, but the Python interpretation stayed incorrect. The PR fixed that in Python to consider num_class but didn't remove the old workaround, so the tree calculation in the predictor is incorrect; see PredictBatch in CPUPredictor.	2021-01-19 23:51:16 +08:00
Jiaming Yuan	d132933550	Remove type check for solaris. (#6610 )	2021-01-16 02:58:19 +08:00
Jiaming Yuan	d356b7a071	Restore unknown data support. (#6595 )	2021-01-14 04:51:16 +08:00
Jiaming Yuan	89a00a5866	[dask] Random forest estimators (#6602 )	2021-01-13 20:59:20 +08:00
Jiaming Yuan	0027220aa0	[breaking] Remove duplicated predict functions, Fix attributes IO. (#6593 ) * Fix attributes not being restored. * Rename all `data` to `X`. [breaking]	2021-01-13 16:56:49 +08:00
ShvetsKS	7f4d3a91b9	Multiclass prediction caching for CPU Hist (#6550 ) Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>	2021-01-13 04:42:07 +08:00
Jiaming Yuan	03cd087da1	Remove duplicated DMatrix. (#6592 )	2021-01-12 09:36:56 +08:00
Jiaming Yuan	c709f2aaaf	Fix evaluation result for XGBRanker. (#6594 ) * Remove duplicated code, which fixes typo `evals_result` -> `evals_result_`.	2021-01-12 09:36:41 +08:00
Jiaming Yuan	f2f7dd87b8	Use view for `SparsePage` exclusively. (#6590 )	2021-01-11 18:04:55 +08:00
Jiaming Yuan	78f2cd83d7	Suppress hypothesis health check for dask client. (#6589 )	2021-01-11 14:11:57 +08:00
Jiaming Yuan	80065d571e	[dask] Add DaskXGBRanker (#6576 ) * Initial support for distributed LTR using dask. * Support `qid` in libxgboost. * Refactor `predict` and `n_features_in_`, `best_[score/iteration/ntree_limit]` to avoid duplicated code. * Define `DaskXGBRanker`. The dask ranker doesn't support group structure; instead it uses query id and converts it to group ptr internally.	2021-01-08 18:35:09 +08:00
Jiaming Yuan	96d3d32265	[dask] Add shap tests. (#6575 )	2021-01-08 14:59:27 +08:00
Jiaming Yuan	7c9dcbedbc	Fix `best_ntree_limit` for dart and gblinear. (#6579 )	2021-01-08 10:05:39 +08:00
Jiaming Yuan	f5ff90cd87	Support `_estimator_type`. (#6582 ) * Use `_estimator_type`. For more info, see: https://scikit-learn.org/stable/developers/develop.html#estimator-types * Model trained from dask can be loaded by single node skl interface.	2021-01-08 10:01:16 +08:00
Jiaming Yuan	8747885a8b	Support Solaris. (#6578 ) * Add system header. * Remove use of TR1 on Solaris Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2021-01-07 09:05:05 +08:00
TP Boudreau	b2246ae7ef	Update dmlc-core submodule and conform to new API (#6431 ) * Update dmlc-core submodule and conform to new API * Remove unsupported parameter from method signature * Update dmlc-core submodule and conform to new API * Update dmlc-core Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2021-01-05 16:12:22 -08:00
Jiaming Yuan	60cfd14349	[dask, sklearn] Fix predict proba. (#6566 ) * For sklearn: - Handles user defined objective function. - Handles `softmax`. * For dask: - Use the implementation from sklearn, the previous implementation doesn't perform any extra handling.	2021-01-05 08:29:06 +08:00
Jiaming Yuan	516a93d25c	Fix `best_ntree_limit`. (#6569 )	2021-01-03 05:58:54 +08:00
James Lamb	195a41cef1	[python-package] remove unnecessary files to reduce sdist size (fixes #6560 ) (#6565 )	2021-01-02 15:56:39 +08:00
Jiaming Yuan	2b049b32e9	Document various tree methods. (#6564 )	2021-01-02 15:40:46 +08:00
Philip Hyunsu Cho	fa13992264	Calling XGBModel.fit() should clear the Booster by default (#6562 ) * Calling XGBModel.fit() should clear the Booster by default * Document the behavior of fit() * Allow sklearn object to be passed in directly via xgb_model argument * Fix lint	2020-12-31 11:02:08 -08:00
Jiaming Yuan	5e9e525223	Remove warnings in tests. (#6554 )	2020-12-31 13:41:18 +08:00
James Lamb	8ad22bf4e7	Add credentials to .gitignore (#6559 )	2020-12-30 15:58:14 -08:00
Jiaming Yuan	de8fd852a5	[dask] Add type hints. (#6519 ) * Add validate_features. * Show type hints in doc. Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-12-29 19:41:02 +08:00
Jiaming Yuan	610ee632cc	[Breaking] Rename `data` to `X` in `predict_proba`. (#6555 ) New Scikit-Learn version uses keyword argument, and `X` is the predefined keyword. * Use pip to install latest Python graphviz on Windows CI.	2020-12-28 21:36:03 +08:00
Jiaming Yuan	cb207a355d	Add script for generating release tarball. (#6544 )	2020-12-23 16:08:10 +08:00
Gorkem Ozkaya	2231940d1d	Clip small positive values in gamma-nloglik (#6537 ) For the `gamma-nloglik` eval metric, small positive values in the labels are causing `NaN`'s in the outputs, as reported here: https://github.com/dmlc/xgboost/issues/5349. This will add clipping on them, similar to what is done in other metrics like `poisson-nloglik` and `logloss`.	2020-12-22 03:11:40 +08:00
MBSMachineLearning	95cbfad990	"featue_map" typo changed to "feature_map" (#6540 )	2020-12-21 22:11:11 +08:00
Philip Hyunsu Cho	fbb980d9d3	Expand `~` into the home directory on Linux and MacOS (#6531 )	2020-12-19 23:35:13 -08:00
Philip Hyunsu Cho	cd0821500c	Add Saturn Cloud Dask XGBoost tutorial to Awesome XGBoost [skip ci] (#6532 )	2020-12-19 15:57:05 -08:00
Philip Hyunsu Cho	380f6f4ab8	Remove cupy.array_equal, since it's not compatible with cuPy 7.8 (#6528 )	2020-12-18 09:16:52 -08:00
Jiaming Yuan	ca3da55de4	Support early stopping with training continuation, correct num boosted rounds. (#6506 ) * Implement early stopping with training continuation. * Add new C API for obtaining boosted rounds. * Fix off by 1 in `save_best`. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-12-17 19:59:19 +08:00
Philip Hyunsu Cho	125b3c0f2d	Lazy import cuDF and Dask (#6522 ) * Lazy import cuDF * Lazy import Dask Co-authored-by: PSEUDOTENSOR / Jonathan McKinney <pseudotensor@gmail.com> * Fix lint Co-authored-by: PSEUDOTENSOR / Jonathan McKinney <pseudotensor@gmail.com>	2020-12-17 01:51:35 -08:00
Philip Hyunsu Cho	ad1a527709	Enable loading model from <1.0.0 trained with objective='binary:logitraw' (#6517 ) * Enable loading model from <1.0.0 trained with objective='binary:logitraw' * Add binary:logitraw in model compatibility testing suite * Feedback from @trivialfis: Override ProbToMargin() for LogisticRaw Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-12-16 16:53:46 -08:00
Philip Hyunsu Cho	bf6cfe3b99	[Breaking] Upgrade cuDF and RMM to 0.18 nightlies; require RMM 0.18+ for RMM plugin (#6510 ) * [CI] Upgrade cuDF and RMM to 0.18 nightlies * Modify RMM plugin to be compatible with RMM 0.18 * Update src/common/device_helpers.cuh Co-authored-by: Mark Harris <mharris@nvidia.com> Co-authored-by: Mark Harris <mharris@nvidia.com>	2020-12-16 10:07:52 -08:00
Jiaming Yuan	d8d684538c	[CI] Split up main.yml, add mypy. (#6515 )	2020-12-17 00:15:44 +08:00
Jiaming Yuan	c5876277a8	Drop saving binary format for memory snapshot. (#6513 )	2020-12-17 00:14:57 +08:00
Jiaming Yuan	0e97d97d50	Fix merge conflict. (#6512 )	2020-12-16 18:02:25 +08:00
hzy001	749364f25d	Update the C API comments (#6457 ) Signed-off-by: Hao Ziyu <haoziyu@qiyi.com> Co-authored-by: Hao Ziyu <haoziyu@qiyi.com>	2020-12-16 14:56:13 +08:00
Jiaming Yuan	347f593169	Accept numpy array for DMatrix slice index. (#6368 )	2020-12-16 14:42:52 +08:00
Jiaming Yuan	ef4a0e0aac	Fix DMatrix feature names/types IO. (#6507 ) * Fix feature names/types IO Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-12-16 14:24:27 +08:00
Jiaming Yuan	886486a519	Support categorical data in GPU weighted sketching. (#6508 )	2020-12-16 14:23:28 +08:00
Igor Rukhovich	5c8ccf4455	Improved InitSampling function speed by 2.12 times (#6410 ) * Improved InitSampling function speed by 2.12 times * Added explicit conversion	2020-12-15 20:59:24 -08:00
Jiaming Yuan	3c3f026ec1	Move metric configuration into booster. (#6504 )	2020-12-16 05:35:04 +08:00
Jiaming Yuan	d45c0d843b	Show partition status in dask error. (#6366 )	2020-12-16 02:58:21 +08:00
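
Commit 80065d571e above mentions that the dask ranker takes a query id per row and converts it to a group pointer internally. The following is only a minimal sketch of that idea in plain Python, not the code from the commit: the helper names `qid_to_group_ptr` and `group_sizes` are hypothetical, and the sketch assumes that rows belonging to the same query are stored contiguously.

```python
from typing import List, Sequence


def qid_to_group_ptr(qid: Sequence[int]) -> List[int]:
    """Hypothetical helper: turn per-row query ids into a group pointer.

    Assumption: rows of the same query are contiguous. Group i then spans
    rows group_ptr[i] (inclusive) to group_ptr[i + 1] (exclusive).
    """
    group_ptr = [0]
    for i in range(1, len(qid)):
        # A change in query id marks the start of a new group.
        if qid[i] != qid[i - 1]:
            group_ptr.append(i)
    if len(qid) > 0:
        # Close the last group with the total number of rows.
        group_ptr.append(len(qid))
    return group_ptr


def group_sizes(group_ptr: Sequence[int]) -> List[int]:
    """Hypothetical helper: recover per-group row counts from the pointer."""
    return [group_ptr[i + 1] - group_ptr[i] for i in range(len(group_ptr) - 1)]


if __name__ == "__main__":
    ptr = qid_to_group_ptr([1, 1, 2, 2, 2, 3])
    print(ptr)               # [0, 2, 5, 6]
    print(group_sizes(ptr))  # [2, 3, 1]
```

The pointer form carries the same information as a list of group sizes, but lets the rows of any group be located with two lookups.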

5344 Commits