* [dask] Use a 1-line sample to infer output shape.
This is for inferring shape with direct prediction (without DaskDMatrix).
A few things require a known output shape before the actual prediction is
carried out, including dask metadata and output dataframe columns.
* Infer output shape based on local prediction.
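A minimal sketch of the shape-inference idea in the two items above, assuming a `dask.array` input and plain prediction output; `predict_direct` and `_predict_block` are illustrative names, not the actual `xgboost.dask` internals:

```python
import numpy as np
import dask.array as da
import xgboost as xgb


def predict_direct(booster: xgb.Booster, X: da.Array) -> da.Array:
    # Predict on a single row locally to learn the output shape and dtype;
    # this is what dask needs as ``meta`` (and, for DataFrame output, the
    # column names) before the distributed prediction is scheduled.
    sample = np.asarray(X[:1].compute())
    test_pred = booster.predict(xgb.DMatrix(sample))
    meta = np.empty((0,) + test_pred.shape[1:], dtype=test_pred.dtype)

    def _predict_block(block: np.ndarray) -> np.ndarray:
        return booster.predict(xgb.DMatrix(block))

    if test_pred.ndim == 1:
        # Single-output prediction: the feature axis is dropped per block.
        return X.map_blocks(_predict_block, drop_axis=1, meta=meta)
    # Multi-output (e.g. multi-class) prediction; assumes the features sit in
    # a single chunk along axis 1 (rechunk first otherwise).
    return X.map_blocks(
        _predict_block,
        chunks=(X.chunks[0], (test_pred.shape[1],)),
        meta=meta,
    )
```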
* Remove the set-param call in the predict function, as it's neither thread safe
nor necessary now that we let dask decide the parallelism.
* Simplify prediction on `DaskDMatrix`.
This PR ensures all DMatrix types have a common interface.
* Fix the logic for avoiding duplicated DMatrix construction in sklearn.
* Check for consistency between DMatrix types.
* Add doc for bounds.
* [java] Extend the library loader to use both OS and CPU architecture.
* Simplify create_jni.py's architecture detection.
* Tidy up the architecture detection in create_jni.py.
The old (pre-fix) best_ntree_limit ignored the num_class parameter, which is incorrect. Previously we worked around this in the C++ layer to avoid possible breaking changes in other language bindings, but the Python interpretation stayed incorrect. The PR fixed that in Python by taking num_class into account, but didn't remove the old workaround, so the tree calculation in the predictor is incorrect; see PredictBatch in CPUPredictor.
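A hedged illustration of the relationship described above, not the library's exact code: with early stopping, the number of trees kept at prediction time must scale with the number of classes, because multi-class training grows `num_class` trees (times `num_parallel_tree`) per boosting round.

```python
# Illustrative only; the real attribute is computed inside the sklearn wrapper.
def best_ntree_limit(best_iteration: int, num_parallel_tree: int, num_class: int) -> int:
    num_groups = max(num_class, 1)  # num_class is 0/1 for non-multiclass objectives
    return (best_iteration + 1) * num_parallel_tree * num_groups


# Example: 3 classes, best_iteration=9, one parallel tree:
# 10 rounds * 1 tree * 3 classes = 30 trees used for prediction.
assert best_ntree_limit(9, 1, 3) == 30
```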
* Initial support for distributed LTR using dask.
* Support `qid` in libxgboost.
* Refactor `predict` and `n_features_in_`, `best_[score/iteration/ntree_limit]`
to avoid duplicated code.
* Define `DaskXGBRanker`.
The dask ranker doesn't support the group structure; instead it uses query ids and
converts them to a group pointer internally, as sketched below.
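A minimal sketch of that qid-to-group-pointer conversion, assuming rows are already sorted by query id; the function name is illustrative and not part of the dask ranker's API.

```python
import numpy as np


def qid_to_group_ptr(qid: np.ndarray) -> np.ndarray:
    """Turn per-row query ids into a CSR-style group pointer.

    e.g. qid = [0, 0, 0, 1, 1, 2] -> ptr = [0, 3, 5, 6]: three groups of
    sizes 3, 2 and 1.
    """
    qid = np.asarray(qid)
    # Indices where the query id changes, plus the two ends.
    boundaries = np.nonzero(qid[1:] != qid[:-1])[0] + 1
    return np.concatenate([[0], boundaries, [qid.size]])


assert qid_to_group_ptr(np.array([0, 0, 0, 1, 1, 2])).tolist() == [0, 3, 5, 6]
```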
* Update dmlc-core submodule and conform to new API
* Remove unsupported parameter from method signature
* Update dmlc-core
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
* For sklearn:
- Handle user-defined objective functions.
- Handle `softmax`.
* For dask:
- Use the implementation from sklearn; the previous implementation didn't perform any extra handling.
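A hedged sketch of the idea behind the sklearn handling above, not the wrapper's actual code: when the objective is `multi:softmax` or a user-defined callable, the booster's output cannot be taken as probabilities, so the margin output is converted with a softmax instead. `handle_proba` is an illustrative name.

```python
import numpy as np


def softmax(margin: np.ndarray) -> np.ndarray:
    # Row-wise, numerically stable softmax over class margins.
    shifted = margin - margin.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def handle_proba(raw_prediction: np.ndarray, objective) -> np.ndarray:
    # With ``multi:softmax`` or a user-defined callable objective, raw margins
    # are requested and converted to probabilities here; ``multi:softprob``
    # already yields probabilities and is passed through.
    if callable(objective) or objective == "multi:softmax":
        return softmax(raw_prediction)
    return raw_prediction
```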
* Calling XGBModel.fit() should clear the Booster by default
* Document the behavior of fit()
* Allow an sklearn object to be passed in directly via the xgb_model argument
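Illustrative usage of the behaviour described in the items above (a sketch, not a test from the PR; parameter values are arbitrary):

```python
import numpy as np
from xgboost import XGBClassifier

X = np.random.rand(100, 4)
y = np.random.randint(0, 2, size=100)

first = XGBClassifier(n_estimators=10).fit(X, y)

# Calling fit() again starts from a fresh Booster by default ...
first.fit(X, y)

# ... unless training is explicitly continued from an existing model, which
# may now be the fitted sklearn estimator itself rather than a raw Booster.
continued = XGBClassifier(n_estimators=10).fit(X, y, xgb_model=first)
```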
* Fix lint
For the `gamma-nloglik` eval metric, small positive values in the labels cause `NaN`s in the outputs, as reported in https://github.com/dmlc/xgboost/issues/5349. This adds clipping on the labels, similar to what is done in other metrics such as `poisson-nloglik` and `logloss`.
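A hedged Python transcription of the per-row gamma negative log-likelihood with the added label clipping; the epsilon value and the exact form used in the C++ metric may differ.

```python
import math
import numpy as np

EPS = 1e-16  # illustrative threshold only


def gamma_nloglik(y: np.ndarray, pred: np.ndarray) -> np.ndarray:
    y = np.maximum(y, EPS)  # the added clipping: keeps log(y) finite
    psi = 1.0               # dispersion fixed at 1, as in the built-in metric
    theta = -1.0 / pred
    a = psi
    b = -np.log(-theta)
    c = (1.0 / psi) * np.log(y / psi) - np.log(y) - math.lgamma(1.0 / psi)
    return -((y * theta - b) / a + c)
```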