xgboost

Author	SHA1	Message	Date
Rory Mitchell	90cce38236	Remove single_precision_histogram for gpu_hist (#7828 )	2022-05-03 14:53:19 +02:00
Jiaming Yuan	fdf533f2b9	[POC] Experimental support for l1 error. (#7812 ) Support adaptive tree, a feature supported by both sklearn and lightgbm. The tree leaf is recomputed based on residue of labels and predictions after construction. For l1 error, the optimal value is the median (50 percentile). This is marked as experimental support for the following reasons: - The value is not well defined for distributed training, where we might have empty leaves for local workers. Right now I just use the original leaf value for computing the average with other workers, which might cause significant errors. - Some follow-ups are required, for exact, pruner, and optimization for quantile function. Also, we need to calculate the initial estimation.	2022-04-26 21:41:55 +08:00
Jiaming Yuan	98d6faefd6	Implement slope for Pseudo-Huber. (#7727 ) * Add objective and metric. * Some refactoring for CPU/GPU dispatching using linalg module.	2022-03-14 21:42:38 +08:00
Jiaming Yuan	18a4af63aa	Update documents and tests. (#7659 ) * Revise documents after recent refactoring and cat support. * Add tests for behavior of max_depth and max_leaves.	2022-02-26 03:57:47 +08:00
Jiaming Yuan	83a66b4994	Support categorical data for hist. (#7695 ) * Extract partitioner from hist. * Implement categorical data support by passing the gradient index directly into the partitioner. * Organize/update document. * Remove code for negative hessian.	2022-02-25 03:47:14 +08:00
Jiaming Yuan	12949c6b31	[R] Implement feature weights. (#7660 )	2022-02-16 22:20:52 +08:00
Jiaming Yuan	0da7d872ef	[doc] Update for prediction. (#7648 )	2022-02-15 05:01:55 +08:00
Jiaming Yuan	0d0abe1845	Support optimal partitioning for GPU hist. (#7652 ) * Implement `MaxCategory` in quantile. * Implement partition-based split for GPU evaluation. Currently, it's based on the existing evaluation function. * Extract an evaluator from GPU Hist to store the needed states. * Added some CUDA stream/event utilities. * Update document with references. * Fixed a bug in approx evaluator where the number of data points is less than the number of categories.	2022-02-15 03:03:12 +08:00
Jiaming Yuan	001503186c	Rewrite approx (#7214 ) This PR rewrites the approx tree method to use codebase from hist for better performance and code sharing. The rewrite has many benefits: - Support for both `max_leaves` and `max_depth`. - Support for `grow_policy`. - Support for mono constraint. - Support for feature weights. - Support for easier bin configuration (`max_bin`). - Support for categorical data. - Faster performance for most of the datasets. (many times faster) - Support for prediction cache. - Significantly better performance for external memory. - Unites the code base between approx and hist.	2022-01-10 21:15:05 +08:00
Jiaming Yuan	54582f641a	[doc] Use cross references in sphinx doc. (#7522 ) * Use cross references instead of URL. * Fix auto doc for callback.	2022-01-05 03:21:25 +08:00
Harvey	1864fab592	Minor edits to Parameters doc page. (#7500 ) * bost -> both * doc improvement * use original filename * syntax highlight false * missed a few highlights	2021-12-07 15:46:44 +08:00
Jiaming Yuan	d4349426d8	Re-implement PR-AUC. (#7297 ) * Support binary/multi-class classification, ranking. * Add documents. * Handle missing data.	2021-10-26 13:07:50 +08:00
Jiaming Yuan	864d236a82	[doc] Remove `num_pbuffer`. [skip ci] (#7356 )	2021-10-22 14:12:32 +08:00
Jiaming Yuan	376b448015	[doc] Fix broken links. (#7341 ) * Fix most of the link checks from sphinx. * Remove duplicate explicit target name.	2021-10-20 14:45:30 +08:00
Jiaming Yuan	fbb0dc4275	Remove auto configuration of seed_per_iteration. (#7009 ) * Remove auto configuration of seed_per_iteration. This should be related to model recovery from rabit, which is removed. * Document.	2021-10-17 15:58:57 +08:00
Jiaming Yuan	7a1d67f9cb	[breaking] Use integer atomic for GPU histogram. (#7180 ) On GPU we use rounding factor to truncate the gradient for deterministic results. This PR changes the gradient representation to fixed point number with exponent aligned with rounding factor. [breaking] Drop non-deterministic histogram. Use fixed point for shared memory. This PR is to improve the performance of GPU Hist. Co-authored-by: Andy Adinets <aadinets@nvidia.com>	2021-08-28 05:17:05 +08:00
Jiaming Yuan	6bcbc77226	[doc] Fix typo. [skip ci] (#7170 )	2021-08-13 03:48:16 +08:00
Jiaming Yuan	7bdedacb54	Document for `process_type`. (#7135 ) * Update document for prune and refresh. * Add demo.	2021-08-03 13:11:52 +08:00
Andrew Ziem	3e7e426b36	Fix spelling in documents (#6948 ) * Update roxygen2 doc. Co-authored-by: fis <jm.yuan@outlook.com>	2021-05-11 20:44:36 +08:00
Jiaming Yuan	bcc0277338	Re-implement ROC-AUC. (#6747 ) * Re-implement ROC-AUC. * Binary * MultiClass * LTR * Add documents. This PR resolves a few issues: - Define a value when the dataset is invalid, which can happen if there's an empty dataset, or when the dataset contains only positive or negative values. - Define ROC-AUC for multi-class classification. - Define weighted average value for distributed setting. - A correct implementation for learning to rank task. Previous implementation is just binary classification with averaging across groups, which doesn't measure ordered learning to rank.	2021-03-20 16:52:40 +08:00
Philip Hyunsu Cho	366f3cb9d8	Add use_rmm flag to global configuration (#6656 ) * Ensure RMM is 0.18 or later * Add use_rmm flag to global configuration * Modify XGBCachingDeviceAllocatorImpl to skip CUB when use_rmm=True * Update the demo * [CI] Pin NumPy to 1.19.4, since NumPy 1.19.5 doesn't work with latest Shap	2021-03-09 14:53:05 -08:00
Jiaming Yuan	561809200a	Fix document for tree methods. (#6633 )	2021-01-25 15:52:08 +08:00
Jiaming Yuan	2b049b32e9	Document various tree methods. (#6564 )	2021-01-02 15:40:46 +08:00
Philip Hyunsu Cho	55bdf084cb	[Doc] Document that AUC and AUCPR are for binary classification/ranking [skip ci] (#5899 )	2020-12-06 22:17:20 -08:00
Philip Hyunsu Cho	fb56da5e8b	Add global configuration (#6414 ) * Add management functions for global configuration: XGBSetGlobalConfig(), XGBGetGlobalConfig(). * Add Python interface: set_config(), get_config(), and config_context(). * Add unit tests for Python * Add R interface: xgb.set.config(), xgb.get.config() * Add unit tests for R Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-12-03 00:05:18 -08:00
Jiaming Yuan	519cee115a	Avoid resetting seed for every configuration. (#6349 )	2020-11-06 10:28:35 +08:00
Jiaming Yuan	e8884c4637	Document tree method for feature weights. (#6312 )	2020-10-28 13:42:13 -07:00
Jiaming Yuan	81c37c28d5	Time the CPU tests on Jenkins. (#6257 ) * Time the CPU tests on Jenkins. * Reduce thread contention. * Add doc. * Skip heavy tests on ARM.	2020-10-20 17:19:07 -07:00
Christian Lorentzen	cf4f019ed6	[Breaking] Change default evaluation metric for classification to logloss / mlogloss (#6183 ) * Change DefaultEvalMetric of classification from error to logloss * Change default binary metric in plugin/example/custom_obj.cc * Set old error metric in python tests * Set old error metric in R tests * Fix missed eval metrics and typos in R tests * Fix setting eval_metric twice in R tests * Add warning for empty eval_metric for classification * Fix Dask tests Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-10-02 12:06:47 -07:00
Philip Hyunsu Cho	33577ef5d3	Add MAPE metric (#6119 )	2020-09-14 18:45:27 -07:00
Jiaming Yuan	4d99c58a5f	Feature weights (#5962 )	2020-08-18 19:55:41 +08:00
Jiaming Yuan	c3ea3b7e37	Fix nightly build doc. [skip ci] (#6004 ) * Fix nightly build doc. [skip ci] * Fix title too short. [skip ci]	2020-08-12 15:00:40 +08:00
Jiaming Yuan	0b2a26fa74	Remove skmaker. (#5971 )	2020-08-09 15:23:31 +08:00
Jiaming Yuan	18349a7ccf	[Breaking] Fix custom metric for multi output. (#5954 ) * Set output margin to true for custom metric. This fixes only R and Python.	2020-07-29 19:25:27 +08:00
Philip Hyunsu Cho	8d7702766a	[Doc] Document new objectives and metrics available on GPUs (#5909 )	2020-07-21 02:10:59 -07:00
ShvetsKS	cd3d14ad0e	Add float32 histogram (#5624 ) * new single_precision_histogram param was added. Co-authored-by: SHVETS, KIRILL <kirill.shvets@intel.com> Co-authored-by: fis <jm.yuan@outlook.com>	2020-06-03 11:24:53 +08:00
LionOrCatThatIsTheQuestion	83981a9ce3	Pseudo-huber loss metric added (#5647 ) - Add pseudo huber loss objective. - Add pseudo huber loss metric. Co-authored-by: Reetz <s02reetz@iavgroup.local>	2020-05-18 21:08:07 +08:00
Jiaming Yuan	c90457f489	Refactor the CLI. (#5574 ) * Enable parameter validation. * Enable JSON. * Catch `dmlc::Error`. * Show help message.	2020-04-26 10:56:33 +08:00
Jiaming Yuan	c355ab65ed	Enable parameter validation for R. (#5569 ) * Enable parameter validation for R. * Add test.	2020-04-21 11:19:09 -07:00
Jiaming Yuan	bb29ce2818	Add missing aft parameters. [skip ci] (#5553 )	2020-04-16 12:08:55 -07:00
Jiaming Yuan	4a0c8ef237	Update doc for parameter validation. (#5508 ) * Update doc for parameter validation. * Fix github rebase.	2020-04-11 00:43:46 +08:00
Jiaming Yuan	bd653fad4c	Remove distcol updater. (#5507 ) Closes #5498.	2020-04-10 12:52:56 +08:00
Jiaming Yuan	d0b86c75d9	Remove silent parameter. (#5476 )	2020-04-03 08:03:26 +08:00
Jiaming Yuan	8d06878bf9	Deterministic GPU histogram. (#5361 ) * Use pre-rounding based method to obtain reproducible floating point summation. * GPU Hist for regression and classification are bit-by-bit reproducible. * Add doc. * Switch to thrust reduce for `node_sum_gradient`.	2020-03-04 15:13:28 +08:00
Rong Ou	d6b31df449	update docs for gpu external memory (#5332 ) * update docs for gpu external memory * add hist limitation	2020-02-22 14:57:40 +08:00
Jiaming Yuan	40680368cf	Add constraint parameters to Scikit-Learn interface. (#5227 ) * Add document for constraints. * Fix a format error in doc for objective function.	2020-01-25 11:12:02 +08:00
Jiaming Yuan	7b65698187	Enforce correct data shape. (#5191 ) * Fix syncing DMatrix columns. * notes for tree method. * Enable feature validation for all interfaces except for jvm. * Better tests for boosting from predictions. * Disable validation on JVM.	2020-01-13 15:48:17 +08:00
Jiaming Yuan	ebc86a3afa	Disable parameter validation for Scikit-Learn interface. (#5167 ) * Disable parameter validation for now. Scikit-Learn passes all parameters down to XGBoost, whether they are used or not. * Add option `validate_parameters`.	2020-01-07 11:17:31 +08:00
Jiaming Yuan	63ffd2f686	Check against R seed. (#5125 ) * Handle it in R instead.	2019-12-17 19:14:59 +08:00
Jiaming Yuan	38763aa4fa	Update document for tree_method. [skip ci] (#5106 )	2019-12-09 22:55:00 +08:00

81 Commits