xgboost

Author	SHA1	Message	Date
Rong Ou	80339c3427	Enable distributed GPU training over Rabit (#7930)	2022-05-31 04:09:45 +08:00
Philip Hyunsu Cho	6f424d8d6c	[Doc] Warn against loading JSON from external source (#7918)	2022-05-18 17:02:36 -07:00
Bobby Wang	1496789561	[doc] update the doc for jvm model compatibility (#7907)	2022-05-16 14:05:26 +08:00
Sze Yeung	a06d53688c	Correct a mistake in Setting Parameters section (#7905)	2022-05-15 18:56:31 -07:00
Philip Hyunsu Cho	4cd14aee5a	Rename misspelled config parameter for pseudo-Huber (#7904)	2022-05-15 06:38:33 -07:00
Jiaming Yuan	1b6538b4e5	[breaking] Drop single precision histogram (#7892) Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2022-05-13 19:54:55 +08:00
Jiaming Yuan	8ab5e13b5d	Fix typo [skip ci] (#7861)	2022-05-04 18:34:45 +08:00
Jiaming Yuan	317d7be6ee	Always use partition based categorical splits. (#7857)	2022-05-03 22:30:32 +08:00
Rory Mitchell	90cce38236	Remove single_precision_histogram for gpu_hist (#7828)	2022-05-03 14:53:19 +02:00
Jiaming Yuan	fdf533f2b9	[POC] Experimental support for l1 error. (#7812) Support adaptive tree, a feature supported by both sklearn and lightgbm. The tree leaf is recomputed based on residue of labels and predictions after construction. For l1 error, the optimal value is the median (50 percentile). This is marked as experimental support for the following reasons: - The value is not well defined for distributed training, where we might have empty leaves for local workers. Right now I just use the original leaf value for computing the average with other workers, which might cause significant errors. - Some follow-ups are required, for exact, pruner, and optimization for quantile function. Also, we need to calculate the initial estimation.	2022-04-26 21:41:55 +08:00
Bobby Wang	bef1f939ce	[doc] remove the doc about killing SparkContext [skip ci] (#7840)	2022-04-25 19:29:16 +08:00
Bobby Wang	6ece549a90	[doc] update the jvm tutorial to 1.6.1 [skip ci] (#7834)	2022-04-24 14:25:22 +08:00
forestkey	c13a2a3114	[doc] "irrevelant" to "irrelevant" (#7832)	2022-04-22 16:54:30 +08:00
Jiaming Yuan	52d4eda786	Deprecate `use_label_encoder` in XGBClassifier. (#7822) * Deprecate `use_label_encoder` in XGBClassifier. * We have removed the encoder, now prepare to remove the indicator.	2022-04-21 13:14:02 +08:00
Bobby Wang	6f032b7152	[doc] fix a typo in jvm/index.rst (#7806)	2022-04-13 17:02:42 -07:00
Ikko Ashimine	56e4baff7c	[doc] Fix typo in build.rst (#7800) avaiable -> available	2022-04-13 16:45:26 +08:00
Bobby Wang	4b00c64d96	[doc] improve xgboost4j-spark-gpu doc [skip ci] (#7793) Co-authored-by: Sameer Raheja <sameerz@users.noreply.github.com>	2022-04-12 12:02:16 +08:00
Bobby Wang	89d6419fd5	[jvm-packages] add doc for xgboost4j-spark-gpu (#7779) Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2022-04-07 11:35:01 +08:00
Jiaming Yuan	bcce17e688	Remove text loading in basic walk through demo. (#7753)	2022-04-01 00:59:42 +08:00
giuliohome	c467e90ac1	[doc] Update doc for Kubernetes Operator (#7777)	2022-03-31 23:10:49 +08:00
Jiaming Yuan	522636cb52	Bump version. (#7769)	2022-03-31 06:33:22 +08:00
Jiaming Yuan	a50b84244e	Cleanup configuration for constraints. (#7758)	2022-03-29 04:22:46 +08:00
Jiaming Yuan	3c9b04460a	Move `num_parallel_tree` to model parameter. (#7751) The size of forest should be a property of model itself instead of a training hyper-parameter.	2022-03-29 02:32:42 +08:00
Philip Hyunsu Cho	66cb4afc6c	Update install doc (#7747)	2022-03-23 17:20:01 +08:00
Jiaming Yuan	4d81c741e9	External memory support for hist (#7531) * Generate column matrix from gHistIndex. * Avoid synchronization with the sparse page once the cache is written. * Cleanups: Remove member variables/functions, change the update routine to look like approx and gpu_hist. * Remove pruner.	2022-03-22 00:13:20 +08:00
Jiaming Yuan	98d6faefd6	Implement slope for Pseduo-Huber. (#7727) * Add objective and metric. * Some refactoring for CPU/GPU dispatching using linalg module.	2022-03-14 21:42:38 +08:00
Jiaming Yuan	18a4af63aa	Update documents and tests. (#7659) * Revise documents after recent refactoring and cat support. * Add tests for behavior of max_depth and max_leaves.	2022-02-26 03:57:47 +08:00
Jiaming Yuan	83a66b4994	Support categorical data for hist. (#7695) * Extract partitioner from hist. * Implement categorical data support by passing the gradient index directly into the partitioner. * Organize/update document. * Remove code for negative hessian.	2022-02-25 03:47:14 +08:00
Jiaming Yuan	49c74a5369	Update R package description. (#7691) * Change role. * Remove cmake file when building the package.	2022-02-23 08:36:37 +08:00
Jiaming Yuan	584bae1fc6	Fix document build with scikit-learn (#7684) * Require sphinx >= 4.4 for RTD. * Install sklearn.	2022-02-22 08:58:54 +08:00
Jiaming Yuan	14d61b0141	[doc] Update document for building from source. (#7664) - Mention standard install command for R package. - Remove repeated "get source" step. - Remove troubleshooting on Windows. It's outdated considering VS 2022 is already out.	2022-02-19 04:57:03 +08:00
Jiaming Yuan	12949c6b31	[R] Implement feature weights. (#7660)	2022-02-16 22:20:52 +08:00
Jiaming Yuan	93eebe8664	[doc] Fix broken link. [skip ci] (#7655)	2022-02-15 14:07:34 +08:00
Jiaming Yuan	0da7d872ef	[doc] Update for prediction. (#7648)	2022-02-15 05:01:55 +08:00
Jiaming Yuan	0d0abe1845	Support optimal partitioning for GPU hist. (#7652) * Implement `MaxCategory` in quantile. * Implement partition-based split for GPU evaluation. Currently, it's based on the existing evaluation function. * Extract an evaluator from GPU Hist to store the needed states. * Added some CUDA stream/event utilities. * Update document with references. * Fixed a bug in approx evaluator where the number of data points is less than the number of categories.	2022-02-15 03:03:12 +08:00
Jiaming Yuan	5cd1f71b51	[dask] Improve configuration for port. (#7645) - Try port 0 to let the OS return the available port. - Add port configuration.	2022-02-14 21:34:34 +08:00
Philip Hyunsu Cho	f6e6d0b2c0	[CI] Build Python wheels for MacOS (x86_64 and arm64) (#7621) * Build Python wheels for OSX (x86_64 and arm64) * Use Conda's libomp when running Python tests * fix * Add comment to explain CIBW_TARGET_OSX_ARM64 * Update release script * Add comments in build_python_wheels.sh * Document wheel pipeline	2022-02-02 17:35:48 -08:00
Philip Hyunsu Cho	271a7c5d43	[Doc] fix typo in install doc (#7623)	2022-01-31 13:35:56 -08:00
Philip Hyunsu Cho	f21301c749	[Doc] Add instruction to install XGBoost for Apple Silicon using Conda (#7612)	2022-01-28 01:06:39 -08:00
Jiaming Yuan	ef4dae4c0e	[dask] Add scheduler address to dask config. (#7581) - Add user configuration. - Bring back to the logic of using scheduler address from dask. This was removed when we were trying to support GKE, now we bring it back and let xgboost try it if direct guess or host IP from user config failed.	2022-01-22 01:56:32 +08:00
Jiaming Yuan	b4ec1682c6	Update document for multi output and categorical. (#7574) * Group together categorical related parameters. * Update documents about multioutput and categorical.	2022-01-19 04:35:17 +08:00
Jiaming Yuan	dac9eb13bd	Implement new `save_raw` in Python. (#7572) * Expose the new C API function to Python. * Remove old document and helper script. * Small optimization to the `save_raw` and Json ctors.	2022-01-19 02:27:51 +08:00
Jiaming Yuan	deab0e32ba	Validate out of range categorical value. (#7576) * Use float in CPU categorical set to preserve the input value. * Check out of range values.	2022-01-18 20:16:19 +08:00
Jiaming Yuan	a1bcd33a3b	[breaking] Change internal model serialization to UBJSON. (#7556) * Use typed array for models. * Change the memory snapshot format. * Add new C API for saving to raw format.	2022-01-16 02:11:53 +08:00
Jiaming Yuan	e5e47c3c99	Clarify the behavior of invalid categorical value handling. (#7529)	2022-01-13 16:11:52 +08:00
Jiaming Yuan	001503186c	Rewrite approx (#7214) This PR rewrites the approx tree method to use codebase from hist for better performance and code sharing. The rewrite has many benefits: - Support for both `max_leaves` and `max_depth`. - Support for `grow_policy`. - Support for mono constraint. - Support for feature weights. - Support for easier bin configuration (`max_bin`). - Support for categorical data. - Faster performance for most of the datasets. (many times faster) - Support for prediction cache. - Significantly better performance for external memory. - Unites the code base between approx and hist.	2022-01-10 21:15:05 +08:00
Jiaming Yuan	ec56d5869b	[doc] Include dask examples into doc. (#7530)	2022-01-05 03:27:22 +08:00
Jiaming Yuan	54582f641a	[doc] Use cross references in sphinx doc. (#7522) * Use cross references instead of URL. * Fix auto doc for callback.	2022-01-05 03:21:25 +08:00
Jiaming Yuan	8f0a42a266	Initial support for multi-label classification. (#7521) * Add support in sklearn classifier.	2022-01-04 23:58:21 +08:00
Randall Britten	a4a0ebb85d	[doc] Lowercase omega for per tree complexity (#7532) As suggested on issue #7480	2021-12-29 23:05:54 +08:00

712 Commits