xgboost

Author	SHA1	Message	Date
Jiaming Yuan	4d81c741e9	External memory support for hist (#7531 ) * Generate column matrix from gHistIndex. * Avoid synchronization with the sparse page once the cache is written. * Cleanups: Remove member variables/functions, change the update routine to look like approx and gpu_hist. * Remove pruner.	2022-03-22 00:13:20 +08:00
Jiaming Yuan	98d6faefd6	Implement slope for Pseudo-Huber. (#7727 ) * Add objective and metric. * Some refactoring for CPU/GPU dispatching using linalg module.	2022-03-14 21:42:38 +08:00
Jiaming Yuan	18a4af63aa	Update documents and tests. (#7659 ) * Revise documents after recent refactoring and cat support. * Add tests for behavior of max_depth and max_leaves.	2022-02-26 03:57:47 +08:00
Jiaming Yuan	83a66b4994	Support categorical data for hist. (#7695 ) * Extract partitioner from hist. * Implement categorical data support by passing the gradient index directly into the partitioner. * Organize/update document. * Remove code for negative hessian.	2022-02-25 03:47:14 +08:00
Jiaming Yuan	49c74a5369	Update R package description. (#7691 ) * Change role. * Remove cmake file when building the package.	2022-02-23 08:36:37 +08:00
Jiaming Yuan	584bae1fc6	Fix document build with scikit-learn (#7684 ) * Require sphinx >= 4.4 for RTD. * Install sklearn.	2022-02-22 08:58:54 +08:00
Jiaming Yuan	14d61b0141	[doc] Update document for building from source. (#7664 ) - Mention standard install command for R package. - Remove repeated "get source" step. - Remove troubleshooting on Windows. It's outdated considering VS 2022 is already out.	2022-02-19 04:57:03 +08:00
Jiaming Yuan	12949c6b31	[R] Implement feature weights. (#7660 )	2022-02-16 22:20:52 +08:00
Jiaming Yuan	93eebe8664	[doc] Fix broken link. [skip ci] (#7655 )	2022-02-15 14:07:34 +08:00
Jiaming Yuan	0da7d872ef	[doc] Update for prediction. (#7648 )	2022-02-15 05:01:55 +08:00
Jiaming Yuan	0d0abe1845	Support optimal partitioning for GPU hist. (#7652 ) * Implement `MaxCategory` in quantile. * Implement partition-based split for GPU evaluation. Currently, it's based on the existing evaluation function. * Extract an evaluator from GPU Hist to store the needed states. * Added some CUDA stream/event utilities. * Update document with references. * Fixed a bug in approx evaluator where the number of data points is less than the number of categories.	2022-02-15 03:03:12 +08:00
Jiaming Yuan	5cd1f71b51	[dask] Improve configuration for port. (#7645 ) - Try port 0 to let the OS return the available port. - Add port configuration.	2022-02-14 21:34:34 +08:00
Philip Hyunsu Cho	f6e6d0b2c0	[CI] Build Python wheels for MacOS (x86_64 and arm64) (#7621 ) * Build Python wheels for OSX (x86_64 and arm64) * Use Conda's libomp when running Python tests * fix * Add comment to explain CIBW_TARGET_OSX_ARM64 * Update release script * Add comments in build_python_wheels.sh * Document wheel pipeline	2022-02-02 17:35:48 -08:00
Philip Hyunsu Cho	271a7c5d43	[Doc] fix typo in install doc (#7623 )	2022-01-31 13:35:56 -08:00
Philip Hyunsu Cho	f21301c749	[Doc] Add instruction to install XGBoost for Apple Silicon using Conda (#7612 )	2022-01-28 01:06:39 -08:00
Jiaming Yuan	ef4dae4c0e	[dask] Add scheduler address to dask config. (#7581 ) - Add user configuration. - Bring back the logic of using the scheduler address from dask. This was removed when we were trying to support GKE; now we bring it back and let xgboost try it if the direct guess or the host IP from user config fails.	2022-01-22 01:56:32 +08:00
Jiaming Yuan	b4ec1682c6	Update document for multi output and categorical. (#7574 ) * Group together categorical related parameters. * Update documents about multioutput and categorical.	2022-01-19 04:35:17 +08:00
Jiaming Yuan	dac9eb13bd	Implement new `save_raw` in Python. (#7572 ) * Expose the new C API function to Python. * Remove old document and helper script. * Small optimization to the `save_raw` and Json ctors.	2022-01-19 02:27:51 +08:00
Jiaming Yuan	deab0e32ba	Validate out of range categorical value. (#7576 ) * Use float in CPU categorical set to preserve the input value. * Check out of range values.	2022-01-18 20:16:19 +08:00
Jiaming Yuan	a1bcd33a3b	[breaking] Change internal model serialization to UBJSON. (#7556 ) * Use typed array for models. * Change the memory snapshot format. * Add new C API for saving to raw format.	2022-01-16 02:11:53 +08:00
Jiaming Yuan	e5e47c3c99	Clarify the behavior of invalid categorical value handling. (#7529 )	2022-01-13 16:11:52 +08:00
Jiaming Yuan	001503186c	Rewrite approx (#7214 ) This PR rewrites the approx tree method to use the codebase from hist for better performance and code sharing. The rewrite has many benefits: - Support for both `max_leaves` and `max_depth`. - Support for `grow_policy`. - Support for monotonic constraints. - Support for feature weights. - Support for easier bin configuration (`max_bin`). - Support for categorical data. - Faster performance for most datasets (many times faster). - Support for prediction cache. - Significantly better performance for external memory. - Unites the code base between approx and hist.	2022-01-10 21:15:05 +08:00
Jiaming Yuan	ec56d5869b	[doc] Include dask examples into doc. (#7530 )	2022-01-05 03:27:22 +08:00
Jiaming Yuan	54582f641a	[doc] Use cross references in sphinx doc. (#7522 ) * Use cross references instead of URL. * Fix auto doc for callback.	2022-01-05 03:21:25 +08:00
Jiaming Yuan	8f0a42a266	Initial support for multi-label classification. (#7521 ) * Add support in sklearn classifier.	2022-01-04 23:58:21 +08:00
Randall Britten	a4a0ebb85d	[doc] Lowercase omega for per tree complexity (#7532 ) As suggested on issue #7480	2021-12-29 23:05:54 +08:00
Jiaming Yuan	a512b4b394	[doc] Promote dask from experimental. [skip ci] (#7509 )	2021-12-16 14:17:06 +08:00
Harvey	1864fab592	Minor edits to Parameters doc page. (#7500 ) * bost -> both * doc improvement * use original filename * syntax highlight false * missed a few highlights	2021-12-07 15:46:44 +08:00
danmarinescu	6f38f5affa	Updated CMake version requirement in build.rst (#7487 ) The documentation states that to build from source you need CMake 3.13 or higher. However, according to https://github.com/dmlc/xgboost/blob/master/CMakeLists.txt#L1 CMake 3.14 or higher is required.	2021-11-27 09:58:01 +08:00
Jiaming Yuan	c024c42dce	Modernize XGBoost Python document. (#7468 ) * Use sphinx gallery to integrate examples. * Remove mock objects. * Add dask doc inventory.	2021-11-23 23:24:52 +08:00
Jiaming Yuan	d33854af1b	[Breaking] Accept multi-dim meta info. (#7405 ) This PR changes base_margin into a 3-dim array, with one dimension reserved for multi-target classification. Also, a breaking change is made to binary serialization due to the extra dimension, along with a fix for saving the feature weights. Lastly, it unifies the prediction initialization between CPU and GPU. After this PR, the meta info setter in Python will be based on the array interface.	2021-11-18 23:02:54 +08:00
Philip Hyunsu Cho	2adf222fb2	[CI] CI cost saving (#7407 ) * [CI] Drop CUDA 10.1; Require 11.0 * Change NCCL version * Use CUDA 10.1 for clang-tidy, for now * Remove JDK 11 and 12 * Fix NCCL version * Don't require 11.0 just yet, until clang-tidy is fixed * Skip MultiClassesSerializationTest.GpuHist	2021-11-17 21:02:20 -08:00
Jiaming Yuan	97d7582457	Delay breaking changes to 1.6. (#7420 ) The patch is too big to be backported.	2021-11-12 16:46:03 +08:00
Jiaming Yuan	8df0a252b7	[doc] Update document for GPU. [skip ci] (#7403 ) * Remove outdated workaround and description.	2021-11-09 02:05:55 +08:00
Jiaming Yuan	c968217ca8	[R] Fix global feature importance and predict with 1 sample. (#7394 ) * [R] Fix global feature importance. * Add implementation for tree index. The parameter is not documented in C API since we should work on porting the model slicing to R instead of supporting more use of tree index. * Fix the difference between "gain" and "total_gain". * debug. * Fix prediction.	2021-11-05 10:07:00 +08:00
Jiaming Yuan	48aff0eabd	[doc][jvm-packages] Update information about Python tracker. [skip ci] (#7396 )	2021-11-05 05:55:13 +08:00
Jiaming Yuan	232144ca09	Add note about CRAN release [skip ci] (#7395 )	2021-11-05 00:34:14 +08:00
Jiaming Yuan	32e673d8c4	Support building with CTK11.5. (#7379 ) * Support building with CTK11.5. * Require system cub installation for CTK11.4+. * Check thrust version for segmented sort.	2021-11-02 16:22:26 +08:00
Jiaming Yuan	45aef75cca	Move skl `eval_metric` and `early_stopping_rounds` to model params. (#6751 ) A new parameter `custom_metric` is added to `train` and `cv` to distinguish the behaviour from the old `feval`, and `feval` is deprecated. The new `custom_metric` receives the transformed prediction when a built-in objective is used. This enables XGBoost to use cost functions from other libraries like scikit-learn directly without going through the definition of the link function. `eval_metric` and `early_stopping_rounds` in the sklearn interface are moved from `fit` to `__init__` and are now saved as part of the scikit-learn model. The old ones in the `fit` function are now deprecated. The new `eval_metric` in `__init__` has the same new behaviour as `custom_metric`. Added more detailed documents for the behaviour of custom objective and metric.	2021-10-28 17:20:20 +08:00
Jiaming Yuan	b9414b6477	Update GPU doc for PR-AUC. [skip ci] (#7368 )	2021-10-27 16:31:07 +08:00
Jiaming Yuan	d4349426d8	Re-implement PR-AUC. (#7297 ) * Support binary/multi-class classification, ranking. * Add documents. * Handle missing data.	2021-10-26 13:07:50 +08:00
Jiaming Yuan	e36b066344	[doc] Document the status of RTD hosting. [skip ci] (#7353 )	2021-10-22 14:12:55 +08:00
Jiaming Yuan	864d236a82	[doc] Remove `num_pbuffer`. [skip ci] (#7356 )	2021-10-22 14:12:32 +08:00
Jiaming Yuan	15685996fc	[doc] Small improvements for categorical data document. (#7330 )	2021-10-20 18:04:32 +08:00
Philip Hyunsu Cho	b8e8f0fcd9	[doc] Use latest Sphinx RTD theme (#7347 )	2021-10-20 00:04:43 -07:00
Jiaming Yuan	3b0b74fa94	[doc] Use RTD theme. (#7346 )	2021-10-19 23:49:19 -07:00
Jiaming Yuan	376b448015	[doc] Fix broken links. (#7341 ) * Fix most of the link checks from sphinx. * Remove duplicate explicit target name.	2021-10-20 14:45:30 +08:00
Jiaming Yuan	5ff210ed75	Small fix for the release doc and script. [skip ci] (#7332 ) Add Philip as co-maintainer of maven packages.	2021-10-20 12:49:12 +08:00
Jiaming Yuan	fbb0dc4275	Remove auto configuration of seed_per_iteration. (#7009 ) * Remove auto configuration of seed_per_iteration. This should be related to model recovery from rabit, which is removed. * Document.	2021-10-17 15:58:57 +08:00
Jiaming Yuan	e6a142fe70	Fix document about best_iteration (#7324 )	2021-10-14 15:30:46 -07:00

638 Commits