WeichenXu
f23cc92130
[pyspark] User guide doc and tutorials ( #8082 )
...
Co-authored-by: Bobby Wang <wbo4958@gmail.com>
2022-07-19 22:25:14 +08:00
Jiaming Yuan
e28f6f6657
[doc] Integrate pyspark module into sphinx doc [skip ci] ( #8066 )
2022-07-17 10:46:09 +08:00
Rong Ou
e5ec546da5
[Breaking] Remove rabit support for custom reductions and grow_local_histmaker updater ( #7992 )
2022-06-21 15:08:23 +08:00
Philip Hyunsu Cho
1ced638165
Document how to reproduce Docker environment from Jenkins ( #7971 )
2022-06-04 20:56:53 +09:00
Jiaming Yuan
b90c6d25e8
Implement max_cat_threshold for CPU. ( #7957 )
2022-06-04 11:02:46 +08:00
Bobby Wang
5a7dc41351
[doc] update doc for dumping model to be json or ubj for jvm packages ( #7955 )
2022-05-31 14:43:13 +08:00
Rong Ou
80339c3427
Enable distributed GPU training over Rabit ( #7930 )
2022-05-31 04:09:45 +08:00
Philip Hyunsu Cho
6f424d8d6c
[Doc] Warn against loading JSON from external source ( #7918 )
2022-05-18 17:02:36 -07:00
Bobby Wang
1496789561
[doc] update the doc for jvm model compatibility ( #7907 )
2022-05-16 14:05:26 +08:00
Sze Yeung
a06d53688c
Correct a mistake in Setting Parameters section ( #7905 )
2022-05-15 18:56:31 -07:00
Philip Hyunsu Cho
4cd14aee5a
Rename misspelled config parameter for pseudo-Huber ( #7904 )
2022-05-15 06:38:33 -07:00
Jiaming Yuan
1b6538b4e5
[breaking] Drop single precision histogram ( #7892 )
...
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2022-05-13 19:54:55 +08:00
Jiaming Yuan
8ab5e13b5d
Fix typo [skip ci] ( #7861 )
2022-05-04 18:34:45 +08:00
Jiaming Yuan
317d7be6ee
Always use partition based categorical splits. ( #7857 )
2022-05-03 22:30:32 +08:00
Rory Mitchell
90cce38236
Remove single_precision_histogram for gpu_hist ( #7828 )
2022-05-03 14:53:19 +02:00
Jiaming Yuan
fdf533f2b9
[POC] Experimental support for l1 error. ( #7812 )
...
Support adaptive trees, a feature supported by both sklearn and lightgbm. The tree leaf is recomputed based on the residuals of labels and predictions after construction.
For l1 error, the optimal value is the median (50th percentile).
This is marked as experimental support for the following reasons:
- The value is not well defined for distributed training, where we might have empty leaves for local workers. Right now I just use the original leaf value for computing the average with other workers, which might cause significant errors.
- Some follow-ups are required for exact, the pruner, and optimization of the quantile function. Also, we need to calculate the initial estimation.
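A minimal sketch of why the median is the leaf value used for l1 error, in plain Python. This is not XGBoost's implementation. The function names and the sample residuals are made up for illustration, and the sketch ignores sample weights and distributed training.

```python
import statistics


def l1_loss(residuals, leaf_value):
    """Sum of absolute errors if every row in the leaf predicts leaf_value."""
    return sum(abs(r - leaf_value) for r in residuals)


def l1_optimal_leaf(residuals):
    """Hypothetical helper: the median of the residuals minimizes l1_loss."""
    return statistics.median(residuals)


# Made-up residuals (label - prediction) for the rows that fall in one leaf.
residuals = [-2.0, -0.5, 0.1, 0.4, 9.0]

median_leaf = l1_optimal_leaf(residuals)
mean_leaf = sum(residuals) / len(residuals)

# The outlier (9.0) pulls the mean away, while the median stays put,
# so the median gives the lower absolute error.
print(median_leaf, l1_loss(residuals, median_leaf))
print(mean_leaf, l1_loss(residuals, mean_leaf))
```

The same comparison shows why an empty local leaf is a problem in the distributed case: there are no residuals to take a median of, so some fallback value has to be chosen.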
2022-04-26 21:41:55 +08:00
Bobby Wang
bef1f939ce
[doc] remove the doc about killing SparkContext [skip ci] ( #7840 )
2022-04-25 19:29:16 +08:00
Bobby Wang
6ece549a90
[doc] update the jvm tutorial to 1.6.1 [skip ci] ( #7834 )
2022-04-24 14:25:22 +08:00
forestkey
c13a2a3114
[doc] "irrevelant" to "irrelevant" ( #7832 )
2022-04-22 16:54:30 +08:00
Jiaming Yuan
52d4eda786
Deprecate use_label_encoder in XGBClassifier. ( #7822 )
...
* Deprecate `use_label_encoder` in XGBClassifier.
* We have removed the encoder; now prepare to remove the indicator.
2022-04-21 13:14:02 +08:00
Bobby Wang
6f032b7152
[doc] fix a typo in jvm/index.rst ( #7806 )
2022-04-13 17:02:42 -07:00
Ikko Ashimine
56e4baff7c
[doc] Fix typo in build.rst ( #7800 )
...
avaiable -> available
2022-04-13 16:45:26 +08:00
Bobby Wang
4b00c64d96
[doc] improve xgboost4j-spark-gpu doc [skip ci] ( #7793 )
...
Co-authored-by: Sameer Raheja <sameerz@users.noreply.github.com>
2022-04-12 12:02:16 +08:00
Bobby Wang
89d6419fd5
[jvm-packages] add doc for xgboost4j-spark-gpu ( #7779 )
...
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2022-04-07 11:35:01 +08:00
Jiaming Yuan
bcce17e688
Remove text loading in basic walk through demo. ( #7753 )
2022-04-01 00:59:42 +08:00
giuliohome
c467e90ac1
[doc] Update doc for Kubernetes Operator ( #7777 )
2022-03-31 23:10:49 +08:00
Jiaming Yuan
522636cb52
Bump version. ( #7769 )
2022-03-31 06:33:22 +08:00
Jiaming Yuan
a50b84244e
Cleanup configuration for constraints. ( #7758 )
2022-03-29 04:22:46 +08:00
Jiaming Yuan
3c9b04460a
Move num_parallel_tree to model parameter. ( #7751 )
...
The size of forest should be a property of model itself instead of a training
hyper-parameter.
2022-03-29 02:32:42 +08:00
Philip Hyunsu Cho
66cb4afc6c
Update install doc ( #7747 )
2022-03-23 17:20:01 +08:00
Jiaming Yuan
4d81c741e9
External memory support for hist ( #7531 )
...
* Generate column matrix from gHistIndex.
* Avoid synchronization with the sparse page once the cache is written.
* Cleanups: Remove member variables/functions, change the update routine to look like approx and gpu_hist.
* Remove pruner.
2022-03-22 00:13:20 +08:00
Jiaming Yuan
98d6faefd6
Implement slope for Pseudo-Huber. ( #7727 )
...
* Add objective and metric.
* Some refactoring for CPU/GPU dispatching using linalg module.
2022-03-14 21:42:38 +08:00
Jiaming Yuan
18a4af63aa
Update documents and tests. ( #7659 )
...
* Revise documents after recent refactoring and cat support.
* Add tests for behavior of max_depth and max_leaves.
2022-02-26 03:57:47 +08:00
Jiaming Yuan
83a66b4994
Support categorical data for hist. ( #7695 )
...
* Extract partitioner from hist.
* Implement categorical data support by passing the gradient index directly into the partitioner.
* Organize/update document.
* Remove code for negative hessian.
2022-02-25 03:47:14 +08:00
Jiaming Yuan
49c74a5369
Update R package description. ( #7691 )
...
* Change role.
* Remove cmake file when building the package.
2022-02-23 08:36:37 +08:00
Jiaming Yuan
584bae1fc6
Fix document build with scikit-learn ( #7684 )
...
* Require sphinx >= 4.4 for RTD.
* Install sklearn.
2022-02-22 08:58:54 +08:00
Jiaming Yuan
14d61b0141
[doc] Update document for building from source. ( #7664 )
...
- Mention standard install command for R package.
- Remove repeated "get source" step.
- Remove troubleshooting on Windows. It's outdated considering VS 2022 is already out.
2022-02-19 04:57:03 +08:00
Jiaming Yuan
12949c6b31
[R] Implement feature weights. ( #7660 )
2022-02-16 22:20:52 +08:00
Jiaming Yuan
93eebe8664
[doc] Fix broken link. [skip ci] ( #7655 )
2022-02-15 14:07:34 +08:00
Jiaming Yuan
0da7d872ef
[doc] Update for prediction. ( #7648 )
2022-02-15 05:01:55 +08:00
Jiaming Yuan
0d0abe1845
Support optimal partitioning for GPU hist. ( #7652 )
...
* Implement `MaxCategory` in quantile.
* Implement partition-based split for GPU evaluation. Currently, it's based on the existing evaluation function.
* Extract an evaluator from GPU Hist to store the needed states.
* Added some CUDA stream/event utilities.
* Update document with references.
* Fixed a bug in approx evaluator where the number of data points is less than the number of categories.
2022-02-15 03:03:12 +08:00
Jiaming Yuan
5cd1f71b51
[dask] Improve configuration for port. ( #7645 )
...
- Try port 0 to let the OS return the available port.
- Add port configuration.
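The "port 0" technique can be sketched with the Python standard library: binding a socket to port 0 asks the operating system to pick a free port. This is an illustration only, not the code from the commit. `find_free_port` is a hypothetical helper name, and the loopback host is an assumption.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Bind to port 0 so the OS assigns an available port, then report it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # getsockname() returns (host, port); the port is the one the OS chose.
        return sock.getsockname()[1]


port = find_free_port()
print(port)
```

The socket is closed before the port number is returned, so another process could take the port in between. Keeping the socket open and handing it to the consumer avoids that race.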
2022-02-14 21:34:34 +08:00
Philip Hyunsu Cho
f6e6d0b2c0
[CI] Build Python wheels for MacOS (x86_64 and arm64) ( #7621 )
...
* Build Python wheels for OSX (x86_64 and arm64)
* Use Conda's libomp when running Python tests
* fix
* Add comment to explain CIBW_TARGET_OSX_ARM64
* Update release script
* Add comments in build_python_wheels.sh
* Document wheel pipeline
2022-02-02 17:35:48 -08:00
Philip Hyunsu Cho
271a7c5d43
[Doc] fix typo in install doc ( #7623 )
2022-01-31 13:35:56 -08:00
Philip Hyunsu Cho
f21301c749
[Doc] Add instruction to install XGBoost for Apple Silicon using Conda ( #7612 )
2022-01-28 01:06:39 -08:00
Jiaming Yuan
ef4dae4c0e
[dask] Add scheduler address to dask config. ( #7581 )
...
- Add user configuration.
- Bring back the logic of using the scheduler address from dask. This was removed when we were trying to support GKE; now we bring it back and let xgboost try it if the direct guess or the host IP from the user config fails.
2022-01-22 01:56:32 +08:00
Jiaming Yuan
b4ec1682c6
Update document for multi output and categorical. ( #7574 )
...
* Group together categorical related parameters.
* Update documents about multioutput and categorical.
2022-01-19 04:35:17 +08:00
Jiaming Yuan
dac9eb13bd
Implement new save_raw in Python. ( #7572 )
...
* Expose the new C API function to Python.
* Remove old document and helper script.
* Small optimization to the `save_raw` and Json ctors.
2022-01-19 02:27:51 +08:00
Jiaming Yuan
deab0e32ba
Validate out of range categorical value. ( #7576 )
...
* Use float in CPU categorical set to preserve the input value.
* Check out of range values.
2022-01-18 20:16:19 +08:00
Jiaming Yuan
a1bcd33a3b
[breaking] Change internal model serialization to UBJSON. ( #7556 )
...
* Use typed array for models.
* Change the memory snapshot format.
* Add new C API for saving to raw format.
2022-01-16 02:11:53 +08:00