Philip Hyunsu Cho
9fd510faa5
[CI] Clarify steps for publishing artifacts to Maven Central ( #7582 )
2022-01-20 14:23:07 -08:00
Jiaming Yuan
529cf8a54a
Configure cub version automatically. ( #7579 )
...
Note that when the cub bundled with CUDA is used, XGBoost checks the input size
instead of using an internal cub function to accept inputs larger than the maximum integer.
2022-01-20 19:49:26 +08:00
Jiaming Yuan
ac7a36367c
[jvm-packages] Implement new save_raw in jvm-packages. ( #7570 )
...
* New `toByteArray` that accepts a parameter for format.
2022-01-19 16:00:14 +08:00
Jiaming Yuan
b4ec1682c6
Update document for multi output and categorical. ( #7574 )
...
* Group together categorical related parameters.
* Update documents about multioutput and categorical.
2022-01-19 04:35:17 +08:00
Jiaming Yuan
dac9eb13bd
Implement new save_raw in Python. ( #7572 )
...
* Expose the new C API function to Python.
* Remove old document and helper script.
* Small optimization to the `save_raw` and Json ctors.
2022-01-19 02:27:51 +08:00
Jiaming Yuan
9f20a3315e
Test with latest numpy. ( #7573 )
2022-01-19 00:46:23 +08:00
Jiaming Yuan
bb56bb9a13
Fix merge conflict. ( #7577 )
2022-01-18 23:01:34 +08:00
Jiaming Yuan
cc06fab9a7
Support distributed CPU env for categorical data. ( #7575 )
...
* Add support for cat data in sketch allreduce.
* Share tests between CPU and GPU.
2022-01-18 21:56:07 +08:00
Jiaming Yuan
deab0e32ba
Validate out of range categorical value. ( #7576 )
...
* Use float in CPU categorical set to preserve the input value.
* Check out of range values.
2022-01-18 20:16:19 +08:00
Jiaming Yuan
d6ea5cc1ed
Cover approx tree method for categorical data tests. ( #7569 )
...
* Add tree to df tests.
* Add plotting tests.
* Add histogram tests.
2022-01-16 11:31:40 +08:00
Jiaming Yuan
465dc63833
Fix tree param feature type. ( #7565 )
2022-01-16 04:46:29 +08:00
Jiaming Yuan
a1bcd33a3b
[breaking] Change internal model serialization to UBJSON. ( #7556 )
...
* Use typed array for models.
* Change the memory snapshot format.
* Add new C API for saving to raw format.
2022-01-16 02:11:53 +08:00
Jiaming Yuan
13b0fa4b97
Implement get_group. ( #7564 )
2022-01-16 02:07:42 +08:00
Jiaming Yuan
52277cc3da
Rename build info function to be consistent with rest of the API. ( #7553 )
2022-01-14 00:39:28 +08:00
Jiaming Yuan
e94b766310
Fix early stopping with linear model. ( #7554 )
2022-01-13 21:53:06 +08:00
Jiaming Yuan
e5e47c3c99
Clarify the behavior of invalid categorical value handling. ( #7529 )
2022-01-13 16:11:52 +08:00
Philip Hyunsu Cho
20c0d60ac7
Restore functionality of max_depth=0 in hist ( #7551 )
...
* Restore functionality of max_depth=0 in hist
* Add test case
2022-01-11 01:37:44 +08:00
Jiaming Yuan
2db808021d
Silence some warnings for unused variables. ( #7548 )
2022-01-11 01:16:26 +08:00
Jiaming Yuan
c635d4c46a
Implement ubjson. ( #7549 )
...
* Implement ubjson.
This is a partial implementation of UBJSON with support for typed arrays. Some missing
features are `f64`, typed object, and the no-op.
2022-01-10 23:24:23 +08:00
Jiaming Yuan
001503186c
Rewrite approx ( #7214 )
...
This PR rewrites the approx tree method to use the codebase from hist for better performance and code sharing.
The rewrite has many benefits:
- Support for both `max_leaves` and `max_depth`.
- Support for `grow_policy`.
- Support for monotonic constraints.
- Support for feature weights.
- Support for easier bin configuration (`max_bin`).
- Support for categorical data.
- Faster performance on most datasets (many times faster).
- Support for prediction cache.
- Significantly better performance for external memory.
- Unites the code base between approx and hist.
2022-01-10 21:15:05 +08:00
Jiaming Yuan
ed95e77752
[jvm-packages] Update JNI header. ( #7550 )
2022-01-10 14:59:40 +08:00
Jiaming Yuan
91c1a1c52f
Fix index type for bitfield. ( #7541 )
2022-01-05 19:23:29 +08:00
Jiaming Yuan
0df2ae63c7
Fix num_boosted_rounds for linear model. ( #7538 )
...
* Add note.
* Fix n boosted rounds.
2022-01-05 03:29:33 +08:00
Jiaming Yuan
28af6f9abb
Remove omp_get_max_threads in gbm and linear. ( #7537 )
...
* Use ctx in gbm.
* Use ctx threads in gbm and linear.
2022-01-05 03:28:52 +08:00
Jiaming Yuan
eea094e1bc
Remove some warnings from clang. ( #7533 )
...
* Unused variable.
* Unnecessary virtual function.
2022-01-05 03:28:21 +08:00
Jiaming Yuan
ec56d5869b
[doc] Include dask examples into doc. ( #7530 )
2022-01-05 03:27:22 +08:00
Jiaming Yuan
54582f641a
[doc] Use cross references in sphinx doc. ( #7522 )
...
* Use cross references instead of URL.
* Fix auto doc for callback.
2022-01-05 03:21:25 +08:00
Jiaming Yuan
eb1efb54b5
Define feature_names_in_. ( #7526 )
...
* Define `feature_names_in_`.
* Raise attribute error if it's not defined.
2022-01-05 01:35:34 +08:00
Jiaming Yuan
8f0a42a266
Initial support for multi-label classification. ( #7521 )
...
* Add support in sklearn classifier.
2022-01-04 23:58:21 +08:00
Jiaming Yuan
68cdbc9c16
Remove omp_get_max_threads in CPU predictor. ( #7519 )
...
This is part of the ongoing effort to remove the dependency on global omp variables.
2022-01-04 22:12:15 +08:00
Ikko Ashimine
5516281881
Fix typo in tree_model.cc ( #7539 )
...
occurance -> occurrence
2021-12-30 20:12:25 +08:00
Randall Britten
a4a0ebb85d
[doc] Lowercase omega for per tree complexity ( #7532 )
...
As suggested in issue #7480
2021-12-29 23:05:54 +08:00
Louis Desreumaux
3886c3dd8f
Remove macro definitions of snprintf and vsnprintf ( #7536 )
2021-12-26 08:05:59 +08:00
Ginko Balboa
29bfa94bb6
Fix external memory with gpu_hist and subsampling combination bug. ( #7481 )
...
Instead of accessing data from the `original_page_`, access the data from the first page of the available batch.
fix #7476
Co-authored-by: jiamingy <jm.yuan@outlook.com>
2021-12-24 11:15:35 +08:00
Jiaming Yuan
7f399eac8b
Use double for GPU Hist node sum. ( #7507 )
2021-12-22 08:41:35 +08:00
Jiaming Yuan
eabec370e4
[R] Fix single sample prediction. ( #7524 )
2021-12-21 14:11:07 +08:00
Bobby Wang
e8c1eb99e4
[jvm-package] Clean up the legacy gpu support tests ( #7523 )
2021-12-21 09:15:51 +08:00
Xiaochang Wu
59bd1ab17e
Skip callback demo test if matplotlib is not installed ( #7520 )
2021-12-19 08:20:38 +08:00
Jiaming Yuan
58a6723eb1
Initial support for multioutput regression. ( #7514 )
...
* Add num target model parameter, which is configured from input labels.
* Change elementwise metric and indexing for weights.
* Add demo.
* Add tests.
2021-12-18 09:28:38 +08:00
Jiaming Yuan
9ab73f737e
Extract Sketch Entry from hist maker. ( #7503 )
...
* Extract Sketch Entry from hist maker.
* Add a new sketch container for sorted inputs.
* Optimize bin search.
2021-12-18 05:36:56 +08:00
Qingyun Wu
b4a1236cfc
[doc] Update the link to the tuning example in FLAML
2021-12-17 14:31:00 +08:00
Bobby Wang
24e25802a7
[jvm-packages] Add Rapids plugin support ( #7491 )
...
* Add GPU pre-processing pipeline.
2021-12-17 13:11:12 +08:00
Jiaming Yuan
5b1161bb64
Convert labels into tensor. ( #7456 )
...
* Add a new ctor to tensor for `initializer_list`.
* Change labels from host device vector to tensor.
* Rename the field from `labels_` to `labels` since it's a public member.
2021-12-17 00:58:35 +08:00
Jiaming Yuan
6f8a4633b7
Fix Python typehint with upgraded mypy. ( #7513 )
2021-12-16 23:08:08 +08:00
Jiaming Yuan
70b12d898a
[dask] Fix ddqdm with empty partition. ( #7510 )
...
* Fix empty partition.
* war.
2021-12-16 20:37:29 +08:00
Jiaming Yuan
a512b4b394
[doc] Promote dask from experimental. [skip ci] ( #7509 )
2021-12-16 14:17:06 +08:00
Jiaming Yuan
05497a9141
[dask] Fix asyncio. ( #7508 )
2021-12-13 01:48:25 +08:00
Jiaming Yuan
01152f89ee
Remove unused parameters. ( #7499 )
2021-12-09 14:24:51 +08:00
Harvey
1864fab592
Minor edits to Parameters doc page. ( #7500 )
...
* bost -> both
* doc improvement
* use original filename
* syntax highlight false
* missed a few highlights
2021-12-07 15:46:44 +08:00
Jiaming Yuan
021f8bf28b
Fix pylint. ( #7498 )
2021-12-07 13:23:30 +08:00