Philip Hyunsu Cho
f6e6d0b2c0
[CI] Build Python wheels for MacOS (x86_64 and arm64) ( #7621 )
...
* Build Python wheels for OSX (x86_64 and arm64)
* Use Conda's libomp when running Python tests
* fix
* Add comment to explain CIBW_TARGET_OSX_ARM64
* Update release script
* Add comments in build_python_wheels.sh
* Document wheel pipeline
2022-02-02 17:35:48 -08:00
Philip Hyunsu Cho
271a7c5d43
[Doc] fix typo in install doc ( #7623 )
2022-01-31 13:35:56 -08:00
Philip Hyunsu Cho
c621775f34
Replace all uses of deprecated function sklearn.datasets.load_boston ( #7373 )
...
* Replace all uses of deprecated function sklearn.datasets.load_boston
* More renaming
* Fix bad name
* Update assertion
* Fix n boosted rounds.
* Avoid over regularization.
* Rebase.
* Avoid over regularization.
* Whac-a-mole
Co-authored-by: fis <jm.yuan@outlook.com>
2022-01-30 04:27:57 -08:00
Philip Hyunsu Cho
b4340abf56
Add special handling for multi:softmax in sklearn predict ( #7607 )
...
* Add special handling for multi:softmax in sklearn predict
* Add test coverage
2022-01-29 15:54:49 -08:00
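One plausible reading of the special handling in #7607, sketched without XGBoost itself: the `multi:softmax` objective already outputs one class index per row, whereas `multi:softprob` outputs one probability per class, so only the latter needs an argmax. The function name `to_class_labels` is hypothetical and is not taken from the commit.

```python
def to_class_labels(raw, objective):
    """Turn raw predictions into integer class labels (illustrative only)."""
    if objective == "multi:softmax":
        # Each entry is already a class index, usually stored as a float.
        return [int(v) for v in raw]
    # Otherwise assume one row of per-class scores per sample and take the argmax.
    return [max(range(len(row)), key=row.__getitem__) for row in raw]
```

Applying an argmax to `multi:softmax` output would be a bug, because the input is a flat list of labels rather than a matrix of scores.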
david-cortes
7f738e7f6f
[R] Accept CSR data for predictions ( #7615 )
2022-01-30 00:54:57 +08:00
Michael Chirico
549bd419bb
use exit hook to remove temp file ( #7611 )
...
This guarantees the removal will trigger for unexpected early exits
2022-01-29 16:06:52 +08:00
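The idea behind #7611, arranging the cleanup so that an unexpected early exit still removes the file, can be sketched in Python. This is an analogue only: the commit's own code is not shown in this log, and `process_with_temp_file` and `work` are hypothetical names. Here `try`/`finally` plays the role of the exit hook.

```python
import os
import tempfile


def process_with_temp_file(work):
    """Run work(path) on a fresh temp file and always delete the file afterwards."""
    fd, path = tempfile.mkstemp()
    os.close(fd)  # only the path is needed; close the descriptor right away
    try:
        return work(path)
    finally:
        # Runs on a normal return and on any exception raised by work().
        if os.path.exists(path):
            os.remove(path)
```

Placing the removal in `finally` rather than at the end of the function body is what makes it fire when `work` raises.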
Philip Hyunsu Cho
f21301c749
[Doc] Add instruction to install XGBoost for Apple Silicon using Conda ( #7612 )
2022-01-28 01:06:39 -08:00
Jiaming Yuan
81210420c6
Remove omp_get_max_threads ( #7608 )
...
This is the last PR for removing the omp global variable.
* Add context object to the `DMatrix`. This bridges `DMatrix` with https://github.com/dmlc/xgboost/issues/7308 .
* Require context to be available at the construction time of booster.
* Add `n_threads` support for R csc DMatrix constructor.
* Remove `omp_get_max_threads` in R glue code.
* Remove threading utilities that rely on omp global variable.
2022-01-28 16:09:22 +08:00
Philip Hyunsu Cho
028bdc1740
[R] Fix typo in docstring ( #7606 )
2022-01-26 23:33:25 +08:00
Jiaming Yuan
e060519d4f
Avoid regenerating the gradient index for approx. ( #7591 )
2022-01-26 21:41:30 +08:00
Jiaming Yuan
5d7818e75d
Remove omp_get_max_threads in tree updaters. ( #7590 )
2022-01-26 19:55:47 +08:00
Jiaming Yuan
24789429fd
Support latest pandas Index type. ( #7595 )
2022-01-26 18:20:10 +08:00
AJ Schmidt
511805c981
Compress fatbins ( #7601 )
...
* compress CUDA device code
Co-authored-by: ptaylor <paul.e.taylor@me.com>
2022-01-25 18:30:59 +08:00
Jiaming Yuan
6967ef7267
Remove omp_get_max_threads in objective. ( #7589 )
2022-01-24 04:35:49 +08:00
Jiaming Yuan
5817840858
Remove omp_get_max_threads in data. ( #7588 )
2022-01-24 02:44:07 +08:00
Jiaming Yuan
f84291c1e1
Fix max_cat_to_onehot doc annotation [skip ci] ( #7592 )
2022-01-23 16:33:23 +08:00
Jiaming Yuan
d262503781
[R] Implement new save raw in R. ( #7571 )
2022-01-22 20:55:47 +08:00
Jiaming Yuan
ef4dae4c0e
[dask] Add scheduler address to dask config. ( #7581 )
...
- Add user configuration.
- Bring back the logic of using the scheduler address from dask. This was removed when we were trying to support GKE; now we bring it back and let xgboost try it if the direct guess or the host IP from user config fails.
2022-01-22 01:56:32 +08:00
Jiaming Yuan
5ddd4a9d06
Small cleanup to tests. ( #7585 )
...
* Use random port in dask tests to avoid warnings for occupied port.
* Increase the difficulty of AUC tests.
2022-01-21 06:26:57 +00:00
Philip Hyunsu Cho
9fd510faa5
[CI] Clarify steps for publishing artifacts to Maven Central ( #7582 )
2022-01-20 14:23:07 -08:00
Jiaming Yuan
529cf8a54a
Configure cub version automatically. ( #7579 )
...
Note that when the cub bundled with CUDA is used, XGBoost checks the input size
instead of using an internal cub function to accept inputs larger than the maximum integer.
2022-01-20 19:49:26 +08:00
Jiaming Yuan
ac7a36367c
[jvm-packages] Implement new save_raw in jvm-packages. ( #7570 )
...
* New `toByteArray` that accepts a parameter for format.
2022-01-19 16:00:14 +08:00
Jiaming Yuan
b4ec1682c6
Update document for multi output and categorical. ( #7574 )
...
* Group together categorical related parameters.
* Update documents about multioutput and categorical.
2022-01-19 04:35:17 +08:00
Jiaming Yuan
dac9eb13bd
Implement new save_raw in Python. ( #7572 )
...
* Expose the new C API function to Python.
* Remove old document and helper script.
* Small optimization to the `save_raw` and Json ctors.
2022-01-19 02:27:51 +08:00
Jiaming Yuan
9f20a3315e
Test with latest numpy. ( #7573 )
2022-01-19 00:46:23 +08:00
Jiaming Yuan
bb56bb9a13
Fix merge conflict. ( #7577 )
2022-01-18 23:01:34 +08:00
Jiaming Yuan
cc06fab9a7
Support distributed CPU env for categorical data. ( #7575 )
...
* Add support for cat data in sketch allreduce.
* Share tests between CPU and GPU.
2022-01-18 21:56:07 +08:00
Jiaming Yuan
deab0e32ba
Validate out of range categorical value. ( #7576 )
...
* Use float in CPU categorical set to preserve the input value.
* Check out of range values.
2022-01-18 20:16:19 +08:00
Jiaming Yuan
d6ea5cc1ed
Cover approx tree method for categorical data tests. ( #7569 )
...
* Add tree to df tests.
* Add plotting tests.
* Add histogram tests.
2022-01-16 11:31:40 +08:00
Jiaming Yuan
465dc63833
Fix tree param feature type. ( #7565 )
2022-01-16 04:46:29 +08:00
Jiaming Yuan
a1bcd33a3b
[breaking] Change internal model serialization to UBJSON. ( #7556 )
...
* Use typed array for models.
* Change the memory snapshot format.
* Add new C API for saving to raw format.
2022-01-16 02:11:53 +08:00
Jiaming Yuan
13b0fa4b97
Implement get_group. ( #7564 )
2022-01-16 02:07:42 +08:00
Jiaming Yuan
52277cc3da
Rename build info function to be consistent with rest of the API. ( #7553 )
2022-01-14 00:39:28 +08:00
Jiaming Yuan
e94b766310
Fix early stopping with linear model. ( #7554 )
2022-01-13 21:53:06 +08:00
Jiaming Yuan
e5e47c3c99
Clarify the behavior of invalid categorical value handling. ( #7529 )
2022-01-13 16:11:52 +08:00
Philip Hyunsu Cho
20c0d60ac7
Restore functionality of max_depth=0 in hist ( #7551 )
...
* Restore functionality of max_depth=0 in hist
* Add test case
2022-01-11 01:37:44 +08:00
Jiaming Yuan
2db808021d
Silence some warnings for unused variables. ( #7548 )
2022-01-11 01:16:26 +08:00
Jiaming Yuan
c635d4c46a
Implement ubjson. ( #7549 )
...
* Implement ubjson.
This is a partial implementation of UBJSON with support for typed arrays. Some missing
features are `f64`, typed object, and the no-op.
2022-01-10 23:24:23 +08:00
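To illustrate what a UBJSON typed array looks like on the wire, here is a minimal Python sketch that encodes and decodes a strongly typed `float32` array as laid out in the UBJSON specification: array marker `[`, type marker `$d`, count marker `#` with an `int32` count, then big-endian payload. It is not XGBoost's implementation; the function names are hypothetical, and always using an `int32` count is a simplification chosen for this sketch.

```python
import struct

_HEADER = b"[$d#l"  # array start, type float32, count follows as int32


def encode_f32_array(values):
    """Encode a sequence of floats as a UBJSON typed float32 array."""
    n = len(values)
    if n > 0x7FFFFFFF:
        raise ValueError("this sketch only handles counts that fit in int32")
    # UBJSON stores numeric values in big-endian byte order.
    return _HEADER + struct.pack(">i", n) + struct.pack(f">{n}f", *values)


def decode_f32_array(buf):
    """Decode a buffer produced by encode_f32_array back into a list."""
    if buf[:5] != _HEADER:
        raise ValueError("not a typed float32 array with an int32 count")
    (n,) = struct.unpack(">i", buf[5:9])
    return list(struct.unpack(f">{n}f", buf[9:9 + 4 * n]))
```

Because the element type and count are stated once up front, the payload carries no per-element markers and no closing `]`, which is what makes typed arrays compact for model weights.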
Jiaming Yuan
001503186c
Rewrite approx ( #7214 )
...
This PR rewrites the approx tree method to use the codebase from hist for better performance and code sharing.
The rewrite has many benefits:
- Support for both `max_leaves` and `max_depth`.
- Support for `grow_policy`.
- Support for monotonic constraints.
- Support for feature weights.
- Support for easier bin configuration (`max_bin`).
- Support for categorical data.
- Faster performance on most datasets (many times faster).
- Support for prediction cache.
- Significantly better performance for external memory.
- Unites the code base between approx and hist.
2022-01-10 21:15:05 +08:00
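As a rough illustration of what the rewrite in #7214 enables, the sketch below combines several of the listed options in one parameter set for the `approx` tree method. The parameter names are existing XGBoost training parameters; the values are arbitrary placeholders rather than tuned recommendations, and `train_approx` is a hypothetical helper that assumes the `xgboost` package is installed.

```python
# Placeholder values; only the parameter names are meaningful here.
approx_params = {
    "tree_method": "approx",
    "grow_policy": "lossguide",  # grow_policy support
    "max_leaves": 31,            # max_leaves and max_depth can both be set
    "max_depth": 6,
    "max_bin": 128,              # bin configuration
    "objective": "reg:squarederror",
}


def train_approx(X, y, num_boost_round=10):
    """Train a booster with the parameters above (requires xgboost)."""
    import xgboost as xgb  # imported lazily so the dict is usable on its own

    dtrain = xgb.DMatrix(X, label=y)
    return xgb.train(approx_params, dtrain, num_boost_round=num_boost_round)
```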
Jiaming Yuan
ed95e77752
[jvm-packages] Update JNI header. ( #7550 )
2022-01-10 14:59:40 +08:00
Jiaming Yuan
91c1a1c52f
Fix index type for bitfield. ( #7541 )
2022-01-05 19:23:29 +08:00
Jiaming Yuan
0df2ae63c7
Fix num_boosted_rounds for linear model. ( #7538 )
...
* Add note.
* Fix n boosted rounds.
2022-01-05 03:29:33 +08:00
Jiaming Yuan
28af6f9abb
Remove omp_get_max_threads in gbm and linear. ( #7537 )
...
* Use ctx in gbm.
* Use ctx threads in gbm and linear.
2022-01-05 03:28:52 +08:00
Jiaming Yuan
eea094e1bc
Remove some warnings from clang. ( #7533 )
...
* Unused variable.
* Unnecessary virtual function.
2022-01-05 03:28:21 +08:00
Jiaming Yuan
ec56d5869b
[doc] Include dask examples into doc. ( #7530 )
2022-01-05 03:27:22 +08:00
Jiaming Yuan
54582f641a
[doc] Use cross references in sphinx doc. ( #7522 )
...
* Use cross references instead of URL.
* Fix auto doc for callback.
2022-01-05 03:21:25 +08:00
Jiaming Yuan
eb1efb54b5
Define feature_names_in_. ( #7526 )
...
* Define `feature_names_in_`.
* Raise attribute error if it's not defined.
2022-01-05 01:35:34 +08:00
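The behaviour described in #7526, raising an attribute error when `feature_names_in_` is not defined, can be sketched with a plain Python property. `TinyEstimator` is a hypothetical stand-in, not XGBoost's class; the point is that raising `AttributeError` from the property makes `hasattr` report the attribute as absent.

```python
class TinyEstimator:
    """Hypothetical sklearn-style estimator used only for illustration."""

    def __init__(self):
        self._feature_names = None

    def fit(self, X, feature_names=None):
        # Remember the names only when the caller supplied them.
        self._feature_names = (
            list(feature_names) if feature_names is not None else None
        )
        return self

    @property
    def feature_names_in_(self):
        if self._feature_names is None:
            raise AttributeError(
                "feature_names_in_ is defined only when feature names are available"
            )
        return self._feature_names
```

With this shape, callers can probe with `hasattr(est, "feature_names_in_")` instead of catching an exception themselves.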
Jiaming Yuan
8f0a42a266
Initial support for multi-label classification. ( #7521 )
...
* Add support in sklearn classifier.
2022-01-04 23:58:21 +08:00
Jiaming Yuan
68cdbc9c16
Remove omp_get_max_threads in CPU predictor. ( #7519 )
...
This is part of the ongoing effort to remove the dependency on global omp variables.
2022-01-04 22:12:15 +08:00
Ikko Ashimine
5516281881
Fix typo in tree_model.cc ( #7539 )
...
occurance -> occurrence
2021-12-30 20:12:25 +08:00