Jiaming Yuan
8a16944664
Fix ranking with quantile dmatrix and group weight. ( #8762 )
2023-02-10 20:32:35 +08:00
Jiaming Yuan
199c421d60
Send default configuration from metric to objective. ( #8760 )
2023-02-09 20:18:07 +08:00
Jiaming Yuan
4ead65a28c
Increase timeout limit for linear. ( #8767 )
2023-02-09 18:20:12 +08:00
Jiaming Yuan
7b3d473593
[doc] Add demo for inference using individual tree. ( #8752 )
2023-02-07 04:40:18 +08:00
Jiaming Yuan
c1786849e3
Use array interface for CSC matrix. ( #8672 )
* Use array interface for CSC matrix.
Use array interface for CSC matrix and align the interface with CSR and dense.
- Fix nthread issue in the R package DMatrix.
- Unify the behavior of handling `missing` with other inputs.
- Unify the behavior of handling `missing` around R, Python, Java, and Scala DMatrix.
- Expose `num_non_missing` to the JVM interface.
- Deprecate old CSR and CSC constructors.
2023-02-05 01:59:46 +08:00
BenEfrati
213b5602d9
Add sample_weight to eval_metric ( #8706 )
2023-02-05 00:06:38 +08:00
Jiaming Yuan
0e61ba57d6
Fix GPU L1 error. ( #8749 )
2023-02-04 03:02:00 +08:00
Jiaming Yuan
1325ba9251
Support primitive types of pyarrow-backed pandas dataframe. ( #8653 )
Categorical data (dictionary) is not supported at the moment.
2023-01-30 17:53:29 +08:00
James Lamb
96e6b6beba
[ci] remove unused imports in tests ( #8707 )
2023-01-25 14:10:29 +08:00
Jiaming Yuan
31b9cbab3d
Make sure input numpy array is aligned. ( #8690 )
- use `np.require` to specify that the alignment is required.
- scipy csr as well.
- validate input pointer in `ArrayInterface`.
2023-01-18 08:12:13 +08:00
Jiaming Yuan
247946a875
Cache transformed in QuantileDMatrix for efficiency. ( #8666 )
2023-01-17 06:02:40 +08:00
Jiaming Yuan
d6018eb4b9
Remove all use of DeviceQuantileDMatrix. ( #8665 )
2023-01-17 00:04:10 +08:00
Jiaming Yuan
badeff1d74
Init estimation for regression. ( #8272 )
2023-01-11 02:04:56 +08:00
Jiaming Yuan
1b58d81315
[doc] Document Python inputs. ( #8643 )
2023-01-10 15:39:32 +08:00
Jiaming Yuan
e68a152d9e
Do not return internal value for get_params. ( #8634 )
2023-01-05 17:48:26 +08:00
Jiaming Yuan
6eaddaa9c3
[CI] Fix CI with updated dependencies. ( #8631 )
* [CI] Fix CI with updated dependencies.
- Fix jvm package get iris.
* Skip SHAP test for now.
* Revert "Skip SHAP test for now."
This reverts commit 9aa28b4d8aee53fa95d92d2a879c6783ff4b2faa.
* Catch all exceptions.
2023-01-03 21:04:04 -08:00
Jiaming Yuan
f6effa1734
Support Series and Python primitives in inplace_predict and QDM ( #8547 )
2022-12-17 00:15:15 +08:00
Rong Ou
42e6fbb0db
Fix sklearn test that calls a removed field ( #8579 )
2022-12-09 13:06:44 -08:00
Jiaming Yuan
deb3edf562
Support list and tuple for QDM. ( #8542 )
2022-12-10 01:14:44 +08:00
Jiaming Yuan
d666ba775e
Support all pandas nullable integer types. ( #8480 )
- Enumerate all pandas integer types.
- Tests for `None`, `nan`, and `pd.NA`
2022-11-28 22:38:16 +08:00
Jiaming Yuan
f2209c1fe4
Don't shuffle columns in categorical tests. ( #8446 )
2022-11-28 20:28:06 +08:00
Jiaming Yuan
8f97c92541
Support half type for pandas. ( #8481 )
2022-11-24 12:47:40 +08:00
Jiaming Yuan
e07245f110
Take datatable as row major input. ( #8472 )
* Take datatable as row major input.
Try to avoid a transform with dense table.
2022-11-24 09:20:13 +08:00
Jiaming Yuan
0d3da9869c
Require isort on all Python files. ( #8420 )
2022-11-08 12:59:06 +08:00
Rong Ou
99fa8dad2d
Add back xgboost.rabit for backwards compatibility ( #8408 )
* Add back xgboost.rabit for backwards compatibility
* fix my errors
* Fix lint
* Use FutureWarning
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-11-01 21:47:41 -07:00
Jiaming Yuan
a408c34558
Update JSON parser demo with categorical feature. ( #8401 )
- Parse categorical features in the Python example.
- Add tests.
- Update document.
2022-10-28 20:57:43 +08:00
Jiaming Yuan
cfd2a9f872
Extract dask and spark test into distributed test. ( #8395 )
- Move test files.
- Run spark and dask separately to prevent conflicts.
- Gather common code into the testing module.
2022-10-28 16:24:32 +08:00
Jiaming Yuan
cf70864fa3
Move Python testing utilities into xgboost module. ( #8379 )
- Add typehints.
- Fixes for pylint.
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-10-26 16:56:11 +08:00
Jiaming Yuan
c884b9e888
Validate features for inplace predict. ( #8359 )
2022-10-19 23:05:36 +08:00
Bobby Wang
76f95a6667
[pyspark] Filter out the unsupported train parameters ( #8355 )
2022-10-18 23:26:02 +08:00
Jiaming Yuan
3901f5d9db
[pyspark] Cleanup data processing. ( #8344 )
* Enable additional combinations of ctor parameters.
* Unify procedures for QuantileDMatrix and DMatrix.
2022-10-18 14:56:23 +08:00
Jiaming Yuan
2176e511fc
Disable pytest-timeout for now. ( #8348 )
2022-10-17 23:06:10 +08:00
Jiaming Yuan
97a5b088a5
[pyspark] Use quantile dmatrix. ( #8284 )
2022-10-12 20:38:53 +08:00
Rory Mitchell
ce0382dcb0
[CI] Refactor tests to reduce CI time. ( #8312 )
2022-10-12 11:32:06 +02:00
Jiaming Yuan
5545c49cfc
Require keyword args for data iterator. ( #8327 )
2022-10-10 17:47:13 +08:00
Rong Ou
668b8a0ea4
[Breaking] Switch from rabit to the collective communicator ( #8257 )
* Switch from rabit to the collective communicator
* fix size_t specialization
* really fix size_t
* try again
* add include
* more include
* fix lint errors
* remove rabit includes
* fix pylint error
* return dict from communicator context
* fix communicator shutdown
* fix dask test
* reset communicator mocklist
* fix distributed tests
* do not save device communicator
* fix jvm gpu tests
* add python test for federated communicator
* Update gputreeshap submodule
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-10-05 14:39:01 -08:00
Jiaming Yuan
55cf24cc32
Obtain CSR matrix from DMatrix. ( #8269 )
2022-09-29 20:41:43 +08:00
Jiaming Yuan
fcab51aa82
Support more pandas nullable types ( #8262 )
- Float32/64
- Category.
2022-09-27 01:59:50 +08:00
WeichenXu
ff71c69adf
[pyspark] Add validation for param 'early_stopping_rounds' and 'validation_indicator_col' ( #8250 )
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-09-26 17:43:03 +08:00
Jiaming Yuan
b791446623
Initial support for IPv6 ( #8225 )
- Merge rabit socket into XGBoost.
- Dask interface support.
- Add test to the socket.
2022-09-21 18:06:50 +08:00
Jiaming Yuan
fffb1fca52
Calculate base_score based on input labels for mae. ( #8107 )
Fit an intercept as base score for abs loss.
2022-09-20 20:53:54 +08:00
Bobby Wang
4f42aa5f12
[pyspark] make the model saved by pyspark compatible ( #8219 )
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2022-09-20 16:43:49 +08:00
Bobby Wang
520586ffa7
[pyspark] fix empty data issue when constructing DMatrix ( #8245 )
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-09-20 16:43:20 +08:00
Jiaming Yuan
2e63af6117
Mitigate flaky data iter test. ( #8244 )
- Reduce the number of batches.
- Verify labels.
2022-09-14 17:54:14 +08:00
Rong Ou
a2686543a9
Common interface for collective communication ( #8057 )
* implement broadcast for federated communicator
* implement allreduce
* add communicator factory
* add device adapter
* add device communicator to factory
* add rabit communicator
* add rabit communicator to the factory
* add nccl device communicator
* add synchronize to device communicator
* add back print and getprocessorname
* add python wrapper and c api
* clean up types
* fix non-gpu build
* try to fix ci
* fix std::size_t
* portable string compare ignore case
* c style size_t
* fix lint errors
* cross platform setenv
* fix memory leak
* fix lint errors
* address review feedback
* add python test for rabit communicator
* fix failing gtest
* use json to configure communicators
* fix lint error
* get rid of factories
* fix cpu build
* fix include
* fix python import
* don't export collective.py yet
* skip collective communicator pytest on windows
* add review feedback
* update documentation
* remove mpi communicator type
* fix tests
* shutdown the communicator separately
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2022-09-12 15:21:12 -07:00
Jiaming Yuan
b5eb36f1af
Add max_cat_threshold to GPU and handle missing cat values. ( #8212 )
2022-09-07 00:57:51 +08:00
WeichenXu
d03794ce7a
[pyspark] Add param validation for "objective" and "eval_metric" param, and remove invalid booster params ( #8173 )
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-08-24 15:29:43 +08:00
WeichenXu
f4628c22a4
[pyspark] Implement SparkXGBRanker estimator ( #8172 )
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-08-23 02:35:19 +08:00
WeichenXu
53d2a733b0
[pyspark] Make Xgboost estimator support using sparse matrix as optimization ( #8145 )
...
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-08-19 01:57:28 +08:00
Jiaming Yuan
36e7c5364d
[dask] Deterministic rank assignment. ( #8018 )
2022-08-11 19:17:58 +08:00