Jiaming Yuan
deb3edf562
Support list and tuple for QDM. ( #8542 )
2022-12-10 01:14:44 +08:00
Bobby Wang
40a1a2ffa8
[pyspark] check use_qdm across all the workers ( #8496 )
2022-12-08 18:09:17 +08:00
Gianfrancesco Angelini
5540019373
feat(py, plot_importance): + values_format as arg ( #8540 )
2022-12-08 00:47:28 +08:00
Matthew Rocklin
b7ffdcdbb9
Properly await async method client.wait_for_workers ( #8558 )
...
* Properly await async method client.wait_for_workers
* ignore mypy error.
Co-authored-by: jiamingy <jm.yuan@outlook.com>
2022-12-07 21:49:30 +08:00
Bobby Wang
8e41ad24f5
[pyspark] sort qid for SparkRanker ( #8497 )
...
* [pyspark] sort qid for SparkRanker
* resolve comments
2022-12-01 16:40:35 -08:00
Jiaming Yuan
d666ba775e
Support all pandas nullable integer types. ( #8480 )
...
- Enumerate all pandas integer types.
- Tests for `None`, `nan`, and `pd.NA`
2022-11-28 22:38:16 +08:00
Jiaming Yuan
f2209c1fe4
Don't shuffle columns in categorical tests. ( #8446 )
2022-11-28 20:28:06 +08:00
WeichenXu
67ea1c3435
[pyspark] Make QDM optional based on cuDF check ( #8471 )
2022-11-27 14:58:54 +08:00
Jiaming Yuan
8f97c92541
Support half type for pandas. ( #8481 )
2022-11-24 12:47:40 +08:00
Jiaming Yuan
e07245f110
Take datatable as row major input. ( #8472 )
...
* Take datatable as row major input.
Try to avoid a transform with dense table.
2022-11-24 09:20:13 +08:00
Jiaming Yuan
9dd8d70f0e
Fix mypy errors. ( #8444 )
2022-11-09 13:19:11 +08:00
Jiaming Yuan
0d3da9869c
Require isort on all Python files. ( #8420 )
2022-11-08 12:59:06 +08:00
James Lamb
bf8de227a9
[CI] remove unused import in python tests ( #8409 )
2022-11-03 22:27:25 +08:00
Rong Ou
99fa8dad2d
Add back xgboost.rabit for backwards compatibility ( #8408 )
...
* Add back xgboost.rabit for backwards compatibility
* fix mypy errors
* Fix lint
* Use FutureWarning
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-11-01 21:47:41 -07:00
Jiaming Yuan
a408c34558
Update JSON parser demo with categorical feature. ( #8401 )
...
- Parse categorical features in the Python example.
- Add tests.
- Update document.
2022-10-28 20:57:43 +08:00
Jiaming Yuan
cfd2a9f872
Extract dask and spark test into distributed test. ( #8395 )
...
- Move test files.
- Run spark and dask separately to prevent conflicts.
- Gather common code into the testing module.
2022-10-28 16:24:32 +08:00
Jiaming Yuan
f73520bfff
Bump development version to 2.0. ( #8390 )
2022-10-28 15:21:19 +08:00
Yizhi Liu
5699f60a88
Type fix for WebAssembly: use bst_ulong instead of size_t for ncol in CSR conversion. ( #8369 )
2022-10-26 19:21:45 +08:00
Jiaming Yuan
cf70864fa3
Move Python testing utilities into xgboost module. ( #8379 )
...
- Add typehints.
- Fixes for pylint.
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-10-26 16:56:11 +08:00
Jiaming Yuan
d0b99bdd95
[pyspark] Add type hint to basic utilities. ( #8375 )
2022-10-25 17:26:25 +08:00
Jiaming Yuan
c884b9e888
Validate features for inplace predict. ( #8359 )
2022-10-19 23:05:36 +08:00
luca-s
c47c71e34f
XGBRanker documentation: few clarifications ( #8356 )
2022-10-19 01:54:14 +08:00
Bobby Wang
76f95a6667
[pyspark] Filter out the unsupported train parameters ( #8355 )
2022-10-18 23:26:02 +08:00
Jiaming Yuan
3901f5d9db
[pyspark] Cleanup data processing. ( #8344 )
...
* Enable additional combinations of ctor parameters.
* Unify procedures for QuantileDMatrix and DMatrix.
2022-10-18 14:56:23 +08:00
luca-s
5647fc6542
XGBRanker documentation: missing default objective ( #8347 )
2022-10-18 10:43:29 +08:00
Rong Ou
8f3dee58be
Speed up tests with federated learning enabled ( #8350 )
...
* Speed up tests with federated learning enabled
* Re-enable timeouts
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-10-17 15:17:04 -07:00
Jiaming Yuan
2176e511fc
Disable pytest-timeout for now. ( #8348 )
2022-10-17 23:06:10 +08:00
Jiaming Yuan
fcddbc9264
Fix incorrect function name. ( #8346 )
2022-10-17 19:28:20 +08:00
Rong Ou
80e10e02ab
Avoid blank lines with federated training ( #8342 )
2022-10-14 14:55:01 +08:00
Jiaming Yuan
97a5b088a5
[pyspark] Use quantile dmatrix. ( #8284 )
2022-10-12 20:38:53 +08:00
Jiaming Yuan
c68684ff4c
Update parameter for categorical feature. ( #8285 )
2022-10-10 19:48:29 +08:00
Jiaming Yuan
5545c49cfc
Require keyword args for data iterator. ( #8327 )
2022-10-10 17:47:13 +08:00
Rong Ou
668b8a0ea4
[Breaking] Switch from rabit to the collective communicator ( #8257 )
...
* Switch from rabit to the collective communicator
* fix size_t specialization
* really fix size_t
* try again
* add include
* more include
* fix lint errors
* remove rabit includes
* fix pylint error
* return dict from communicator context
* fix communicator shutdown
* fix dask test
* reset communicator mocklist
* fix distributed tests
* do not save device communicator
* fix jvm gpu tests
* add python test for federated communicator
* Update gputreeshap submodule
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-10-05 14:39:01 -08:00
Jiaming Yuan
e47b3a3da3
Upgrade mypy. ( #8302 )
...
Some breaking changes were made in mypy.
2022-10-05 14:31:59 +08:00
Jiaming Yuan
97c3a80a34
Add C document to sphinx, fix arrow. ( #8300 )
...
- Group C API.
- Add C API sphinx doc.
- Consistent use of `OptionalArg` and the parameter name `config`.
- Remove call to deprecated functions in demo.
- Fix some formatting errors.
- Add links to C examples in the document (only visible with doxygen pages).
- Fix arrow.
2022-10-05 09:52:15 +08:00
Jiaming Yuan
55cf24cc32
Obtain CSR matrix from DMatrix. ( #8269 )
2022-09-29 20:41:43 +08:00
Bobby Wang
c91fed083d
[pyspark] disable repartition_random_shuffle by default ( #8283 )
2022-09-29 10:50:51 +08:00
Jiaming Yuan
6925b222e0
Fix mixed types with cuDF. ( #8280 )
2022-09-29 00:57:52 +08:00
Jiaming Yuan
f835368bcf
Mark next release as 1.7 instead of 2.0 ( #8281 )
2022-09-28 14:33:37 +08:00
Jiaming Yuan
fcab51aa82
Support more pandas nullable types ( #8262 )
...
- Float32/64
- Category.
2022-09-27 01:59:50 +08:00
WeichenXu
ff71c69adf
[pyspark] Add validation for param 'early_stopping_rounds' and 'validation_indicator_col' ( #8250 )
...
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-09-26 17:43:03 +08:00
WeichenXu
ab342af242
[pyspark] Fix xgboost spark estimator dataset repartition issues ( #8231 )
2022-09-22 21:31:41 +08:00
Jiaming Yuan
b791446623
Initial support for IPv6 ( #8225 )
...
- Merge rabit socket into XGBoost.
- Dask interface support.
- Add test to the socket.
2022-09-21 18:06:50 +08:00
Bobby Wang
4f42aa5f12
[pyspark] make the model saved by pyspark compatible ( #8219 )
...
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2022-09-20 16:43:49 +08:00
Bobby Wang
520586ffa7
[pyspark] fix empty data issue when constructing DMatrix ( #8245 )
...
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-09-20 16:43:20 +08:00
Jiaming Yuan
bdf265076d
Make QuantileDMatrix default to sklearn estimators. ( #8220 )
2022-09-13 13:52:19 +08:00
Rong Ou
a2686543a9
Common interface for collective communication ( #8057 )
...
* implement broadcast for federated communicator
* implement allreduce
* add communicator factory
* add device adapter
* add device communicator to factory
* add rabit communicator
* add rabit communicator to the factory
* add nccl device communicator
* add synchronize to device communicator
* add back print and getprocessorname
* add python wrapper and c api
* clean up types
* fix non-gpu build
* try to fix ci
* fix std::size_t
* portable string compare ignore case
* c style size_t
* fix lint errors
* cross platform setenv
* fix memory leak
* fix lint errors
* address review feedback
* add python test for rabit communicator
* fix failing gtest
* use json to configure communicators
* fix lint error
* get rid of factories
* fix cpu build
* fix include
* fix python import
* don't export collective.py yet
* skip collective communicator pytest on windows
* add review feedback
* update documentation
* remove mpi communicator type
* fix tests
* shutdown the communicator separately
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2022-09-12 15:21:12 -07:00
Bobby Wang
7ee10e3dbd
[pyspark] Cleanup the comments ( #8217 )
2022-09-05 16:20:12 +08:00
Jiaming Yuan
ada4a86d1c
Fix dask interface with latest cupy. ( #8210 )
2022-09-03 03:10:43 +08:00
Rong Ou
b78bc734d9
Fix dask.py lint error ( #8216 )
2022-09-02 16:30:01 +08:00