Rong Ou
668b8a0ea4
[Breaking] Switch from rabit to the collective communicator ( #8257 )
...
* Switch from rabit to the collective communicator
* fix size_t specialization
* really fix size_t
* try again
* add include
* more include
* fix lint errors
* remove rabit includes
* fix pylint error
* return dict from communicator context
* fix communicator shutdown
* fix dask test
* reset communicator mocklist
* fix distributed tests
* do not save device communicator
* fix jvm gpu tests
* add python test for federated communicator
* Update gputreeshap submodule
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-10-05 14:39:01 -08:00
Jiaming Yuan
e47b3a3da3
Upgrade mypy. ( #8302 )
...
Some breaking changes were made in mypy.
2022-10-05 14:31:59 +08:00
Philip Hyunsu Cho
b2bbf49015
Additional improvements to CI ( #8303 )
...
* Wait until budget check is complete
* Ensure that multi-GPU tests run for the master branch
* Fix
2022-10-04 03:03:38 -08:00
Rory Mitchell
d686bf52a6
Reduce time for some multi-gpu tests ( #8288 )
...
* Faster dask tests
* Reuse AllReducer objects in tests.
* Faster boost from prediction tests.
* Use rmm dask fixture.
* Speed up dask demo.
* mypy
* Format with black.
* mypy
* Clang-tidy
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-10-04 02:49:33 -08:00
Philip Hyunsu Cho
ca0547bb65
[CI] Use RAPIDS 22.10 ( #8298 )
...
* [CI] Use RAPIDS 22.10
* Store CUDA and RAPIDS versions in one place
* Fix
* Add missing #include
* Update gputreeshap submodule
* Fix
* Remove outdated distributed tests
2022-10-03 23:18:07 -08:00
Philip Hyunsu Cho
37886a5dff
[CI] Document the use of Docker wrapper script ( #8297 )
...
* [CI] Document the use of Docker wrapper script
* Grammer fixes
* Document buildkite pipeline defs
* tests/buildkite/*.sh isn't meant to run locally
2022-10-02 12:45:00 -07:00
Philip Hyunsu Cho
9af99760d4
Various CI savings ( #8291 )
2022-09-30 05:42:56 -07:00
Jiaming Yuan
299e5000a4
Fix buildkite label. ( #8287 )
2022-09-29 17:33:19 -07:00
Jiaming Yuan
55cf24cc32
Obtain CSR matrix from DMatrix. ( #8269 )
2022-09-29 20:41:43 +08:00
Philip Hyunsu Cho
b14c44ee5e
[CI] Put Multi-GPU test suites in separate pipeline ( #8286 )
...
* [CI] Put Multi-GPU test suites in separate pipeline
* Avoid unset var error in Bash
2022-09-29 00:41:48 -08:00
Jiaming Yuan
6925b222e0
Fix mixed types with cuDF. ( #8280 )
2022-09-29 00:57:52 +08:00
Jiaming Yuan
6d1452074a
Remove MGPU cpp tests. ( #8276 )
...
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-09-27 21:18:23 +08:00
Jiaming Yuan
fcab51aa82
Support more pandas nullable types ( #8262 )
...
- Float32/64
- Category.
2022-09-27 01:59:50 +08:00
Rory Mitchell
8f77677193
Use quantised gradients in gpu_hist histograms ( #8246 )
2022-09-26 17:35:35 +02:00
WeichenXu
ff71c69adf
[pyspark] Add validation for param 'early_stopping_rounds' and 'validation_indicator_col' ( #8250 )
...
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-09-26 17:43:03 +08:00
Jiaming Yuan
3fd331f8f2
Add checks to C pointer arguments. ( #8254 )
2022-09-22 19:02:22 +08:00
Dmitry Razdoburdin
eb7bbee2c9
Optional by-column histogram build. ( #8233 )
...
Co-authored-by: dmitry.razdoburdin <drazdobu@jfldaal005.jf.intel.com>
2022-09-22 05:16:13 +08:00
Jiaming Yuan
b791446623
Initial support for IPv6 ( #8225 )
...
- Merge rabit socket into XGBoost.
- Dask interface support.
- Add test to the socket.
2022-09-21 18:06:50 +08:00
Jiaming Yuan
fffb1fca52
Calculate base_score based on input labels for mae. ( #8107 )
...
Fit an intercept as base score for abs loss.
2022-09-20 20:53:54 +08:00
Bobby Wang
4f42aa5f12
[pyspark] make the model saved by pyspark compatible ( #8219 )
...
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2022-09-20 16:43:49 +08:00
Bobby Wang
520586ffa7
[pyspark] fix empty data issue when constructing DMatrix ( #8245 )
...
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-09-20 16:43:20 +08:00
Philip Hyunsu Cho
70df36c99c
[CI] Retire Jenkins server ( #8243 )
2022-09-14 08:46:23 -07:00
Jiaming Yuan
2e63af6117
Mitigate flaky data iter test. ( #8244 )
...
- Reduce the number of batches.
- Verify labels.
2022-09-14 17:54:14 +08:00
Rong Ou
a2686543a9
Common interface for collective communication ( #8057 )
...
* implement broadcast for federated communicator
* implement allreduce
* add communicator factory
* add device adapter
* add device communicator to factory
* add rabit communicator
* add rabit communicator to the factory
* add nccl device communicator
* add synchronize to device communicator
* add back print and getprocessorname
* add python wrapper and c api
* clean up types
* fix non-gpu build
* try to fix ci
* fix std::size_t
* portable string compare ignore case
* c style size_t
* fix lint errors
* cross platform setenv
* fix memory leak
* fix lint errors
* address review feedback
* add python test for rabit communicator
* fix failing gtest
* use json to configure communicators
* fix lint error
* get rid of factories
* fix cpu build
* fix include
* fix python import
* don't export collective.py yet
* skip collective communicator pytest on windows
* add review feedback
* update documentation
* remove mpi communicator type
* fix tests
* shutdown the communicator separately
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2022-09-12 15:21:12 -07:00
Jiaming Yuan
bc818316f2
Prepare for improving Windows networking compatibility. ( #8234 )
...
* Prepare for improving Windows networking compatibility.
* Include dmlc filesystem indirectly as dmlc/filesystem.h includes windows.h, which
conflicts with winsock2.h
* Define `NOMINMAX` conditionally.
* Link the winsock library when mysys32 is used.
* Add config file for read the doc.
2022-09-10 15:16:49 +08:00
Philip Hyunsu Cho
23faf656ad
[CI] Don't require manual approval for master branch ( #8235 )
2022-09-08 09:26:22 -08:00
Philip Hyunsu Cho
e888eb2fa9
[CI] Migrate CI pipelines from Jenkins to BuildKite ( #8142 )
...
* [CI] Migrate CI pipelines from Jenkins to BuildKite
* Require manual approval
* Less verbose output when pulling Docker
* Remove us-east-2 from metadata.py
* Add documentation
* Add missing underscore
* Add missing punctuation
* More specific instruction
* Better paragraph structure
2022-09-07 16:29:25 -08:00
Jiaming Yuan
b5eb36f1af
Add max_cat_threshold to GPU and handle missing cat values. ( #8212 )
2022-09-07 00:57:51 +08:00
Jiaming Yuan
441ffc017a
Copy data from Ellpack to GHist. ( #8215 )
2022-09-06 23:05:49 +08:00
WeichenXu
d03794ce7a
[pyspark] Add param validation for "objective" and "eval_metric" param, and remove invalid booster params ( #8173 )
...
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-08-24 15:29:43 +08:00
WeichenXu
f4628c22a4
[pyspark] Implement SparkXGBRanker estimator ( #8172 )
...
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-08-23 02:35:19 +08:00
Philip Hyunsu Cho
35ef8abc27
[CI] Prune unused archs from libnccl ( #8179 )
...
* [CI] Prune unused archs from libnccl
* Put pruning logic in CI directory
* Don't use --color in grep
2022-08-21 00:46:16 -08:00
Rong Ou
ad3bc0edee
Allow insecure gRPC connections for federated learning ( #8181 )
...
* Allow insecure gRPC connections for federated learning
* format
2022-08-19 12:16:14 +08:00
WeichenXu
53d2a733b0
[pyspark] Make Xgboost estimator support using sparse matrix as optimization ( #8145 )
...
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-08-19 01:57:28 +08:00
Jiaming Yuan
16bca5d4a1
Support CPU input for device QuantileDMatrix. ( #8136 )
...
- Copy `GHistIndexMatrix` to `Ellpack` when needed.
2022-08-11 21:21:26 +08:00
Jiaming Yuan
36e7c5364d
[dask] Deterministic rank assignment. ( #8018 )
2022-08-11 19:17:58 +08:00
Jiaming Yuan
d868126c39
[CI] Fix R build on Jenkins. ( #8154 )
2022-08-11 14:50:03 +08:00
Jiaming Yuan
570f8ae4ba
Use black on more Python files. ( #8137 )
2022-08-11 01:38:11 +08:00
Jiaming Yuan
446d536c23
Fix loading DMatrix binary in distributed env. ( #8149 )
...
- Try to load DMatrix binary before trying to parse text input.
- Remove some unmaintained code.
2022-08-10 22:53:16 +08:00
Jiaming Yuan
8fc60b31bc
Update PyPi wheel size limit. ( #8150 )
2022-08-10 18:49:57 +08:00
Jiaming Yuan
9ae547f994
Use config_context in sklearn interface. ( #8141 )
2022-08-09 14:48:54 +08:00
Bobby Wang
03cc3b359c
[pyspark] support a list of feature column names ( #8117 )
2022-08-08 17:05:27 +08:00
Jiaming Yuan
bcc8679a05
Update CUDA docker image and NCCL. ( #8139 )
2022-08-07 16:32:41 +08:00
Jiaming Yuan
d87f69215e
Quantile DMatrix for CPU. ( #8130 )
...
- Add a new `QuantileDMatrix` that works for both CPU and GPU.
- Deprecate `DeviceQuantileDMatrix`.
2022-08-02 15:51:23 +08:00
Jiaming Yuan
2cba1d9fcc
Fix compatibility with latest cupy. ( #8129 )
...
* Fix compatibility with latest cupy.
* Freeze mypy.
2022-08-01 15:24:42 +08:00
Jiaming Yuan
2c70751d1e
Implement iterative DMatrix for CPU. ( #8116 )
2022-07-26 22:34:21 +08:00
Jiaming Yuan
546de5efd2
[pyspark] Cleanup data processing. ( #8088 )
...
- Use numpy stack for handling list of arrays.
- Reuse concat function from dask.
- Prepare for `QuantileDMatrix`.
- Remove unused code.
- Use iterator for prediction to avoid initializing xgboost model
2022-07-26 15:00:52 +08:00
Jiaming Yuan
3970e4e6bb
Move pylint helper from dmlc-core. ( #8101 )
...
* Move pylint helper from dmlc-core.
- Move the helper into the XGBoost ci_build.
- Run it with multiprocessing.
* Fix original test.
2022-07-23 08:12:37 +08:00
Jiaming Yuan
7785d65c8a
Fix feature weights with multiple column sampling. ( #8100 )
2022-07-22 20:23:05 +08:00
Jiaming Yuan
4a4e5c7c18
Prepare gradient index for Quantile DMatrix. ( #8103 )
...
* Prepare gradient index for Quantile DMatrix.
- Implement push batch with adapter batch.
- Implement `GetFvalue` for prediction.
2022-07-22 17:26:33 +08:00