5964 Commits

Author SHA1 Message Date
Philip Hyunsu Cho
bc7a6ec603
Fix clang tidy (#8314)
* Fix clang-tidy

* Exempt clang-tidy from budget check

* Move clang-tidy
2022-10-06 05:16:06 -08:00
Dmitry Razdoburdin
c24e9d712c
Dispatcher for template parameters of BuildHist Kernels (#8259)
* Introducing Column Wise Hist Building

* linting

* more linting

* bug fixing

* Removing column sampling optimization for a while to simplify the review process.

* linting

* Removing unnecessary changes

* Use DispatchBinType in hist_util.cc

* Adding force_read_by column flag to buildhist. Adding tests for column wise buildhist.

* Introducing new dispatcher for compile time flags in hist building

* fixing bug with the use of DispatchBinType

* Fixing building

* Merging with master branch

Co-authored-by: dmitry.razdoburdin <drazdobu@jfldaal005.jf.intel.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2022-10-06 03:02:29 -08:00
Rong Ou
8d4038da57
Don't split input data in federated mode (#8279)
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-10-05 18:19:28 -08:00
Philip Hyunsu Cho
66fd9f5207
Update sponsors list [skip ci] (#8309) 2022-10-05 16:40:46 -08:00
Rory Mitchell
909e49e214
Reduce docker image size. (#8306) 2022-10-05 15:55:51 -08:00
Rong Ou
668b8a0ea4
[Breaking] Switch from rabit to the collective communicator (#8257)
* Switch from rabit to the collective communicator

* fix size_t specialization

* really fix size_t

* try again

* add include

* more include

* fix lint errors

* remove rabit includes

* fix pylint error

* return dict from communicator context

* fix communicator shutdown

* fix dask test

* reset communicator mocklist

* fix distributed tests

* do not save device communicator

* fix jvm gpu tests

* add python test for federated communicator

* Update gputreeshap submodule

Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-10-05 14:39:01 -08:00
Jiaming Yuan
e47b3a3da3
Upgrade mypy. (#8302)
Some breaking changes were made in mypy.
2022-10-05 14:31:59 +08:00
Jiaming Yuan
97c3a80a34
Add C document to sphinx, fix arrow. (#8300)
- Group C API.
- Add C API sphinx doc.
- Consistent use of `OptionalArg` and the parameter name `config`.
- Remove call to deprecated functions in demo.
- Fix some formatting errors.
- Add links to C examples in the document (only visible with doxygen pages)
- Fix arrow.
2022-10-05 09:52:15 +08:00
Philip Hyunsu Cho
b2bbf49015
Additional improvements to CI (#8303)
* Wait until budget check is complete

* Ensure that multi-GPU tests run for the master branch

* Fix
2022-10-04 03:03:38 -08:00
Rory Mitchell
d686bf52a6
Reduce time for some multi-gpu tests (#8288)
* Faster dask tests

* Reuse AllReducer objects in tests.

* Faster boost from prediction tests.

* Use rmm dask fixture.

* Speed up dask demo.

* mypy

* Format with black.

* mypy

* Clang-tidy

Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-10-04 02:49:33 -08:00
Philip Hyunsu Cho
ca0547bb65
[CI] Use RAPIDS 22.10 (#8298)
* [CI] Use RAPIDS 22.10

* Store CUDA and RAPIDS versions in one place

* Fix

* Add missing #include

* Update gputreeshap submodule

* Fix

* Remove outdated distributed tests
2022-10-03 23:18:07 -08:00
Philip Hyunsu Cho
37886a5dff
[CI] Document the use of Docker wrapper script (#8297)
* [CI] Document the use of Docker wrapper script

* Grammar fixes

* Document buildkite pipeline defs

* tests/buildkite/*.sh isn't meant to run locally
2022-10-02 12:45:00 -07:00
Philip Hyunsu Cho
9af99760d4
Various CI savings (#8291) 2022-09-30 05:42:56 -07:00
Jiaming Yuan
299e5000a4
Fix buildkite label. (#8287) 2022-09-29 17:33:19 -07:00
Jiaming Yuan
55cf24cc32
Obtain CSR matrix from DMatrix. (#8269) 2022-09-29 20:41:43 +08:00
Philip Hyunsu Cho
b14c44ee5e
[CI] Put Multi-GPU test suites in separate pipeline (#8286)
* [CI] Put Multi-GPU test suites in separate pipeline

* Avoid unset var error in Bash
2022-09-29 00:41:48 -08:00
Bobby Wang
cbf3a5f918
[pyspark][doc] add more doc for pyspark (#8271)
Co-authored-by: fis <jm.yuan@outlook.com>
2022-09-29 11:58:18 +08:00
Bobby Wang
c91fed083d
[pyspark] disable repartition_random_shuffle by default (#8283) 2022-09-29 10:50:51 +08:00
Jiaming Yuan
6925b222e0
Fix mixed types with cuDF. (#8280) 2022-09-29 00:57:52 +08:00
Jiaming Yuan
f835368bcf
Mark next release as 1.7 instead of 2.0 (#8281) 2022-09-28 14:33:37 +08:00
Jiaming Yuan
6d1452074a
Remove MGPU cpp tests. (#8276)
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-09-27 21:18:23 +08:00
Jiaming Yuan
fcab51aa82
Support more pandas nullable types (#8262)
- Float32/64
- Category.
2022-09-27 01:59:50 +08:00
Alex
1082ccd3cc
GitHub Workflows security hardening (#8267)
Signed-off-by: Alex <aleksandrosansan@gmail.com>
2022-09-27 00:54:27 +08:00
Rory Mitchell
8f77677193
Use quantised gradients in gpu_hist histograms (#8246) 2022-09-26 17:35:35 +02:00
Jiaming Yuan
4056974e37
Fix sparse threshold warning. (#8268) 2022-09-26 22:22:11 +08:00
WeichenXu
ff71c69adf
[pyspark] Add validation for param 'early_stopping_rounds' and 'validation_indicator_col' (#8250)
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2022-09-26 17:43:03 +08:00
Jiaming Yuan
0cd11b893a
[doc] Fix sphinx build. (#8270) 2022-09-26 12:33:31 +08:00
Joyce
be5b95e743
Enable OpenSSF Scorecard Github Action (#8263)
* chore: enable scorecard github action

Signed-off-by: Joyce Brum <joycebrumu.u@gmail.com>

* docs: add scorecard badge to the README file

Signed-off-by: Joyce Brum <joycebrumu.u@gmail.com>

Signed-off-by: Joyce Brum <joycebrumu.u@gmail.com>
2022-09-25 13:02:36 -07:00
Bobby Wang
8d247f0d64
[jvm-packages] fix spark-rapids compatibility issue (#8240)
* [jvm-packages] fix spark-rapids compatibility issue

spark-rapids (from 22.10) has shimmed GpuColumnVector, which means
we can't call it directly. So this PR calls the UnshimmedGpuColumnVector
2022-09-22 23:31:29 +08:00
WeichenXu
ab342af242
[pyspark] Fix xgboost spark estimator dataset repartition issues (#8231) 2022-09-22 21:31:41 +08:00
Jiaming Yuan
3fd331f8f2
Add checks to C pointer arguments. (#8254) 2022-09-22 19:02:22 +08:00
Dmitry Razdoburdin
eb7bbee2c9
Optional by-column histogram build. (#8233)
Co-authored-by: dmitry.razdoburdin <drazdobu@jfldaal005.jf.intel.com>
2022-09-22 05:16:13 +08:00
Jiaming Yuan
b791446623
Initial support for IPv6 (#8225)
- Merge rabit socket into XGBoost.
- Dask interface support.
- Add test to the socket.
2022-09-21 18:06:50 +08:00
Rong Ou
7d43e74e71
JNI wrapper for the collective communicator (#8242) 2022-09-21 04:20:25 +08:00
Jiaming Yuan
fffb1fca52
Calculate base_score based on input labels for mae. (#8107)
Fit an intercept as base score for abs loss.
2022-09-20 20:53:54 +08:00
Bobby Wang
4f42aa5f12
[pyspark] make the model saved by pyspark compatible (#8219)
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2022-09-20 16:43:49 +08:00
Bobby Wang
520586ffa7
[pyspark] fix empty data issue when constructing DMatrix (#8245)
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-09-20 16:43:20 +08:00
Philip Hyunsu Cho
70df36c99c
[CI] Retire Jenkins server (#8243) 2022-09-14 08:46:23 -07:00
Jiaming Yuan
2e63af6117
Mitigate flaky data iter test. (#8244)
- Reduce the number of batches.
- Verify labels.
2022-09-14 17:54:14 +08:00
Jiaming Yuan
bdf265076d
Make QuantileDMatrix default to sklearn estimators. (#8220) 2022-09-13 13:52:19 +08:00
Rong Ou
a2686543a9
Common interface for collective communication (#8057)
* implement broadcast for federated communicator

* implement allreduce

* add communicator factory

* add device adapter

* add device communicator to factory

* add rabit communicator

* add rabit communicator to the factory

* add nccl device communicator

* add synchronize to device communicator

* add back print and getprocessorname

* add python wrapper and c api

* clean up types

* fix non-gpu build

* try to fix ci

* fix std::size_t

* portable string compare ignore case

* c style size_t

* fix lint errors

* cross platform setenv

* fix memory leak

* fix lint errors

* address review feedback

* add python test for rabit communicator

* fix failing gtest

* use json to configure communicators

* fix lint error

* get rid of factories

* fix cpu build

* fix include

* fix python import

* don't export collective.py yet

* skip collective communicator pytest on windows

* add review feedback

* update documentation

* remove mpi communicator type

* fix tests

* shutdown the communicator separately

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2022-09-12 15:21:12 -07:00
Jiaming Yuan
bc818316f2
Prepare for improving Windows networking compatibility. (#8234)
* Prepare for improving Windows networking compatibility.

* Include dmlc filesystem indirectly as dmlc/filesystem.h includes windows.h, which
  conflicts with winsock2.h
* Define `NOMINMAX` conditionally.
* Link the winsock library when mysys32 is used.
* Add config file for Read the Docs.
2022-09-10 15:16:49 +08:00
Jiaming Yuan
dd44ac91b8
[CI] Use binary R dependencies on Windows. (#8241) 2022-09-09 19:51:15 -07:00
Philip Hyunsu Cho
23faf656ad
[CI] Don't require manual approval for master branch (#8235) 2022-09-08 09:26:22 -08:00
Philip Hyunsu Cho
e888eb2fa9
[CI] Migrate CI pipelines from Jenkins to BuildKite (#8142)
* [CI] Migrate CI pipelines from Jenkins to BuildKite

* Require manual approval

* Less verbose output when pulling Docker

* Remove us-east-2 from metadata.py

* Add documentation

* Add missing underscore

* Add missing punctuation

* More specific instruction

* Better paragraph structure
2022-09-07 16:29:25 -08:00
Philip Hyunsu Cho
b397d64c96
Drop use of deleted virtual function to support older MacOS (#8226)
* Support older MacOS

* Update json.h
2022-09-07 11:25:59 -08:00
Rehan Guha
dc07137a2c
Updated dart.rst with correct links (#8229)
Updated the DART paper link, which was broken.
2022-09-08 00:57:09 +08:00
Jiaming Yuan
b5eb36f1af
Add max_cat_threshold to GPU and handle missing cat values. (#8212) 2022-09-07 00:57:51 +08:00
Jiaming Yuan
441ffc017a
Copy data from Ellpack to GHist. (#8215) 2022-09-06 23:05:49 +08:00
Bobby Wang
7ee10e3dbd
[pyspark] Cleanup the comments (#8217) 2022-09-05 16:20:12 +08:00