Jiaming Yuan
ccfc90e4c6
[rabit] Improved connection handling. ( #9531 )
...
- Enable timeout.
- Report connection error from the system.
- Handle retry for both tracker connection and peer connection.
2023-08-30 13:00:04 +08:00
Jiaming Yuan
90ef250ea1
[rabit] Drop support for MPI backend. ( #9525 )
...
- Add checks in cmake.
- Remove mpi related code.
2023-08-28 21:01:22 +08:00
Jiaming Yuan
1b87a1d8f8
[rabit] Small cleanup to tracker initialization. ( #9524 )
...
- Remove recover related code.
- Clean startup, no need to consider previously connected nodes.
2023-08-27 05:10:59 +08:00
Jiaming Yuan
bc267dd729
Use ptr from mmap for GHistIndexMatrix and ColumnMatrix. ( #9315 )
...
* Use ptr from mmap for `GHistIndexMatrix` and `ColumnMatrix`.
- Define a resource for holding various types of memory pointers.
- Define ref vector for holding resources.
- Swap the underlying resources for GHist and ColumnM.
- Add documentation for current status.
- s390x support is removed. It should still work if you can compile XGBoost; all the old workaround code did was get GCC to compile.
2023-06-27 19:05:46 +08:00
Jiaming Yuan
ee6809e642
Use mmap for external memory. ( #9282 )
...
- Have basic infrastructure for mmap.
- Release file write handle.
2023-06-19 18:52:55 +08:00
Jiaming Yuan
fc8110ef79
Remove document and demo in RABIT. ( #9246 )
2023-06-06 08:20:10 +08:00
Jiaming Yuan
47b3cb6fb7
Remove unused parameters in RABIT. ( #9108 )
2023-05-05 05:26:24 +08:00
Philip Hyunsu Cho
6d8afb2218
[CI] Require C++17 + CMake 3.18; Use CUDA 11.8 in CI ( #8853 )
...
* Update to C++17
* Turn off unity build
* Update CMake to 3.18
* Use MSVC 2022 + CUDA 11.8
* Re-create stack for worker images
* Allocate more disk space for Windows
* Temporarily disable clang-tidy
* RAPIDS now requires Python 3.10+
* Unpin cuda-python
* Use latest NCCL
* Use Ubuntu 20.04 in RMM image
* Mark failing mgpu test as xfail
2023-03-01 09:22:24 -08:00
Rong Ou
77b069c25d
Support bitwise allreduce operations in the communicator ( #8623 )
2022-12-25 06:40:05 +08:00
Christian Clauss
ae27e228c4
xrange() was removed in Python 3 in favor of range() ( #8371 )
2022-10-27 16:36:14 +08:00
Rong Ou
668b8a0ea4
[Breaking] Switch from rabit to the collective communicator ( #8257 )
...
* Switch from rabit to the collective communicator
* fix size_t specialization
* really fix size_t
* try again
* add include
* more include
* fix lint errors
* remove rabit includes
* fix pylint error
* return dict from communicator context
* fix communicator shutdown
* fix dask test
* reset communicator mocklist
* fix distributed tests
* do not save device communicator
* fix jvm gpu tests
* add python test for federated communicator
* Update gputreeshap submodule
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2022-10-05 14:39:01 -08:00
Jiaming Yuan
b791446623
Initial support for IPv6 ( #8225 )
...
- Merge rabit socket into XGBoost.
- Dask interface support.
- Add test to the socket.
2022-09-21 18:06:50 +08:00
Jiaming Yuan
bc818316f2
Prepare for improving Windows networking compatibility. ( #8234 )
...
* Prepare for improving Windows networking compatibility.
* Include dmlc filesystem indirectly as dmlc/filesystem.h includes windows.h, which conflicts with winsock2.h
* Define `NOMINMAX` conditionally.
* Link the winsock library when mysys32 is used.
* Add config file for Read the Docs.
2022-09-10 15:16:49 +08:00
Jiaming Yuan
142a208a90
Fix compiler warnings. ( #8022 )
...
- Remove/fix unused parameters
- Remove deprecated code in rabit.
- Update dmlc-core.
2022-06-22 21:29:10 +08:00
Rong Ou
e5ec546da5
[Breaking] Remove rabit support for custom reductions and grow_local_histmaker updater ( #7992 )
2022-06-21 15:08:23 +08:00
Rong Ou
14ef38b834
Initial support for federated learning ( #7831 )
...
Federated learning plugin for xgboost:
* A gRPC server to aggregate MPI-style requests (allgather, allreduce, broadcast) from federated workers.
* A Rabit engine for the federated environment.
* Integration test to simulate federated learning.
Additional followups are needed to address GPU support, better security, and privacy, etc.
2022-05-05 21:49:22 +08:00
Louis Desreumaux
3886c3dd8f
Remove macro definitions of snprintf and vsnprintf ( #7536 )
2021-12-26 08:05:59 +08:00
Jiaming Yuan
345796825f
Optional find dependency in installed cmake config. ( #7099 )
...
* Find dependency only when xgboost is built as static library.
* Resolve msvc warning.
* Add test for linking shared library.
2021-07-11 17:20:55 +08:00
Andrew Ziem
3e7e426b36
Fix spelling in documents ( #6948 )
...
* Update roxygen2 doc.
Co-authored-by: fis <jm.yuan@outlook.com>
2021-05-11 20:44:36 +08:00
Jiaming Yuan
8747885a8b
Support Solaris. ( #6578 )
...
* Add system header.
* Remove use of TR1 on Solaris
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2021-01-07 09:05:05 +08:00
Jiaming Yuan
42d31d9dcb
Fix MPI build. ( #6403 )
2020-11-21 13:38:21 +08:00
Jiaming Yuan
debeae2509
[R] Fix warnings from R check --as-cran ( #6374 )
...
* Remove exit and printf.
* Fix warnings.
2020-11-11 18:39:37 +08:00
Sergio Gavilán
b181a88f9f
Reduced some C++ compiler warnings ( #6197 )
...
* Removed some warnings
* Rebase with master
* Solved C++ Google Tests errors made by refactoring in order to remove warnings
* Undo renaming path -> path_
* Fix style check
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-10-29 12:36:00 -07:00
Jiaming Yuan
b180223d18
Cleanup RABIT. ( #6290 )
...
* Remove recovery and MPI speed tests.
* Remove readme.
* Remove Python binding.
* Add checks in C API.
2020-10-27 08:48:22 +08:00
Jiaming Yuan
d61b628bf5
Remove RABIT CMake targets. ( #6275 )
...
* Now it's built as part of libxgboost.
* Set correct C API error in RABIT initialization and finalization.
* Remove redundant message.
* Guard the tracker print C API.
2020-10-27 01:30:20 +08:00
Jiaming Yuan
b5c2a47b20
Drop single point model recovery ( #6262 )
...
* Pass rabit params in JVM package.
* Implement timeout using poll timeout parameter.
* Remove OOB data check.
2020-10-21 15:27:03 +08:00
Jiaming Yuan
07945290a2
Remove unused RABIT targets. ( #6110 )
...
* Remove rabit mock.
* Remove rabit base.
2020-09-11 14:09:44 +08:00
Jiaming Yuan
c92d751ad1
Enable building rabit on Windows ( #6105 )
2020-09-11 11:54:46 +08:00
Jiaming Yuan
3dcd85fab5
Refactor rabit tests ( #6096 )
...
* Merge rabit tests into XGBoost.
* Run them on CI.
* Simplification for CMake scripts.
2020-09-09 12:30:29 +08:00
Jiaming Yuan
b0001a6e29
Correct style warnings from clang-tidy for rabit. ( #6095 )
2020-09-08 12:13:58 +08:00
fis
111968ca58
Merge rabit
2020-08-18 03:52:33 +08:00
fis
1c5904df3f
Remove rabit.
2020-08-18 03:48:36 +08:00
Jiaming Yuan
f93f1c03fc
Rabit update. ( #5978 )
...
* Remove parameter on JVM Packages.
2020-08-11 09:17:32 +08:00
Philip Hyunsu Cho
23e2c6ec91
Upgrade Rabit ( #5876 )
2020-07-09 16:18:33 -07:00
Jiaming Yuan
26143ad0b1
Update rabit. ( #5680 )
2020-06-22 14:32:43 +08:00
Philip Hyunsu Cho
ef26bc45bf
Hide C++ symbols in libxgboost.so when building Python wheel ( #5590 )
...
* Hide C++ symbols in libxgboost.so when building Python wheel
* Update Jenkinsfile
* Add test
* Upgrade rabit
* Add setup.py option.
Co-authored-by: fis <jm.yuan@outlook.com>
2020-04-24 13:32:05 -07:00
Philip Hyunsu Cho
f7105fa44f
Update Rabit ( #5237 )
2020-01-28 02:05:01 -08:00
Chen Qin
b29b8c2f34
[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests ( #4966 )
...
* [phase 1] expose sets of rabit configurations to spark layer
* add back mutable import
* disable ring_mincount till https://github.com/dmlc/rabit/pull/106d
* Revert "disable ring_mincount till https://github.com/dmlc/rabit/pull/106d"
This reverts commit 65e95a98e24f5eb53c6ba9ef9b2379524258984d.
* apply latest rabit
* fix build error
* apply https://github.com/dmlc/xgboost/pull/4880
* downgrade cmake in rabit
* point to rabit with DMLC_ROOT fix
* relative path of rabit install prefix
* split rabit parameters to another trait
* misc
* misc
* Delete .classpath
* Delete .classpath
* Delete .classpath
* Update XGBoostClassifier.scala
* Update XGBoostRegressor.scala
* Update GeneralParams.scala
* Update GeneralParams.scala
* Update GeneralParams.scala
* Update GeneralParams.scala
* Delete .classpath
* Update RabitParams.scala
* Update .gitignore
* Update .gitignore
* apply rabitParams to training
* use string as rabit parameter value type
* cleanup
* add rabitEnv check
* point to dmlc/rabit
* per feedback
* update private scope
* misc
* update rabit
* add rabit_timeout, fix failing test.
* split tests
* allow build jvm with rabit mock
* pass mock failures to rabit with test
* add mock error and graceful handle rabit assertion error test
* split mvn test
* remove sign for test
* update rabit
* build jvm_packages with rabit mock
* point back to dmlc/rabit
* per feedback, update scala header
* cleanup pom
* per feedback
* try fix lint
* fix lint
* per feedback, remove bootstrap_cache
* per feedback 2
* try replace dev profile with passing mvn property
* fix build error
* remove mvn property and replace with env setting to build test jar
* per feedback
* revert copyright headlines, point to dmlc/rabit
* revert python lint
* remove multiple failure test case as retry is not enabled in spark
* Update core.py
* Update core.py
* per feedback, style fix
2019-11-01 14:21:19 -07:00
Jiaming Yuan
010b8f1428
Revert "[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests ( #4876 )" ( #4965 )
...
This reverts commit 86ed01c4bbecef66e1bc4d02fb13116bd6130fae.
2019-10-18 14:02:35 -07:00
Chen Qin
86ed01c4bb
[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests ( #4876 )
...
* Expose sets of rabit configurations to spark layer
2019-10-18 15:07:31 -04:00
Chen Qin
512f037e55
[rabit_bootstrap_cache] failed xgb worker recover from other workers ( #4808 )
...
* Better recovery support. Restarting only the failed workers.
2019-09-16 23:31:52 -04:00
Matthew Jones
b43f08bea5
updating rabit commit hash ( #4718 )
2019-07-30 06:51:54 -04:00
Nan Zhu
01b0c9047c
[jvm-packages] allowing chaining prediction ( #4667 )
...
* add test for chaining prediction
* update rabit
* Update XGBoostGeneralSuite.scala
2019-07-17 08:50:27 -07:00
Nan Zhu
abffbe014e
[jvm-packages] delete all constraints from spark layer about obj and eval metrics and handle error in jvm layer ( #4560 )
...
* temp
* prediction part
* remove supported*
* add for test
* fix param name
* add rabit
* update rabit
* return value of rabit init
* eliminate compilation warnings
* update rabit
* shutdown
* update rabit again
* check sparkcontext shutdown
* fix logic
* sleep
* fix tests
* test with relaxed threshold
* create new thread each time
* stop for job quitting
* update rabit
* update rabit
* update rabit
* update git modules
2019-06-27 08:47:37 -07:00
Nan Zhu
37dc82c3ff
[jvm-packages] allow partial evaluation of dataframe before prediction ( #4407 )
...
* allow partial evaluation of dataframe before prediction
* resume spark test
* comments
* Run unit tests after building JVM packages
2019-04-26 21:02:40 -07:00
Jiaming Yuan
0ff84d950e
Upgrade rabit. ( #4159 )
2019-02-18 22:16:58 +08:00
Chen Qin
87f49995be
update rabit ( #3835 )
2018-10-30 09:15:19 -07:00
Tong He
e6696337e4
Fix CRAN check for lintr ( #3372 )
...
* fix CRAN check
* Update submodules dmlc-core and rabit
* Add lintr to rmingw test
2018-06-18 12:53:52 -07:00
Dave Challis
8efbadcde4
Point rabit submodule at latest commit from master. ( #3330 )
2018-05-28 10:21:10 -07:00
Rory Mitchell
a185ddfe03
Implement GPU accelerated coordinate descent algorithm ( #3178 )
...
* Implement GPU accelerated coordinate descent algorithm.
* Exclude external memory tests for GPU
2018-04-20 14:56:35 +12:00