xgboost

Author	SHA1	Message	Date
Dmitry Razdoburdin	c7e7ce7569	[SYCL] Add nodes initialisation (#10269 ) Co-authored-by: Dmitry Razdoburdin <> Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2024-05-21 23:38:52 +08:00
Jiaming Yuan	a5a58102e5	Revamp the rabit implementation. (#10112 ) This PR replaces the original RABIT implementation with a new one, which has already been partially merged into XGBoost. The new one features: - Federated learning for both CPU and GPU. - NCCL. - More data types. - A unified interface for all the underlying implementations. - Improved timeout handling for both tracker and workers. - Exhausted tests with metrics (fixed a couple of bugs along the way). - A reusable tracker for Python and JVM packages.	2024-05-20 11:56:23 +08:00
Dmitry Razdoburdin	f588252481	[sycl] add loss guided hist building (#10251 ) Co-authored-by: Dmitry Razdoburdin <>	2024-05-10 22:35:13 +08:00
Dmitry Razdoburdin	dcc9639b91	[sycl] add data initialisation for training (#10222 ) Co-authored-by: Dmitry Razdoburdin <> Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu> Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2024-05-05 12:07:10 +08:00
Dmitry Razdoburdin	58513dc288	[SYCL] Add sampling initialization (#10216 ) Co-authored-by: Dmitry Razdoburdin <>	2024-04-25 04:35:52 +08:00
Jiaming Yuan	3fbb221fec	[coll] Implement shutdown for tracker and comm. (#10208 ) - Force shutdown the tracker. - Implement shutdown notice for error handling thread in comm.	2024-04-20 04:08:17 +08:00
Dmitry Razdoburdin	6e5c335cea	[SYCL] Add basic features for QuantileHistMaker (#10174 ) Co-authored-by: Dmitry Razdoburdin <>	2024-04-15 21:24:46 +08:00
Jiaming Yuan	8bad677c2f	Update collective implementation. (#10152 ) - Cleanup resource during `Finalize` to avoid handling threads in destructor. - Calculate the size for allgather automatically. - Use simple allgather for small (smaller than the number of worker) allreduce.	2024-03-30 18:57:31 +08:00
Dmitry Razdoburdin	6a7c6a8ae6	add sycl realisation of ghist builder (#10138 ) Co-authored-by: Dmitry Razdoburdin <>	2024-03-23 12:55:25 +08:00
Jiaming Yuan	53fc17578f	Use `std::uint64_t` for row index. (#10120 ) - Use std::uint64_t instead of size_t to avoid implementation-defined type. - Rename to bst_idx_t, to account for other types of indexing. - Small cleanup to the base header.	2024-03-15 18:43:49 +08:00
Dmitry Razdoburdin	617970a0c2	[SYCL] Add split evaluation (#10119 ) Co-authored-by: Dmitry Razdoburdin <>	2024-03-15 01:46:46 +08:00
Jiaming Yuan	d07b7fe8c8	Small cleanup for mock tests. (#10085 )	2024-03-04 23:32:11 +08:00
Dmitry Razdoburdin	7a61216690	[sycl] add partitioning and related tests (#10080 ) Co-authored-by: Dmitry Razdoburdin <>	2024-03-02 01:49:27 +08:00
Dmitry Razdoburdin	761845f594	[SYCL] Implement row set collection. (#10057 ) Co-authored-by: Dmitry Razdoburdin <>	2024-02-26 21:07:36 +08:00
Dmitry Razdoburdin	057f03cacc	[SYCL] Initial implementation of `GHistIndexMatrix` (#10045 ) Co-authored-by: Dmitry Razdoburdin <>	2024-02-19 04:27:15 +08:00
Dmitry Razdoburdin	234674a0a6	[sync]. Add partition builder. (#10011 ) Co-authored-by: Dmitry Razdoburdin <>	2024-01-31 17:39:48 +08:00
Dmitry Razdoburdin	43897b8296	Sycl implementation for objective functions (#9846 ) Co-authored-by: Dmitry Razdoburdin <>	2023-12-12 14:41:50 +08:00
Dmitry Razdoburdin	381f1d3dc9	Add support inference on SYCL devices (#9800 ) Co-authored-by: Dmitry Razdoburdin <> Co-authored-by: Nikolay Petrov <nikolay.a.petrov@intel.com> Co-authored-by: Alexandra <alexandra.epanchinzeva@intel.com>	2023-12-04 16:15:57 +08:00
Jiaming Yuan	06bdc15e9b	[coll] Pass context to various functions. (#9772 ) In the future, the `Context` object will be required for collective operations. This PR passes the context object to some of the functions that require it, to prepare for swapping out the implementation.	2023-11-08 09:54:05 +08:00
Jiaming Yuan	6c0a190f6d	[coll] Add comm group. (#9759 ) - Implement `CommGroup` for double dispatching. - Small cleanup to tracker for handling abort.	2023-11-07 11:12:31 +08:00
Jiaming Yuan	4da4e092b5	[coll] Improvements and fixes for tracker and allreduce. (#9745 ) - Allow the tracker to wait. - Fix allreduce type cast - Return args from the federated tracker.	2023-11-02 04:06:46 +08:00
Jiaming Yuan	bc995a4865	[coll] Add federated coll. (#9738 ) - Define a new data type, the proto file is copied for now. - Merge client and communicator into `FederatedColl`. - Define CUDA variant. - Migrate tests for CPU, add tests for CUDA.	2023-11-01 04:06:46 +08:00
Jiaming Yuan	80390e6cb6	[coll] Federated comm. (#9732 )	2023-10-31 02:39:55 +08:00
Rong Ou	e164d51c43	Improve allgather functions (#9649 )	2023-10-12 23:31:43 +08:00
Rong Ou	66a0832778	Add tests for gpu_approx (#9553 )	2023-09-07 17:21:58 +08:00
Rong Ou	c928dd4ff5	Support vertical federated learning with `gpu_hist` (#9539 )	2023-09-03 11:37:11 +08:00
Rong Ou	6103dca0bb	Support column split in GPU evaluate splits (#9511 )	2023-08-23 16:33:43 +08:00
Rong Ou	bde1ebc209	Switch back to the GPUIDX macro (#9438 )	2023-08-04 15:14:31 +08:00
Rong Ou	c2b85ab68a	Clean up MGPU C++ tests (#9430 )	2023-08-02 14:31:18 +08:00
Jiaming Yuan	04aff3af8e	Define the new `device` parameter. (#9362 )	2023-07-13 19:30:25 +08:00
Rong Ou	f90771eec6	Fix device communicator dependency (#9346 )	2023-06-29 10:34:30 +08:00
Rong Ou	e70810be8a	Refactor device communicator to make allreduce more flexible (#9295 )	2023-06-14 03:53:03 +08:00
Jiaming Yuan	152e2fb072	Unify test helpers for creating ctx. (#9274 )	2023-06-10 03:35:22 +08:00
Rong Ou	5b69534b43	Support column split in multi-target `hist` (#9171 )	2023-05-26 16:56:05 +08:00
Rong Ou	52311dcec9	Fix multi-threaded gtests (#9148 )	2023-05-10 19:15:32 +08:00
Rong Ou	511d4996b5	Rely on gRPC to generate random port (#9102 )	2023-04-27 09:48:26 +08:00
Rong Ou	42d100de18	Make sure metrics work with federated learning (#9037 )	2023-04-19 15:39:11 +08:00
Jiaming Yuan	fe9dff339c	Convert federated learner test into test suite. (#9018 ) - Add specialization to learning to rank.	2023-04-11 09:52:55 +08:00
Rong Ou	15e073ca9d	Make objectives work with vertical distributed and federated learning (#9002 )	2023-04-03 17:07:42 +08:00
Rong Ou	ff26cd3212	More tests for column split and vertical federated learning (#8985 ) Added some more tests for the learner and fit_stump, for both column-wise distributed learning and vertical federated learning. Also moved the `IsRowSplit` and `IsColumnSplit` methods from the `DMatrix` to the `MetaInfo` since in some places we only have access to the `MetaInfo`. Added a new convenience method `IsVerticalFederatedLearning`. Some refactoring of the testing fixtures.	2023-03-28 16:40:26 +08:00
Rong Ou	b240f055d3	Support vertical federated learning (#8932 )	2023-03-22 14:25:26 +08:00
Rong Ou	cbf98cb9c6	Add Allgather to collective communicator (#8765 )	2023-02-09 11:31:22 +08:00
Jiaming Yuan	3e26107a9c	Rename and extract `Context`. (#8528 ) * Rename `GenericParameter` to `Context`. * Rename header file to reflect the change. * Rename all references.	2022-12-07 04:58:54 +08:00
Rong Ou	a8255ea678	Add an in-memory collective communicator (#8494 )	2022-12-01 00:24:12 +08:00
Philip Hyunsu Cho	2faa744aba	[CI] Test federated learning plugin in the CI (#8325 )	2022-10-12 13:57:39 -07:00
Rong Ou	39afdac3be	Better error message when world size and rank are set as strings (#8316 ) Co-authored-by: jiamingy <jm.yuan@outlook.com>	2022-10-12 15:53:25 +08:00
Rong Ou	a2686543a9	Common interface for collective communication (#8057 ) * implement broadcast for federated communicator * implement allreduce * add communicator factory * add device adapter * add device communicator to factory * add rabit communicator * add rabit communicator to the factory * add nccl device communicator * add synchronize to device communicator * add back print and getprocessorname * add python wrapper and c api * clean up types * fix non-gpu build * try to fix ci * fix std::size_t * portable string compare ignore case * c style size_t * fix lint errors * cross platform setenv * fix memory leak * fix lint errors * address review feedback * add python test for rabit communicator * fix failing gtest * use json to configure communicators * fix lint error * get rid of factories * fix cpu build * fix include * fix python import * don't export collective.py yet * skip collective communicator pytest on windows * add review feedback * update documentation * remove mpi communicator type * fix tests * shutdown the communicator separately Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2022-09-12 15:21:12 -07:00
Jiaming Yuan	bc818316f2	Prepare for improving Windows networking compatibility. (#8234 ) * Include dmlc filesystem indirectly as dmlc/filesystem.h includes windows.h, which conflicts with winsock2.h * Define `NOMINMAX` conditionally. * Link the winsock library when mysys32 is used. * Add config file for read the doc.	2022-09-10 15:16:49 +08:00
Rong Ou	14ef38b834	Initial support for federated learning (#7831 ) Federated learning plugin for xgboost: * A gRPC server to aggregate MPI-style requests (allgather, allreduce, broadcast) from federated workers. * A Rabit engine for the federated environment. * Integration test to simulate federated learning. Additional followups are needed to address GPU support, better security, and privacy, etc.	2022-05-05 21:49:22 +08:00
Christian Lorentzen	cf4f019ed6	[Breaking] Change default evaluation metric for classification to logloss / mlogloss (#6183 ) * Change DefaultEvalMetric of classification from error to logloss * Change default binary metric in plugin/example/custom_obj.cc * Set old error metric in python tests * Set old error metric in R tests * Fix missed eval metrics and typos in R tests * Fix setting eval_metric twice in R tests * Add warning for empty eval_metric for classification * Fix Dask tests Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-10-02 12:06:47 -07:00


53 Commits