513 Commits

Author SHA1 Message Date
Rong Ou
a2686543a9
Common interface for collective communication (#8057)
* implement broadcast for federated communicator

* implement allreduce

* add communicator factory

* add device adapter

* add device communicator to factory

* add rabit communicator

* add rabit communicator to the factory

* add nccl device communicator

* add synchronize to device communicator

* add back print and getprocessorname

* add python wrapper and c api

* clean up types

* fix non-gpu build

* try to fix ci

* fix std::size_t

* portable string compare ignore case

* c style size_t

* fix lint errors

* cross platform setenv

* fix memory leak

* fix lint errors

* address review feedback

* add python test for rabit communicator

* fix failing gtest

* use json to configure communicators

* fix lint error

* get rid of factories

* fix cpu build

* fix include

* fix python import

* don't export collective.py yet

* skip collective communicator pytest on windows

* add review feedback

* update documentation

* remove mpi communicator type

* fix tests

* shutdown the communicator separately

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2022-09-12 15:21:12 -07:00
Jiaming Yuan
bc818316f2
Prepare for improving Windows networking compatibility. (#8234)
* Prepare for improving Windows networking compatibility.

* Include dmlc filesystem indirectly as dmlc/filesystem.h includes windows.h, which
  conflicts with winsock2.h
* Define `NOMINMAX` conditionally.
* Link the winsock library when mysys32 is used.
* Add config file for read the doc.
2022-09-10 15:16:49 +08:00
Jiaming Yuan
b5eb36f1af
Add max_cat_threshold to GPU and handle missing cat values. (#8212) 2022-09-07 00:57:51 +08:00
Jiaming Yuan
441ffc017a
Copy data from Ellpack to GHist. (#8215) 2022-09-06 23:05:49 +08:00
Jiaming Yuan
16bca5d4a1
Support CPU input for device QuantileDMatrix. (#8136)
- Copy `GHistIndexMatrix` to `Ellpack` when needed.
2022-08-11 21:21:26 +08:00
Jiaming Yuan
2c70751d1e
Implement iterative DMatrix for CPU. (#8116) 2022-07-26 22:34:21 +08:00
Jiaming Yuan
7785d65c8a
Fix feature weights with multiple column sampling. (#8100) 2022-07-22 20:23:05 +08:00
Jiaming Yuan
4a4e5c7c18
Prepare gradient index for Quantile DMatrix. (#8103)
* Prepare gradient index for Quantile DMatrix.

- Implement push batch with adapter batch.
- Implement `GetFvalue` for prediction.
2022-07-22 17:26:33 +08:00
Rory Mitchell
1be09848a7
Refactor split valuation kernel (#8073) 2022-07-21 15:41:50 +02:00
Jiaming Yuan
ef11b024e8
Cleanup data generator. (#8094)
- Avoid duplicated definition of data shape.
- Explicitly define numpy iterator for CPU data.
2022-07-20 13:48:52 +08:00
Jiaming Yuan
4083440690
Small cleanups to various data types. (#8086)
- Use `bst_bin_t` in batch param constructor.
- Use `StringView` to avoid `std::string` when appropriate.
- Avoid using `MetaInfo` in quantile constructor to limit the scope of parameter.
2022-07-18 22:39:36 +08:00
Jiaming Yuan
abaa593aa0
Fix compiler warnings. (#8059)
- Remove unused parameters.
- Avoid comparison of different signedness.
2022-07-14 05:29:56 +08:00
Rory Mitchell
794cbaa60a
Fuse split evaluation kernels (#8026) 2022-07-05 10:24:31 +02:00
Jiaming Yuan
8746f9cddf
Rename IterativeDMatrix. (#8045) 2022-07-04 18:52:31 +08:00
Rory Mitchell
bc4f802b17
Batch UpdatePosition using cudaMemcpy (#7964) 2022-06-30 17:52:40 +02:00
Jiaming Yuan
f0c1b842bf
Implement sketching with adapter. (#8019) 2022-06-23 00:03:02 +08:00
Jiaming Yuan
142a208a90
Fix compiler warnings. (#8022)
- Remove/fix unused parameters
- Remove deprecated code in rabit.
- Update dmlc-core.
2022-06-22 21:29:10 +08:00
Jiaming Yuan
9b0eb66b78
Fix GPU driver test. (#8008)
* Initialize the training parameter.
2022-06-20 19:37:31 +08:00
Jiaming Yuan
8f8bd8147a
Fix LTR with weighted Quantile DMatrix. (#7975)
* Fix LTR with weighted Quantile DMatrix.

* Better tests.
2022-06-09 01:33:41 +08:00
Jiaming Yuan
1a33b50a0d
Fix compiler warnings. (#7974)
- Remove unused parameters. There are still many warnings that are not yet
addressed. Currently, the warnings in dmlc-core dominate the error log.
- Remove `distributed` parameter from metric.
- Fixes some warnings about signed comparison.
2022-06-06 22:56:25 +08:00
Jiaming Yuan
d48123d23b
Fix rmm build (#7973)
- Optionally switch to c++17
- Use rmm CMake target.
- Workaround compiler errors.
- Fix GPUMetric inheritance.
- Run death tests even if it's built with RMM support.

Co-authored-by: jakirkham <jakirkham@gmail.com>
2022-06-06 20:18:32 +08:00
Rong Ou
80339c3427
Enable distributed GPU training over Rabit (#7930) 2022-05-31 04:09:45 +08:00
Jiaming Yuan
bde4f25794
Handle missing categorical value in CPU evaluator. (#7948) 2022-05-27 14:15:47 +08:00
Jiaming Yuan
18a38f7ca0
Refactor for GHistIndex. (#7923)
* Pass sparse page as adapter, which prepares for quantile dmatrix.
* Remove old external memory code like `rbegin` and extra `Init` function.
* Simplify type dispatch.
2022-05-23 23:04:53 +08:00
Jiaming Yuan
765097d514
Simplify inplace-predict. (#7910)
Pass the `X` as part of Proxy DMatrix instead of an independent `dmlc::any`.
2022-05-18 17:52:00 +08:00
Jiaming Yuan
19775ffe15
Use adapter to initialize column matrix. (#7912) 2022-05-18 16:15:12 +08:00
Rory Mitchell
71d3b2e036
Fuse gpu_hist all-reduce calls where possible (#7867) 2022-05-17 13:27:50 +02:00
Jiaming Yuan
4fcfd9c96e
Fix and cleanup for column matrix. (#7901)
* Fix missed type dispatching for dense columns with missing values.
* Code cleanup to reduce special cases.
* Reduce memory usage.
2022-05-16 21:11:50 +08:00
Jiaming Yuan
1baad8650c
Small cleanup to Column. (#7898)
* Define forward iterator to hide the internal state.
2022-05-15 12:39:10 +08:00
Jiaming Yuan
1b6538b4e5
[breaking] Drop single precision histogram (#7892)
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2022-05-13 19:54:55 +08:00
Jiaming Yuan
11d65fcb21
Extract partial sum into an independent function. (#7889) 2022-05-13 14:30:35 +08:00
Rory Mitchell
7ef54e39ec
Small refactor to categoricals (#7858) 2022-05-05 17:47:02 +02:00
Rong Ou
14ef38b834
Initial support for federated learning (#7831)
Federated learning plugin for xgboost:
* A gRPC server to aggregate MPI-style requests (allgather, allreduce, broadcast) from federated workers.
* A Rabit engine for the federated environment.
* Integration test to simulate federated learning.

Additional followups are needed to address GPU support, better security, and privacy, etc.
2022-05-05 21:49:22 +08:00
Jiaming Yuan
317d7be6ee
Always use partition based categorical splits. (#7857) 2022-05-03 22:30:32 +08:00
Rory Mitchell
90cce38236
Remove single_precision_histogram for gpu_hist (#7828) 2022-05-03 14:53:19 +02:00
Jiaming Yuan
fdf533f2b9
[POC] Experimental support for l1 error. (#7812)
Support adaptive tree, a feature supported by both sklearn and lightgbm.  The tree leaf is recomputed based on residue of labels and predictions after construction.

For l1 error, the optimal value is the median (50 percentile).

This is marked as experimental support for the following reasons:
- The value is not well defined for distributed training, where we might have empty leaves for local workers. Right now I just use the original leaf value for computing the average with other workers, which might cause significant errors.
- Some follow-ups are required, for exact, pruner, and optimization for quantile function. Also, we need to calculate the initial estimation.
2022-04-26 21:41:55 +08:00
Jiaming Yuan
3c9b04460a
Move num_parallel_tree to model parameter. (#7751)
The size of forest should be a property of model itself instead of a training
hyper-parameter.
2022-03-29 02:32:42 +08:00
Jiaming Yuan
64575591d8
Use context in SetInfo. (#7687)
* Use the name `Context`.
* Pass a context object into `SetInfo`.
* Add context to proxy matrix.
* Add context to iterative DMatrix.

This is to remove the use of the default number of threads during `SetInfo` as a follow-up on
removing the global omp variable while preparing for CUDA stream semantic.  Currently, XGBoost
uses the legacy CUDA stream, we will gradually remove them in the future in favor of non-blocking streams.
2022-03-24 22:16:26 +08:00
Jiaming Yuan
4d81c741e9
External memory support for hist (#7531)
* Generate column matrix from gHistIndex.
* Avoid synchronization with the sparse page once the cache is written.
* Cleanups: Remove member variables/functions, change the update routine to look like approx and gpu_hist.
* Remove pruner.
2022-03-22 00:13:20 +08:00
Jiaming Yuan
996cc705af
Small cleanup to hist tree method. (#7735)
* Remove special optimization using number of bins.
* Remove 1-based index for column sampling.
* Remove data layout.
* Unify update prediction cache.
2022-03-20 03:44:55 +08:00
Jiaming Yuan
e78a38b837
Sort sparse page index when constructing DMatrix. (#7731) 2022-03-16 18:01:05 +08:00
Jiaming Yuan
98d6faefd6
Implement slope for Pseduo-Huber. (#7727)
* Add objective and metric.
* Some refactoring for CPU/GPU dispatching using linalg module.
2022-03-14 21:42:38 +08:00
Haoming Chen
04fc575c0e
Run tests in a temporary directory (#7723)
Fix some tests to run in a temporary directory in case the root
directory is not writable. Note that most of tests are already
running in the temporary directory, so this PR just make them
consistent.
2022-03-12 21:24:36 +08:00
Jiaming Yuan
18a4af63aa
Update documents and tests. (#7659)
* Revise documents after recent refactoring and cat support.
* Add tests for behavior of max_depth and max_leaves.
2022-02-26 03:57:47 +08:00
Jiaming Yuan
83a66b4994
Support categorical data for hist. (#7695)
* Extract partitioner from hist.
* Implement categorical data support by passing the gradient index directly into the partitioner.
* Organize/update document.
* Remove code for negative hessian.
2022-02-25 03:47:14 +08:00
Jiaming Yuan
6762c45494
Small cleanup to gradient index and hist. (#7668)
* Code comments.
* Const accessor to index.
* Remove some weird variables in the `Index` class.
* Simplify the `MemStackAllocator`.
2022-02-23 11:37:21 +08:00
Jiaming Yuan
0d0abe1845
Support optimal partitioning for GPU hist. (#7652)
* Implement `MaxCategory` in quantile.
* Implement partition-based split for GPU evaluation.  Currently, it's based on the existing evaluation function.
* Extract an evaluator from GPU Hist to store the needed states.
* Added some CUDA stream/event utilities.
* Update document with references.
* Fixed a bug in approx evaluator where the number of data points is less than the number of categories.
2022-02-15 03:03:12 +08:00
Jiaming Yuan
2369d55e9a
Add tests for prediction cache. (#7650)
* Extract the test from approx for other tree methods.
* Add note on how it works.
2022-02-15 00:28:00 +08:00
Jiaming Yuan
2775c2a1ab
Prepare external memory support for hist. (#7638)
This PR prepares the GHistIndexMatrix to host the column matrix which is used by the hist tree method by accepting sparse_threshold parameter.

Some cleanups are made to ensure the correct batch param is being passed into DMatrix along with some additional tests for correctness of SimpleDMatrix.
2022-02-10 16:58:02 +08:00
Jiaming Yuan
81210420c6
Remove omp_get_max_threads (#7608)
This is the one last PR for removing omp global variable.

* Add context object to the `DMatrix`.  This bridges `DMatrix` with https://github.com/dmlc/xgboost/issues/7308 .
* Require context to be available at the construction time of booster.
* Add `n_threads` support for R csc DMatrix constructor.
* Remove `omp_get_max_threads` in R glue code.
* Remove threading utilities that rely on omp global variable.
2022-01-28 16:09:22 +08:00