Commit Graph

757 Commits

Author SHA1 Message Date
Your Name
ea19555474 temp merge, disable 1 line, SetValid 2023-10-12 16:16:44 -07:00
Rong Ou
e164d51c43 Improve allgather functions (#9649) 2023-10-12 23:31:43 +08:00
Jiaming Yuan
946ae1c440 [coll] Implement a new tracker and a communicator. (#9650)
* [coll] Implement a new tracker and a communicator.

The new tracker and communicators communicate through JSON documents. In addition,
communicators are aware of each other.
2023-10-12 12:49:16 +08:00
James Lamb
2e42f33fc1 [CI] standardize else() and endfunction() calls in CMake scripts (#9653) 2023-10-12 11:14:19 +08:00
Rong Ou
0ecb4de963 [breaking] Change DMatrix construction to be distributed (#9623)
* Change column-split DMatrix construction to be distributed

* remove splitting code for row split
2023-10-10 23:35:57 +08:00
Jiaming Yuan
b14e535e78 [Coll] Implement get host address in libxgboost. (#9644)
- Port `xgboost.tracker.get_host_ip` in C++.
2023-10-10 10:01:14 +08:00
Jiaming Yuan
680d53db43 Extract JSON utils. (#9645) 2023-10-10 07:15:14 +08:00
James Lamb
db8d117f7e [CI] standardize endif() calls in CMake scripts (#9637) 2023-10-08 11:45:20 +08:00
Rong Ou
3f2093fb81 Test monotone constraints with column split (#9613) 2023-09-28 04:54:53 +08:00
Rong Ou
d6d14d0fb9 Integration tests for interaction constraints with column-wise data split (#9611) 2023-09-27 08:27:43 +08:00
Rong Ou
290b17ffda Test column sampler with column-wise data split (#9609) 2023-09-26 13:31:23 +08:00
Rong Ou
def77870f3 Test categorical features with column-split gpu quantile (#9595) 2023-09-23 09:55:09 +08:00
Jiaming Yuan
8c676c889d Remove internal use of gpu_id. (#9568) 2023-09-20 23:29:51 +08:00
Jiaming Yuan
38ac52dd87 Build a simple event loop for collective. (#9593) 2023-09-20 02:09:07 +08:00
Rong Ou
d8c3cc92ae More support for column split in gpu predictor (#9562) 2023-09-14 08:13:13 +08:00
Jiaming Yuan
300f9ace06 Fix default metric configuration. (#9575) 2023-09-13 13:05:47 -07:00
Jiaming Yuan
b438d684d2 Utilities and cleanups for socket. (#9576)
- Use c++-17 nodiscard and nested ns.
- Add bind method to socket.
- Remove rabit parameters.
2023-09-14 01:41:42 +08:00
Rong Ou
66a0832778 Add tests for gpu_approx (#9553) 2023-09-07 17:21:58 +08:00
Jiaming Yuan
adea842c83 Fix inplace predict with fallback when base margin is used. (#9536)
- Copy meta info from proxy DMatrix.
- Use `std::call_once` to emit fewer warnings.
2023-09-05 01:04:24 +08:00
Rong Ou
c928dd4ff5 Support vertical federated learning with gpu_hist (#9539) 2023-09-03 11:37:11 +08:00
Rong Ou
9bab06cbca Support column split in gpu hist updater (#9384) 2023-08-31 18:09:35 +08:00
Jiaming Yuan
ccfc90e4c6 [rabit] Improved connection handling. (#9531)
- Enable timeout.
- Report connection error from the system.
- Handle retry for both tracker connection and peer connection.
2023-08-30 13:00:04 +08:00
Jiaming Yuan
ddf2e68821 Use the new DeviceOrd in the linalg module. (#9527) 2023-08-29 13:37:29 +08:00
Jiaming Yuan
972730cde0 Use matrix for gradient. (#9508)
- Use the `linalg::Matrix` for storing gradients.
- New API for the custom objective.
- Custom objective for multi-class/multi-target is now required to return the correct shape.
- Custom objective for Python can accept arrays with any strides. (row-major, column-major)
2023-08-24 05:29:52 +08:00
Rong Ou
6103dca0bb Support column split in GPU evaluate splits (#9511) 2023-08-23 16:33:43 +08:00
Jiaming Yuan
3c09399f29 Fix device dispatch for linear updater. (#9507) 2023-08-23 00:17:35 +08:00
Jiaming Yuan
044fea1281 Drop support for loading remote files. (#9504) 2023-08-21 23:34:05 +08:00
Jiaming Yuan
1caa93221a Use realloc for histogram cache and expose the cache limit. (#9455) 2023-08-10 14:05:27 +08:00
Jiaming Yuan
f05a23b41c Use weakref instead of id for DataIter cache. (#9445)
- Fix case where Python reuses id from freed objects.
- Small optimization to column matrix with QDM by using `realloc` instead of copying data.
2023-08-10 00:40:06 +08:00
Philip Hyunsu Cho
7ce090e775 Handle UTF-8 paths correctly on Windows platform (#9443)
* Fix round-trip serialization with UTF-8 paths

* Add compiler version check

* Add comment to C API functions

* Add Python tests

* [CI] Update MacOS deployment target

* Use std::filesystem instead of dmlc::TemporaryDirectory
2023-08-07 23:27:25 -07:00
Jiaming Yuan
54029a59af Bound the size of the histogram cache. (#9440)
- A new histogram collection with a limit in size.
- Unify histogram building logic between hist, multi-hist, and approx.
2023-08-08 03:21:26 +08:00
Rong Ou
bde1ebc209 Switch back to the GPUIDX macro (#9438) 2023-08-04 15:14:31 +08:00
Jiaming Yuan
1332ff787f Unify the code path between local and distributed training. (#9433)
This removes the need for a local histogram space during distributed training, which cuts the cache size by half.
2023-08-03 21:46:36 +08:00
Jiaming Yuan
e93a274823 Small cleanup for histogram routines. (#9427)
* Small cleanup for histogram routines.

- Extract hist train param from GPU hist.
- Make histogram const after construction.
- Unify parameter names.
2023-08-02 18:28:26 +08:00
Rong Ou
c2b85ab68a Clean up MGPU C++ tests (#9430) 2023-08-02 14:31:18 +08:00
Jiaming Yuan
912e341d57 Initial GPU support for the approx tree method. (#9414) 2023-07-31 15:50:28 +08:00
Rong Ou
7579905e18 Retry switching to per-thread default stream (#9416) 2023-07-26 07:09:12 +08:00
Jiaming Yuan
3a9996173e Revert "Switch to per-thread default stream (#9396)" (#9413)
This reverts commit f7f673b00c.
2023-07-24 12:03:28 -07:00
Jiaming Yuan
a196443a07 Implement sketching with Hessian on GPU. (#9399)
- Prepare for implementing approx on GPU.
- Unify the code path between weighted and uniform sketching on DMatrix.
2023-07-24 15:43:03 +08:00
Jiaming Yuan
22b0a55a04 Remove hist builder class. (#9400)
* Remove hist build class.

* Cleanup this stateless class.

* Add comment to thread block.
2023-07-22 10:43:12 +08:00
Jiaming Yuan
0de7c47495 Fix metric serialization. (#9405) 2023-07-22 08:39:21 +08:00
Rong Ou
f7f673b00c Switch to per-thread default stream (#9396) 2023-07-20 08:21:00 +08:00
Jiaming Yuan
04aff3af8e Define the new device parameter. (#9362) 2023-07-13 19:30:25 +08:00
Rong Ou
3632242e0b Support column split with GPU quantile (#9370) 2023-07-11 12:15:56 +08:00
Jiaming Yuan
97ed944209 Unify the hist tree method for different devices. (#9363) 2023-07-11 10:04:39 +08:00
Jiaming Yuan
20c52f07d2 Support exporting cut values (#9356) 2023-07-08 15:32:41 +08:00
Rong Ou
15ca12a77e Fix NCCL test hang (#9367) 2023-07-07 11:21:35 +08:00
Jiaming Yuan
41c6813496 Preserve order of saved updaters config. (#9355)
- Save the updater sequence as an array instead of object.
- Warn only once.

The compatibility is kept, but we should be able to break it as the config is not loaded
in pickle model and it's declared to be not stable.
2023-07-05 20:20:07 +08:00
Jiaming Yuan
645037e376 Improve test coverage with predictor configuration. (#9354)
* Improve test coverage with predictor configuration.

- Test with ext memory.
- Test with QDM.
- Test with dart.
2023-07-05 15:17:22 +08:00
Jiaming Yuan
d0916849a6 Remove unused weight from buffer for cat features. (#9341) 2023-07-04 01:07:09 +08:00