* Implement multi-target for hist.
- Add new hist tree builder.
- Move data fetchers for tests.
- Dispatch function calls in gbm base on the tree type.
- The new implementation is more strict as only binary labels are accepted. The previous implementation converts values greater than 1 to 1.
- Deterministic GPU. (no atomic add).
- Fix top-k handling.
- Precise definition of MAP. (There are other variants on how to handle top-k).
- Refactor GPU ranking tests.
* Make tree model param a private member.
* Number of features and targets are immutable after construction.
This is to reduce the number of places where we can run configuration.
- Pass obj info into tree updater as const pointer.
This way we don't have to initialize the learner model param before configuring gbm, hence
breaking up the dependency of configurations.
- Define a new tree struct embedded in the `RegTree`.
- Provide dispatching functions in `RegTree`.
- Fix some c++-17 warnings about the use of nodiscard (currently we disable the warning on
the CI).
- Use uint32_t instead of size_t for `bst_target_t` as it has a defined size and can be used
as part of dmlc parameter.
- Hide the `Segment` struct inside the categorical split matrix.
* Fix CPU bin compression with categorical data.
* The bug causes the maximum category to be lesser than 256 or the maximum number of bins when
the input data is dense.
* Extract most of the functionality into `DMatrixCache`.
* Move API entry to independent file to reduce dependency on `predictor.h` file.
* Add test.
---------
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
* Fix quantile tests running on multi-gpus
* Run some gtests with multiple GPUs
* fix mgpu test naming
* Instruct NCCL to print extra logs
* Allocate extra space in /dev/shm to enable NCCL
* use gtest_skip to skip mgpu tests
---------
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
* Use array interface for CSC matrix.
Use array interface for CSC matrix and align the interface with CSR and dense.
- Fix nthread issue in the R package DMatrix.
- Unify the behavior of handling `missing` with other inputs.
- Unify the behavior of handling `missing` around R, Python, Java, and Scala DMatrix.
- Expose `num_non_missing` to the JVM interface.
- Deprecate old CSR and CSC constructors.