- Implement a columnar adapter.
- Refactor Python pandas handling code to avoid converting into a single numpy array.
- Add support in R for transforming columns.
- Support R data.frame and factor type.
- Add `numBoostedRound` to jvm packages
- Remove rabit checkpoint version.
- Change the starting version of training continuation in JVM [breaking].
- Redefine the checkpoint version policy in jvm package. [breaking]
- Rename the Python check point callback parameter. [breaking]
- Unifies the checkpoint policy between Python and JVM.
* Fix round-trip serialization with UTF-8 paths
* Add compiler version check
* Add comment to C API functions
* Add Python tests
* [CI] Updatre MacOS deployment target
* Use std::filesystem instead of dmlc::TemporaryDirectory
* Define `best_iteration` only if early stopping is used.
This is the behavior specified by the document but not honored in the actual code.
- Don't set the attributes if there's no early stopping.
- Clean up the code for callbacks, and replace assertions with proper exceptions.
- Assign the attributes when early stopping `save_best` is used.
- Turn the attributes into Python properties.
---------
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
* Handle the new `device` parameter in dask and demos.
- Check no ordinal is specified in the dask interface.
- Update demos.
- Update dask doc.
- Update the condition for QDM.
- A `DeviceOrd` struct is implemented to indicate the device. It will eventually replace the `gpu_id` parameter.
- The `predictor` parameter is removed.
- Fallback to `DMatrix` when `inplace_predict` is not available.
- The heuristic for choosing a predictor is only used during training.
- Rework the precision metric for both CPU and GPU.
- Mention it in the document.
- Cleanup old support code for GPU ranking metric.
- Deterministic GPU implementation.
* Drop support for classification.
* type.
* use batch shape.
* lint.
* cpu build.
* cpu build.
* lint.
* Tests.
* Fix.
* Cleanup error message.
Previously, we use `libsvm` as default when format is not specified. However, the dmlc
data parser is not particularly robust against errors, and the most common type of error
is undefined format.
Along with which, we will recommend users to use other data loader instead. We will
continue the maintenance of the parsers as it's currently used for many internal tests
including federated learning.
* Fix tests with pandas 2.0.
- `is_categorical` is replaced by `is_categorical_dtype`.
- one hot encoding returns boolean type instead of integer type.