* Add hessian to the batch param in preparation for the new approx implementation.
* Extract a push method for gradient index matrix.
* Use span instead of vector ref for hessian in sketching.
* Create a binary format for gradient index.
On GPU we use a rounding factor to truncate the gradient for deterministic results. This PR changes the gradient representation to a fixed-point number with the exponent aligned to the rounding factor (see the sketch below).
[breaking] Drop non-deterministic histogram.
Use fixed point for shared memory.
This PR improves the performance of GPU Hist.
Co-authored-by: Andy Adinets <aadinets@nvidia.com>
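A minimal sketch of the rounding-factor idea described above, with helper names and the scale factor chosen for illustration (this is a sketch of the technique, not the actual kernel code):

```python
import numpy as np

def truncate_with_rounding(rounding: np.float32, x: np.float32) -> np.float32:
    # Adding then subtracting a power-of-two rounding factor discards the
    # low-order bits of x, so every gradient lands on the same fixed-point
    # grid and the summation order no longer changes the float result.
    return np.float32(np.float32(rounding + x) - rounding)

def to_fixed_point(rounding: np.float32, x: np.float32) -> np.int64:
    # With the exponent aligned to the rounding factor, the truncated value
    # scales to an integer that can be accumulated exactly. The 2**30 scale
    # here is an arbitrary choice for illustration.
    scale = np.float32(2.0 ** 30) / rounding
    return np.int64(truncate_with_rounding(rounding, x) * scale)
```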
- Reduce the dependency on dmlc parsers and provide an interface for users to load data by themselves (see the iterator sketch after this list).
- Remove use of threaded iterator and IO queue.
- Remove `page_size`.
- Make sure the number of pages in memory is bounded.
- Make sure the cache limit cannot be violated.
- Provide an interface for internal algorithms to process data asynchronously.
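A hedged sketch of user-driven loading through the Python callback iterator, assuming the interface resembles `xgboost.DataIter` (the exact module path and which DMatrix constructors accept an iterator vary by version):

```python
import cupy as cp
import xgboost

class BatchIter(xgboost.DataIter):
    """Feed pre-loaded (X, y) chunks to XGBoost one batch at a time."""

    def __init__(self, batches):
        self._batches = batches
        self._pos = 0
        super().__init__()

    def next(self, input_data):
        # input_data is a callback supplied by XGBoost; return 0 when the
        # iterator is exhausted, 1 to signal another batch was produced.
        if self._pos == len(self._batches):
            return 0
        X, y = self._batches[self._pos]
        input_data(data=X, label=y)
        self._pos += 1
        return 1

    def reset(self):
        self._pos = 0

# The DMatrix consumes the iterator batch by batch, so the number of pages
# resident in memory stays bounded.
batches = [(cp.random.rand(64, 4), cp.random.rand(64)) for _ in range(4)]
Xy = xgboost.DeviceQuantileDMatrix(BatchIter(batches))
```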
The role of ProxyDMatrix has grown beyond what it was designed for. It is now used by
both QuantileDeviceDMatrix and inplace prediction, and after the refactoring of the
sparse DMatrix it will also be used for external memory. This PR renames the C API to
decouple it from QuantileDeviceDMatrix.
Other than modularizing the split evaluation function, this PR also removes some functions, including `InitNewNodes` and `BuildNodeStats`, along with some other unused variables. Scattered code like setting leaf weights is grouped into the split evaluator, and `NodeEntry` is simplified and made private. Another subtle difference from the original implementation is that the modified code doesn't call `tree[nidx].Parent()` to traverse upward.
* Support categorical data for the dask functional interface and DQM (usage sketch below).
* Implement categorical data support for GPU GK-merge.
* Add support for dask functional interface.
* Add support for DQM.
* Get newer cupy.
* Categorical prediction with CPU predictor and GPU predict leaf.
* Implement categorical prediction for the CPU predictor.
* Implement categorical prediction for GPU predict leaf.
* Refactor the prediction functions to have a unified get next node function.
Co-authored-by: Shvets Kirill <kirill.shvets@intel.com>
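A hedged usage sketch of the categorical support (the `enable_categorical` flag follows the Python package; which interfaces accept it in a given version is an assumption):

```python
import pandas as pd
import xgboost

df = pd.DataFrame({
    "color": pd.Series(["red", "blue", "red"], dtype="category"),
    "size": [1.0, 2.0, 3.0],
})
y = [0, 1, 0]

# enable_categorical keeps the pandas category codes instead of treating
# the column as numeric; the experimental support targets gpu_hist.
Xy = xgboost.DMatrix(df, label=y, enable_categorical=True)
booster = xgboost.train({"tree_method": "gpu_hist"}, Xy, num_boost_round=4)
```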
The guard protects the global variable from being changed by XGBoost, but it leads to a
bug where the `n_threads` parameter is no longer used after the first iteration. This is
because `omp_set_num_threads` is only called once, in `Learner::Configure`, at the
beginning of the training process.
The guard is still useful for `gpu_id`, since that is queried throughout our codebase
regardless of which iteration is currently running.
* Re-implement ROC-AUC.
* Binary
* MultiClass
* LTR
* Add documents.
This PR resolves a few issues:
- Define a value when the dataset is invalid, which can happen if the dataset is
  empty or contains only positive or only negative samples.
- Define ROC-AUC for multi-class classification (see the sketch after this list).
- Define a weighted average value for the distributed setting.
- A correct implementation for the learning-to-rank task. The previous
  implementation was just binary classification with averaging across groups,
  which doesn't measure ordered learning to rank.
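A sketch of one plausible reading of the multi-class definition above: one-vs-rest AUC averaged with class-prevalence weights (whether this matches XGBoost's exact weighting, and the 0.5 fallback, are assumptions):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def multiclass_auc(y_true: np.ndarray, proba: np.ndarray) -> float:
    aucs, weights = [], []
    for c in range(proba.shape[1]):
        positive = (y_true == c).astype(int)
        if positive.min() == positive.max():
            continue  # degenerate class: only positives or only negatives
        aucs.append(roc_auc_score(positive, proba[:, c]))
        weights.append(positive.mean())  # weight by class prevalence
    if not aucs:
        return 0.5  # assumed fallback for an invalid dataset
    return float(np.average(aucs, weights=weights))
```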
* Ensure RMM is 0.18 or later
* Add use_rmm flag to global configuration (usage sketch below)
* Modify XGBCachingDeviceAllocatorImpl to skip CUB when use_rmm=True
* Update the demo
* [CI] Pin NumPy to 1.19.4, since NumPy 1.19.5 doesn't work with latest Shap
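A usage sketch of the new flag through the Python global configuration (`set_config` and `config_context` are the Python-level entry points):

```python
import xgboost

# Process-wide: route XGBoost's device allocations through RMM.
xgboost.set_config(use_rmm=True)

# Or scoped to a block:
with xgboost.config_context(use_rmm=True):
    pass  # train/predict here; allocations go through RMM
```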
* Save feature info in booster in JSON model.
* [breaking] Remove automatic feature name generation in `DMatrix`.
This PR enables reliable feature validation in the Python package.
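A minimal sketch of the validation this enables, assuming a round-trip through the JSON model:

```python
import numpy as np
import pandas as pd
import xgboost

X = pd.DataFrame(np.random.rand(8, 2), columns=["f0", "f1"])
booster = xgboost.train({}, xgboost.DMatrix(X, label=np.random.rand(8)),
                        num_boost_round=2)
booster.save_model("model.json")  # feature names/types travel with the model

reloaded = xgboost.Booster(model_file="model.json")
shuffled = X[["f1", "f0"]]  # same columns, wrong order
reloaded.predict(xgboost.DMatrix(shuffled))  # raises: feature_names mismatch
```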
* Use normal predictor for dart booster.
* Implement `inplace_predict` for dart.
* Enable `dart` for dask interface now that it's thread-safe.
* Categorical data should work out of the box for dart now.
The implementation is not very efficient, as it has to pull back the data and
apply the weight for each tree, but it is still a significant improvement over the
previous implementation, since we no longer binary search for each sample.
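A hedged sketch of the paths that now work for `dart`:

```python
import numpy as np
import xgboost

X = np.random.rand(128, 4)
y = np.random.randint(0, 2, 128)
booster = xgboost.train({"booster": "dart", "objective": "binary:logistic"},
                        xgboost.DMatrix(X, label=y), num_boost_round=8)

# Inplace prediction is now available for dart; as with the normal
# predictor, no dropout is applied at prediction time.
preds = booster.inplace_predict(X)
```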
* Fix output prediction shape on dataframe.
* Stop printing out the message.
* Remove R specialization.
The printed message is not really useful anyway: without a reproducible example
there's no way to fix the issue, and if there is a reproducible example we can always
obtain this information with a debugger. Removing the `printf` call also avoids
creating the printf context in the kernel.
* Add a new API function for predicting on `DMatrix`. This function aligns
with the rest of the `XGBoosterPredictFrom*` functions in the semantics of its
arguments.
* Purge `ntree_limit` from libxgboost, use iteration instead (see the sketch after these notes).
* [dask] Use `inplace_predict` by default for dask sklearn models.
* [dask] Run prediction shape inference on worker instead of client.
The breaking change is in the Python sklearn `apply` function, which I made
consistent with the other prediction functions, where `best_iteration` is used by
default.
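A sketch of iteration-based slicing replacing `ntree_limit` at the Python level, using the `iteration_range` parameter:

```python
import numpy as np
import xgboost

X, y = np.random.rand(256, 8), np.random.rand(256)
Xy = xgboost.DMatrix(X, label=y)
booster = xgboost.train({}, Xy, num_boost_round=32)

# Previously: booster.predict(Xy, ntree_limit=16)
preds = booster.predict(Xy, iteration_range=(0, 16))  # first 16 rounds only
```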
* Accept array interface for csr and array (see the sketch after this list).
* Accept an optional proxy DMatrix for metainfo.
This constructs an explicit `_ProxyDMatrix` type in Python.
* Remove unused doc.
* Add strict output.
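A hedged sketch of the CSR path: the Python package describes the three CSR buffers (`indptr`, `indices`, `data`) to the C API as array-interface JSON, so inplace prediction can accept SciPy sparse input directly:

```python
import numpy as np
import scipy.sparse
import xgboost

X = scipy.sparse.random(64, 8, density=0.3, format="csr")
y = np.random.rand(64)
booster = xgboost.train({}, xgboost.DMatrix(X, label=y), num_boost_round=4)

# No intermediate DMatrix is constructed on this path.
preds = booster.inplace_predict(X)
```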