xgboost/demo/c-api/external-memory
Philip Hyunsu Cho 6d8afb2218
[CI] Require C++17 + CMake 3.18; Use CUDA 11.8 in CI (#8853)
* Update to C++17

* Turn off unity build

* Update CMake to 3.18

* Use MSVC 2022 + CUDA 11.8

* Re-create stack for worker images

* Allocate more disk space for Windows

* Temporarily disable clang-tidy

* RAPIDS now requires Python 3.10+

* Unpin cuda-python

* Use latest NCCL

* Use Ubuntu 20.04 in RMM image

* Mark failing mgpu test as xfail
2023-03-01 09:22:24 -08:00

Defining a Custom Data Iterator to Load Data from External Memory

A simple demo of using a custom data iterator with XGBoost. The feature is still experimental and not ready for production use. If you are not familiar with the C API, please read its introduction in our tutorials and visit the basic demo first.

Defining Data Iterator

In the example, we define a custom data iterator with two methods: reset and next. The next method passes a batch of data into XGBoost and tells XGBoost whether the iterator has reached its end; the reset method restarts the iteration from the first batch. One important detail when using the C API for a data iterator: the data passed in by the next method must be kept in memory until next or reset is called again. The external memory DMatrix is not limited to training; it is also valid for other features such as prediction.
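The lifetime rule is easy to get wrong, so here is a minimal sketch of the reset/next pattern in plain C. It does not link against XGBoost. The names BatchIter, BatchView, iter_next, iter_reset and run_pass are all hypothetical, and run_pass only stands in for the library by reading each staged batch between calls. The callbacks take an opaque handle, with next returning non-zero while data remains; this shape is an assumption made for the sketch, so check the C API header for the exact typedefs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { BATCH_ROWS = 4, N_COLS = 2 };

/* Hypothetical view of the batch most recently produced by next(). */
typedef struct {
  const float *data;
  size_t rows;
  size_t cols;
} BatchView;

/* Hypothetical iterator state. The iterator owns buf. */
typedef struct {
  size_t cur;        /* index of the batch to produce next */
  size_t n_batches;  /* total number of batches */
  float *buf;        /* storage backing the staged batch */
  BatchView staged;  /* what the consumer reads between calls */
} BatchIter;

/* Free the previous batch; only called from next() and reset(). */
static void iter_release(BatchIter *it) {
  free(it->buf);
  it->buf = NULL;
  it->staged.data = NULL;
  it->staged.rows = 0;
  it->staged.cols = 0;
}

static void iter_init(BatchIter *it, size_t n_batches) {
  it->cur = 0;
  it->n_batches = n_batches;
  it->buf = NULL;
  it->staged.data = NULL;
  it->staged.rows = 0;
  it->staged.cols = 0;
}

/* reset: drop the staged batch and start over from the first one. */
static void iter_reset(void *handle) {
  BatchIter *it = (BatchIter *)handle;
  assert(it != NULL);
  iter_release(it);
  it->cur = 0;
}

/* next: stage one batch and return 1, or return 0 at the end.
 * The previous batch is released here, not earlier, because the
 * consumer may still read it until this call happens. */
static int iter_next(void *handle) {
  BatchIter *it = (BatchIter *)handle;
  assert(it != NULL);
  iter_release(it);
  if (it->cur == it->n_batches) {
    return 0;
  }
  size_t n = (size_t)BATCH_ROWS * N_COLS;
  it->buf = (float *)malloc(n * sizeof(float));
  if (it->buf == NULL) {
    abort();
  }
  /* Synthetic data: every value in batch k equals k. */
  for (size_t i = 0; i < n; ++i) {
    it->buf[i] = (float)it->cur;
  }
  it->staged.data = it->buf;
  it->staged.rows = BATCH_ROWS;
  it->staged.cols = N_COLS;
  it->cur++;
  return 1;
}

/* Stand-in for the consumer: one full pass over all batches.
 * Returns the sum of all values and reports the rows seen. */
static double run_pass(BatchIter *it, size_t *rows_out) {
  double sum = 0.0;
  size_t rows = 0;
  iter_reset(it);
  while (iter_next(it)) {
    const BatchView *b = &it->staged;
    for (size_t i = 0; i < b->rows * b->cols; ++i) {
      sum += b->data[i];
    }
    rows += b->rows;
  }
  *rows_out = rows;
  return sum;
}
```

To try it, call iter_init(&it, 3) and then run_pass(&it, &rows) from a main function. Calling run_pass twice shows that reset makes the iterator reusable for a second pass, which is what allows the same iterator to serve both training and prediction. In the actual demo, the staged batch would be handed to XGBoost inside next rather than read by a local loop.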