From 58d211545fa2bb4dfb26904042c42641a13dbfb8 Mon Sep 17 00:00:00 2001
From: Nick Becker
Date: Wed, 23 Nov 2022 07:58:28 -0500
Subject: [PATCH] explain cpu/gpu interop and link to model IO tutorial (#8450)

---
 doc/gpu/index.rst | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/doc/gpu/index.rst b/doc/gpu/index.rst
index 4187030c2..716ad0d58 100644
--- a/doc/gpu/index.rst
+++ b/doc/gpu/index.rst
@@ -70,6 +70,12 @@ Working memory is allocated inside the algorithm proportional to the number of r
 If you are getting out-of-memory errors on a big dataset, try the :py:class:`xgboost.QuantileDMatrix` or :doc:`external memory version </tutorials/external_memory>`. Note that when ``external memory`` is used for GPU hist, it's best to employ gradient based sampling as well. Last but not least, ``inplace_predict`` can be preferred over ``predict`` when data is already on GPU. Both ``QuantileDMatrix`` and ``inplace_predict`` are automatically enabled if you are using the scikit-learn interface.
+
+CPU-GPU Interoperability
+========================
+XGBoost models trained on GPUs can be used on CPU-only systems to generate predictions. For information about how to save and load an XGBoost model, see :doc:`/tutorials/saving_model`.
+
+
 Developer notes
 ===============
 The application may be profiled with annotations by specifying USE_NVTX to cmake. Regions covered by the 'Monitor' class in CUDA code will automatically appear in the nsight profiler when `verbosity` is set to 3.
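
Below is a minimal sketch of the workflow the added "CPU-GPU Interoperability" section describes: train a model on a GPU, save it, then load it and generate predictions on a CPU-only machine, using the scikit-learn interface mentioned earlier in the same document. The synthetic data, the ``n_estimators`` value, and the ``model.json`` file name are illustrative assumptions and are not part of the patch.

.. code-block:: python

    # Illustrative sketch (not part of the patch): train on a GPU, then load
    # and predict on a CPU-only machine. The data, file name, and parameter
    # values below are placeholders.
    import numpy as np
    import xgboost as xgb

    X = np.random.rand(1000, 10)
    y = np.random.rand(1000)

    # On the GPU machine: train with the GPU hist tree method and save to JSON.
    gpu_model = xgb.XGBRegressor(tree_method="gpu_hist", n_estimators=10)
    gpu_model.fit(X, y)
    gpu_model.save_model("model.json")

    # On a CPU-only machine: load the same file and predict; no GPU is needed.
    cpu_model = xgb.XGBRegressor()
    cpu_model.load_model("model.json")
    preds = cpu_model.predict(X)

Saving to the JSON (or UBJSON) format, as recommended in the /tutorials/saving_model document the patch links to, is one way to move a model between machines.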