From e9ab4a1c6cbfac828440f9e37bd95d26e8446622 Mon Sep 17 00:00:00 2001
From: Philip Hyunsu Cho
Date: Fri, 23 Nov 2018 04:13:36 -0800
Subject: [PATCH] Address #3933: document limitation of DMLC CSV parser + recommend Pandas (#3934)

---
 doc/python/python_intro.rst | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/doc/python/python_intro.rst b/doc/python/python_intro.rst
index 06ac40292..f9c50da91 100644
--- a/doc/python/python_intro.rst
+++ b/doc/python/python_intro.rst
@@ -48,9 +48,15 @@ The data is stored in a :py:class:`DMatrix ` object.
     dtrain = xgb.DMatrix('train.csv?format=csv&label_column=0')
     dtest = xgb.DMatrix('test.csv?format=csv&label_column=0')
 
-  (Note that XGBoost does not support categorical features; if your data contains
-  categorical features, load it as a NumPy array first and then perform
-  `one-hot encoding `_.)
+  .. note:: Categorical features not supported
+
+    Note that XGBoost does not support categorical features; if your data contains
+    categorical features, load it as a NumPy array first and then perform
+    `one-hot encoding `_.
+
+  .. note:: Use Pandas to load CSV files with headers
+
+    Currently, the DMLC data parser cannot parse CSV files with headers. Use Pandas (see below) to read CSV files with headers.
 
 * To load a NumPy array into :py:class:`DMatrix `:
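
The second note added by this patch recommends Pandas for CSV files that have a header row. As a rough illustration of that workaround (not part of the patch itself), the sketch below reads such a file with `pandas.read_csv`, separates the label column, and passes the result to `xgb.DMatrix`. The helper names `read_csv_with_header` and `to_dmatrix`, the file name `train.csv`, and the column name `label` are hypothetical and chosen only for this example.

```python
import pandas as pd


def read_csv_with_header(path, label_column):
    """Read a CSV file whose first row is a header; return (features, label).

    `label_column` is the *name* of the label column (an assumption of this
    sketch; the URI-based loader in the docs uses a column index instead).
    """
    # header=0 tells pandas that the first row holds the column names.
    frame = pd.read_csv(path, header=0)
    label = frame[label_column]
    features = frame.drop(columns=[label_column])
    return features, label


def to_dmatrix(features, label):
    """Wrap the pandas objects in an xgboost.DMatrix."""
    # Imported here so the pandas-only helper above works without xgboost.
    import xgboost as xgb

    return xgb.DMatrix(features, label=label)
```

Assuming a file `train.csv` with a header row that includes a `label` column, usage would look like `dtrain = to_dmatrix(*read_csv_with_header('train.csv', 'label'))`.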