xgboost/python-package
Julian Quick 2cd109fb98 a more verbose field mismatch error message
This error message can be hard to understand when there are several fields, as shown in the example below. This improves the error message, letting the user know which fields were unexpected or missing.

    import xgboost as xgb
    import pandas as pd
    train = pd.DataFrame({'a':[1], 'b':[2], 'c':[3], 'd':[4], 'f':[2], 'g':2, 'etc etc etc':[11]})
    dtrain = xgb.DMatrix(train.drop('d', axis=1), train.d)
    test = pd.DataFrame({'a':[1], 'b':[2], 'c':[1], 'd':[4], 'e':[2], 'f':[2], 'g':2, 'etc etc etc':[11]})
    dtest = xgb.DMatrix(test)
    modl = xgb.train({}, dtrain)
    modl.predict(dtest)
    
    
    # ValueError: feature_names mismatch: [u'a', u'b', u'c', u'etc etc etc', u'f', u'g'] [u'a', u'b', u'c', u'd', u'e', u'etc etc etc', u'f', u'g']
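
The idea behind the more verbose message can be sketched with set differences. This is a minimal illustration, not the code from the commit: the helper name `describe_feature_mismatch` and the message wording are hypothetical, and it assumes the caller has already detected a mismatch.

```python
def describe_feature_mismatch(expected, actual):
    """Build a message listing fields missing from, or unexpected in, `actual`.

    Hypothetical helper for illustration; it is not part of the xgboost API.
    """
    expected_set, actual_set = set(expected), set(actual)
    missing = sorted(expected_set - actual_set)     # seen in training, absent now
    unexpected = sorted(actual_set - expected_set)  # present now, not in training
    parts = ["feature_names mismatch"]
    if missing:
        parts.append("missing fields: " + ", ".join(missing))
    if unexpected:
        parts.append("unexpected fields: " + ", ".join(unexpected))
    return "; ".join(parts)


# Field names taken from the example above.
train_fields = ['a', 'b', 'c', 'etc etc etc', 'f', 'g']
test_fields = ['a', 'b', 'c', 'd', 'e', 'etc etc etc', 'f', 'g']
message = describe_feature_mismatch(train_fields, test_fields)
print(message)  # feature_names mismatch; unexpected fields: d, e
```

Reporting only the differing names keeps the message short even when the two lists share many fields.
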
2016-03-17 18:13:30 -06:00

XGBoost Python Package
======================

|PyPI version| |PyPI downloads|

Installation
------------

XGBoost is available on `PyPI <https://pypi.python.org/pypi/xgboost>`__.
For the stable version, install using pip:

-  ``pip install xgboost``
-  Note for Windows users: pip installation may not work in some
   Windows environments and may cause unexpected errors. It is
   currently disabled on Windows pending further investigation, so
   please install from GitHub.

For the most up-to-date version, install from GitHub:

-  To build the Python module, run ``./build.sh`` in the root directory
   of the project.
-  Make sure you have
   `setuptools <https://pypi.python.org/pypi/setuptools>`__ installed.
-  Install with ``cd python-package; python setup.py install`` from the
   root directory of the project.
-  Windows users should use the Visual Studio project file under the
   `windows folder <../windows/>`__. See also the `installation
   tutorial <https://www.kaggle.com/c/otto-group-product-classification-challenge/forums/t/13043/run-xgboost-from-windows-and-python>`__
   from the Kaggle Otto forum.
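
After either installation route, a quick way to check that Python can find the package is sketched below. It uses only the standard library and assumes Python 3.4 or later. The helper name ``is_installed`` is hypothetical, and the check only confirms that the module is discoverable, not that the compiled library loads correctly.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be located on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("xgboost"):
    print("xgboost is importable")
else:
    print("xgboost not found; check the installation steps above")
```
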

Examples
--------

-  See the walkthrough example in the `demo
   folder <../demo/guide-python>`__.
-  See also the `example scripts <../demo/kaggle-higgs>`__ for the Kaggle
   Higgs Challenge, including a `speedtest
   script <../demo/kaggle-higgs/speedtest.py>`__ on this dataset.

Note
----

-  To build XGBoost on Mac OS X with multi-threading (OpenMP) support,
   which the clang shipped with Xcode does not provide by default,
   install gcc 4.9 or higher using `homebrew <http://brew.sh/>`__:
   ``brew tap homebrew/versions; brew install gcc49``
-  To run XGBoost in parallel processes using the fork backend of
   joblib/multiprocessing, you must build XGBoost without OpenMP
   support using ``make no_omp=1``. Otherwise, use the forkserver
   (available in Python 3.4) or spawn backend. See the
   `sklearn_parallel.py <../demo/guide-python/sklearn_parallel.py>`__
   demo.
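
As a minimal sketch of the non-fork option, the standard-library ``multiprocessing`` module lets a script request a start method explicitly (Python 3.4 or later). The helper name ``pick_start_method`` is hypothetical and nothing here is specific to XGBoost. Because forkserver is not available on every platform, the helper falls back to spawn.

```python
import multiprocessing as mp


def pick_start_method(preferred=("forkserver", "spawn")):
    """Return the first start method in `preferred` that this platform supports."""
    available = mp.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    raise RuntimeError("no non-fork start method available")


# A context object scopes the choice to this code instead of changing the
# process-wide default start method.
ctx = mp.get_context(pick_start_method())
print(ctx.get_start_method())
```

Worker pools would then be created from ``ctx`` (for example ``ctx.Pool``) inside an ``if __name__ == "__main__":`` guard, which the spawn and forkserver methods need because child processes re-import the main module.
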

.. |PyPI version| image:: https://badge.fury.io/py/xgboost.svg
   :target: http://badge.fury.io/py/xgboost
.. |PyPI downloads| image:: https://img.shields.io/pypi/dm/xgboost.svg
   :target: https://pypi.python.org/pypi/xgboost/