*Fix 1439
*Fix python_wrapper when eval set name contain '-' will cause early_stop maximize variable con't set to True propely
Change-Id: Ib0595afd4ae7b445a84c00a3a8faeccc506c6d13
* add scikit-learn v0.18 compatibility
import KFold & StratifiedKFold from sklearn.model_selection instead of sklearn.cross_validation
* change DeprecationWarning to ImportError
DeprecationWarning isn't an exception, so it should work the other way around.
Currently xgboost can only be installed by running:
python setup.py install
Now it can be packaged (in binary form) as a wheel and installed like:
pip install xgboost-0.6-py2-none-any.whl
distutils and wheel install `data_files` differently than setuptools.
setuptools will install the `data_files` in the package directory whereas the
others install it in `sys.prefix`. By adding `sys.prefix` to the list of
directories to check for the shared library, xgboost can now be distributed as
a wheel.
* force gcc-5 or clang-omp for Mac OS, prepare for pip pack
* add sklearn dep, make -j4
* finalize PyPI submission
* revert to Xcode clang for passing build #1468
* force to clang, try to solve cmake travis error
* remove sklearn dependency
* added new function to calculate other feature importances
* added capability to plot other feature importance measures
* changed plotting default to fscore
* added info on importance_type to boilerplate comment
* updated text of error statement
* added self module name to fix call
* added unit test for feature importances
* style fixes
This error message can be hard to understand when there are several fields, as shown in the example below. This improves the error message, letting the user know which fields were unexpected or missing.
import xgboost as xgb
import pandas as pd
train = pd.DataFrame({'a':[1], 'b':[2], 'c':[3], 'd':[4], 'f':[2], 'g':2, 'etc etc etc':[11]})
dtrain = xgb.DMatrix(train.drop('d', axis=1), train.d)
test = pd.DataFrame({'a':[1], 'b':[2], 'c':[1], 'd':[4], 'e':[2], 'f':[2], 'g':2, 'etc etc etc':[11]})
dtest = xgb.DMatrix(test)
modl = xgb.train({}, dtrain)
modl.predict(dtest)
# ValueError: feature_names mismatch: [u'a', u'b', u'c', u'etc etc etc', u'f', u'g'] [u'a', u'b', u'c', u'd', u'e', u'etc etc etc', u'f', u'g']