2856 Commits

Author SHA1 Message Date
Nan Zhu
b56c6097d9 [jvm-packages] add Spark and XGBoost tutorial (#1649)
* add back train method but mark as deprecated

* add Spark and XGBoost tutorial

* fix scalastyle error
2016-10-11 09:41:24 -07:00
Tianqi Chen
8a7a6dba71 Update .travis.yml 2016-10-09 20:37:57 -07:00
Jonathan Rahn
c8ae52f17a add scikit-learn v0.18 compatibility (#1636)
* add scikit-learn v0.18 compatibility

import KFold & StratifiedKFold from sklearn.model_selection instead of sklearn.cross_validation

* change DeprecationWarning to ImportError

DeprecationWarning isn't an exception, so it should work the other way around.
2016-10-09 20:37:28 -07:00
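A minimal sketch of the import fallback this commit describes, not the actual patch. A failed import raises `ImportError`, so that is the exception to catch. The `SKLEARN_INSTALLED` flag and the `None` placeholders are assumptions added here so the snippet also runs where scikit-learn is absent.

```python
# Version-tolerant import of the cross-validation splitters (illustrative sketch).
try:
    # scikit-learn >= 0.18 provides the splitters in model_selection.
    from sklearn.model_selection import KFold, StratifiedKFold
    SKLEARN_INSTALLED = True
except ImportError:
    try:
        # Older releases kept them in cross_validation.
        from sklearn.cross_validation import KFold, StratifiedKFold
        SKLEARN_INSTALLED = True
    except ImportError:
        # scikit-learn is not available at all; callers should check the flag.
        KFold = StratifiedKFold = None
        SKLEARN_INSTALLED = False
```

Trying the newer module first means current installations never touch the deprecated path.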
Yuan (Terry) Tang
a64fd74421 Fix wrong expected feature types (#1646) 2016-10-08 21:16:29 -07:00
Kirill Sevastyanenko
485b6c86cc rm redundant lines in travis.yml (#1633) 2016-10-08 10:48:58 -07:00
Vadim Khotilovich
f9648ac320 [R-package] store numeric attributes with higher precision (#1628) 2016-10-03 11:01:17 -07:00
Nan Zhu
1673bcbe7e [jvm-packages] separate classification and regression model and integrate with ML package (#1608) 2016-09-30 11:49:03 -04:00
Shengwen Yang
3b9987ca9c Fix the issue 1474 (#1615)
* Fix 1474

* Fix crash issue when saving and loading poisson model

* Rollback the wrong fix
2016-09-29 19:29:47 -07:00
Vadim Khotilovich
3efff6d052 fix for VX (#1614) 2016-09-27 15:19:20 -07:00
Nan Zhu
37bc122c90 [jvm-packages] Robust dmatrix creation (#1613)
* add back train method but mark as deprecated

* robust matrix creation in jvm
2016-09-26 13:35:04 -04:00
phoenixbai
915ac0b8fe the fix of missing value assignment for name_ variable in EvalRankList method (#1558) 2016-09-26 08:57:17 -05:00
Vadim Khotilovich
693ddb860e More robust DMatrix creation from a sparse matrix (#1606)
* [CORE] DMatrix from sparse w/ explicit #col #row; safer arg types

* [python-package] c-api change for _init_from_csr _init_from_csc

* fix spaces

* [R-package] adopt the new XGDMatrixCreateFromCSCEx interface

* [CORE] redirect old sparse creators to new ones
2016-09-25 10:01:22 -07:00
Guido Tapia
e06f6a0df7 Update README.md - added windows binaries (#1600)
Added a link to the nightly windows binaries hosted on Guido Tapia's (my) blog
2016-09-21 23:14:07 -07:00
Guido Tapia
b0bfddba72 Update build.md - added link to nightly windows binaries (#1601)
Apologies for 2 PRs, was easier using githubs interface rather than doing it through git
2016-09-21 23:13:56 -07:00
chanis
62830be29d [python-package] modify libpath.py and fix typos (#1594)
* Update Makefile

* Update Makefile

* modify __init__.py

* modified libpath.py and fixed typos
2016-09-21 10:12:19 -07:00
Vlad Sandulescu
9f8116416b Added KDD Cup 2016 competition (#1596)
merged thanks
2016-09-21 11:47:01 -04:00
reg.zhuce
3ee145b8dc [jvm-packages] IndexOutOfBoundsException (#1589)
ml.dmlc.xgboost4j.scala.spark.XGBoost.scala:51

values is empty the first time we reach it, so values(0) throws an IndexOutOfBoundsException.
It should be dVector.values(i) instead of values(i).
2016-09-20 09:13:47 -04:00
chanis
d8876b0b73 [python-package] modify __init__.py (#1587)
* Update Makefile

* Update Makefile

* modify __init__.py
2016-09-19 09:43:36 -07:00
Manuel Schiller
d3c4d19c91 fix spelling mistake (#1584) 2016-09-18 09:52:01 -07:00
Xin Yin
7245145712 [jvm-packages] Fixed the sanity check for parameter 'nthread' against 'spark.task.cpus'. (#1582) 2016-09-16 11:31:35 -04:00
chanis
4041c39090 fix Makefile (#1579)
* Update Makefile

* Update Makefile
2016-09-15 10:44:49 -07:00
Nan Zhu
4ad648e856 [jvm-packages] predictLeaf with Dataframe (#1576)
* add back train method but mark as deprecated

* predictLeaf with Dataset

* fix

* fix
2016-09-15 06:15:47 -04:00
Nan Zhu
bb388cbb31 default eval func (#1574) 2016-09-14 13:26:16 -04:00
Tong He
4733357278 [R] Monotonic Constraints in Tree Construction (#1557)
* fix cran check

* change required R version because of utils::globalVariables

* temporary commit, monotone not working

* fix test

* fix doc

* fix doc
2016-09-11 22:16:33 -07:00
Nan Zhu
fb02797e2a [jvm-packages] Integration with Spark Dataframe/Dataset (#1559)
* bump up to scala 2.11

* framework of data frame integration

* test consistency between RDD and DataFrame

* order preservation

* test order preservation

* example code and fix makefile

* improve type checking

* improve APIs

* user docs

* work around travis CI's limitation on log length

* adjust test structure

* integrate with Spark 1.x

* spark 2.x integration

* remove spark 1.x implementation but provide instructions on how to downgrade
2016-09-11 15:02:58 -04:00
chanis
7ff742ebf7 Update Makefile (#1566) 2016-09-11 09:48:11 -07:00
Tianqi Chen
c93c9b7ed6 [TREE] Experimental version of monotone constraint (#1516)
* [TREE] Experimental version of monotone constraint

* Allow default detection of monotone option

* loosen the condition of strict check

* Update gbtree.cc
2016-09-07 21:28:43 -07:00
Norbert
8cac37b2b4 Practical XGBoost in Python online course (#1542) 2016-09-06 11:12:56 -07:00
Tianqi Chen
ecec5f7959 [CORE] Refactor cache mechanism (#1540) 2016-09-02 20:39:07 -07:00
Nan Zhu
6dabdd33e3 [jvm-packages] bump to next version (#1535)
* bump to next version

* fix

* fix
2016-09-01 12:18:21 -04:00
闻波
8cdfec71b3 remove a redundant sentence, and a word 'and' (#1526)
* fix a typo

* fix a typo and some code format

* Update training.py

* delete redundant sentence
2016-08-31 11:51:40 -07:00
JohnStott
fd7c3b3543 MS Visual Studio 2015 fix (#1530)
Fixed to work with later versions of Visual Studio, i.e., 2015.

MSVC has its own section for setting compile parameters, so it should not fall through to the section below that checks for C++11, which is definitely already supported. This is not an issue for Visual Studio 2012, but it breaks for later versions of Visual Studio such as 2015, where the default C++ version is 14, even though that is still backward compatible with C++11.
2016-08-31 11:51:16 -07:00
Nan Zhu
7fb3fbf577 impose shuffle when creating training RDD (#1531) 2016-08-31 07:34:10 -04:00
Nan Zhu
3f198b9fef [jvm-packages] allow training with missing values in xgboost-spark (#1525)
* allow training with missing values in xgboost-spark

* fix compilation error

* fix bug
2016-08-29 21:45:49 -04:00
Dex Groves
6014839961 Fix minor typos in parameters.md (#1521) 2016-08-29 09:02:03 -04:00
Nan Zhu
74db1e8867 [jvm-packages] remove APIs with DMatrix from xgboost-spark (#1519)
* test consistency of prediction functions between DMatrix and RDD

* remove APIs with DMatrix from xgboost-spark

* fix compilation error in xgboost4j-example

* fix test cases
2016-08-28 21:25:49 -04:00
Nan Zhu
6d65aae091 [jvm-packages] test consistency of prediction functions with DMatrix and RDD (#1518)
* test consistency of prediction functions between DMatrix and RDD

* fix the failed test cases
2016-08-28 20:27:03 -04:00
Nan Zhu
d7f79255ec improve test of save/load model (#1515) 2016-08-27 17:16:22 -04:00
kiselev1189
53ce511be3 Fix how maximize_metric value is determined in early_stop (#1451)
* Update callback.py

* Update callback.py
2016-08-27 13:09:24 -07:00
Tianqi Chen
df38f251be Fix warnings from g++5 or higher (#1510) 2016-08-26 16:14:10 -07:00
Preston Parry
0627213544 Fixes typo "candicate" (#1508) 2016-08-26 14:00:27 -07:00
Preston Parry
cf4951b0b0 Fixes another typo "candicate" (#1509) 2016-08-26 14:00:23 -07:00
Dan Harbin
78ae772f2c Make python package wheelable (#1500)
Currently xgboost can only be installed by running:

    python setup.py install

Now it can be packaged (in binary form) as a wheel and installed like:

    pip install xgboost-0.6-py2-none-any.whl

distutils and wheel install `data_files` differently than setuptools.
setuptools will install the `data_files` in the package directory whereas the
others install it in `sys.prefix`. By adding `sys.prefix` to the list of
directories to check for the shared library, xgboost can now be distributed as
a wheel.
2016-08-26 14:00:11 -07:00
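A hedged sketch of the lookup idea in this commit message: search both the package directory (where setuptools puts `data_files`) and `sys.prefix` (where distutils and wheel put them). The function names, the `prefix` parameter, and the library file name passed in are hypothetical and are not taken from the actual `libpath.py`.

```python
import os
import sys


def candidate_lib_dirs(package_dir, prefix=None):
    """Return directories to search for the shared library, in priority order."""
    prefix = sys.prefix if prefix is None else prefix
    # setuptools installs data_files next to the package;
    # distutils and wheel install them under sys.prefix.
    return [package_dir, prefix]


def find_lib_paths(lib_name, package_dir, prefix=None):
    """Return every existing path to lib_name among the candidate directories."""
    paths = [os.path.join(d, lib_name) for d in candidate_lib_dirs(package_dir, prefix)]
    return [p for p in paths if os.path.isfile(p)]
```

Checking several locations keeps one lookup routine working for both install layouts.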
Tong He
170b349f3e Fix the "No visible binding" CRAN checks (#1504)
* fix cran check

* change required R version because of utils::globalVariables
2016-08-26 10:24:04 -07:00
Francesco Mosconi
d754ce7dc1 Fixed OpenMP installation on MacOSX with gcc-6 (#1460)
* Fixed OpenMP installation on MacOSX with gcc-6

- Modified makefile from gcc-5 to gcc-6
- Removed deprecated install instructions from doc (gcc-5 was automatically forced if available in makefile on OSX)
2016-08-22 10:30:34 -07:00
Frank
93e85139bc fix #1476 (#1494) 2016-08-20 17:27:57 -07:00
Nan Zhu
dc1125eb56 evaluation with RDD data (#1492) 2016-08-20 18:31:10 -04:00
Nan Zhu
582ee63e34 enable training multiple models by distinguishing stage IDs (#1493) 2016-08-20 16:37:07 -04:00
Vadim Khotilovich
bdfa8c0e09 [R-package] a few fixes for R (#1485)
* [R] fix #1465

* [R] add sanity check to fix #1434

* [R] some clean-ups for custom obj&eval; require maximize only for early stopping
2016-08-20 05:09:03 -05:00
Tong He
b8e6551734 Add unittest for garbage collection's safety in R (#1490)
* Add test for garbage collection safety
2016-08-19 16:55:03 -07:00