2907 Commits

Author SHA1 Message Date
xgdgsc
27ca50e2c2 change contribution link to open issues (#1834) 2016-12-02 11:03:03 -08:00
ccphillippi
dd477ac903 Move feature_importances_ to base XGBModel for XGBRegressor access (#1591) 2016-12-01 10:17:37 -08:00
AbdealiJK
6f16f0ef58 Use bst_float consistently throughout (#1824)
* Fix various typos

* Add override to functions that are overridden

gcc gives warnings about functions that are being overridden by not
being marked as oveirridden. This fixes it.

* Use bst_float consistently

Use bst_float for all the variables that involve weight,
leaf value, gradient, hessian, gain, loss_chg, predictions,
base_margin, feature values.

In some cases, when due to additions and so on the value can
take a larger value, double is used.

This ensures that type conversions are minimal and reduces loss of
precision.
2016-11-30 10:02:10 -08:00
Dr. Kashif Rasul
da2556f58a fixed some typos (#1814) 2016-11-25 16:34:57 -05:00
RAMitchell
be2f28ec08 Update build instructions, improve memory usage (#1811) 2016-11-25 09:43:22 -08:00
Yuan (Terry) Tang
80c8515457 Bump up the date of R package (#1813) 2016-11-25 03:20:18 -05:00
Jivan Roquet
0c19d4b029 [python-package] Provide a learning_rates parameter to xgb.cv() (#1770)
* Allow using learning_rates parameter when doing CV

- Create a new `callback_cv` method working when called from `xgb.cv()`
- Rename existing `callback` into `callback_train` and make it the default callback
- Get the logic out of the callbacks and place it into a common helper

* Add a learning_rates parameter to cv()

* lint

* remove caller explicit reference

* callback is aware of its calling context

* remove caller argument

* remove learning_rates param

* restore learning_rates for training, but deprecated

* lint

* lint line too long

* quick example for predefined callbacks
2016-11-24 09:49:07 -08:00
Alexey Grigorev
80e70c56b9 [jvm-packages] xgboost4j: publishing sources along with bins (#1797)
* xgboost4j: publishing sources along with bins

* description about building maven artifacts

* publishing scala source to local m2 as well
2016-11-21 15:02:57 -05:00
Ruimin Wang
d80cec3384 [jvm-pacakges] the first parameter in getModelDump should be featuremap path not model path (#1788)
* fix the model dump in xgboost4j example

* Modify the dump model part of scala version

* add the forgotten modelInfos
2016-11-21 08:52:26 -05:00
AbdealiJK
97371ff7e5 c_api.cc: Bring back silent argument (#1794)
In ecb3a271bed151252fb048528ce5a90ad75bb68f the silent argument
in XGDMatrixCreateFromFile of c_api.cc was always overridden to
be false. This disabled the functionality to hide log messages.

This commit reverts that part to enable the hiding of log messages.
2016-11-20 22:04:36 -08:00
Nan Zhu
965091c4bb [jvm-packages] update methods in test cases to be consistent (#1780)
* add back train method but mark as deprecated

* fix scalastyle error

* change class to object in examples

* fix compilation error

* update methods in test cases to be consistent

* add blank lines

* fix
2016-11-20 22:49:18 -05:00
XianXing Zhang
ce708c8e7f [jvm-packages] Leverage the Spark ml API to read DataFrame from files in LibSVM format. (#1785) 2016-11-20 21:28:03 -05:00
Yuan (Terry) Tang
ca0069b708 Fix typo - eval_metric in param should be dictionary (#1791) 2016-11-20 18:52:41 -06:00
Yuan (Terry) Tang
090b37e85d Bumped up err assert in glm test (#1792) 2016-11-20 18:23:19 -06:00
Nan Zhu
5217e53156 stylistic fix (#1789)
* stylistic fix

* try multiple repos

* fix

* fix
2016-11-19 22:03:10 -05:00
Tianqi Chen
060a0ac396 Update setup.sh 2016-11-19 17:57:47 -08:00
Tianqi Chen
aa841ee58d Update setup.sh 2016-11-19 17:56:36 -08:00
baderbuddy
c52b2faba4 Added license information (#1783)
Added license information to the setup.py
2016-11-17 13:36:47 -08:00
Tony DiFranco
f11f2bd5fd add default to poisson -> max_delta_step to enable loading/saving/dumping of model (#1781) 2016-11-16 14:25:00 -08:00
Simon DENEL
58aa1129ea Fixing a few typos (#1771)
* Fixing a few typos

* Fixing a few typos
2016-11-13 15:47:52 -08:00
Richard Wong
b9a9d2bf45 Style fixes in Python documentation. (#1764) 2016-11-11 09:26:28 -08:00
Luckick
0ccb9b87d0 Typo Problem (#1759)
cross validation
2016-11-10 13:55:09 -08:00
Tianqi Chen
2fb19eb448 Add appveyor badge 2016-11-10 12:49:33 -08:00
Zhongxiao Ma
55bfc29942 keep builtin evaluations while using customized evaluation function (#1624)
* keep builtin evaluations while using customized evaluation function

* fix concat bytes to str
2016-11-10 12:40:48 -08:00
Morten Hustveit
8b9d9669bb Have ConsoleLogger log to stderr instead of stdout (#1714)
On Unix systems, it's common for programs to read their input from stdin, and
write their output to stdout.  Messages should be written to stderr, where they
won't corrupt a program's output, and where they can be seen by the user even
if the output is being redirected.

This is mostly a problem when XGBoost is being used from Python or from another
program.
2016-11-10 12:39:52 -08:00
wl2776
6b5a23ccd5 fix build in MSVC 2013 (#1757) 2016-11-10 12:34:30 -08:00
RAMitchell
e3a7f85f15 GPU plug-in improvements + basic Windows continuous integration (#1752)
* GPU Plugin: Reduce memory, improve performance, fix gcc compiler bug, add
out of memory exceptions

* Add basic Windows continuous integration for cmake VS2013, VS2015
2016-11-10 12:34:09 -08:00
joandre
91b75f9b41 Fix a small typo in GeneralParams class. Change customEval parameter name from "custom_obj" to "custom_eval". (#1741) 2016-11-06 12:44:49 -05:00
Tony DiFranco
2ad0948444 Tweedie Regression Post-Rebase (#1737)
* add support for tweedie regression

* added back readme line that was accidentally deleted

* fixed linting errors

* add support for tweedie regression

* added back readme line that was accidentally deleted

* fixed linting errors

* rebased with upstream master and added R example

* changed parameter name to tweedie_variance_power

* linting error fix

* refactored tweedie-nloglik metric to be more like the other parameterized metrics

* added upper and lower bound check to tweedie metric

* add support for tweedie regression

* added back readme line that was accidentally deleted

* fixed linting errors

* added upper and lower bound check to tweedie metric

* added back readme line that was accidentally deleted

* rebased with upstream master and added R example

* rebased again on top of upstream master

* linting error fix

* added upper and lower bound check to tweedie metric

* rebased with master

* lint fix

* removed whitespace at end of line 186 - elementwise_metric.cc
2016-11-05 17:02:32 -07:00
AbdealiJK
52b9867be5 Add docs fro update_seq (#1735)
* Fix typos and messages in docs

* parameter.md: Add docs for updater_seq

Mention the updater_seq parameter which sets the order of the tree
updaters to run and also specifies which ones to run. This can be
useful when pruning is not required or even a custom plugin is
being built along with xgboost.
2016-11-04 16:07:29 -07:00
AbdealiJK
b94fcab4dc Add dump_format=json option (#1726)
* Add format to the params accepted by DumpModel

Currently, only the test format is supported when trying to dump
a model. The plan is to add more such formats like JSON which are
easy to read and/or parse by machines. And to make the interface
for this even more generic to allow other formats to be added.

Hence, we make some modifications to make these function generic
and accept a new parameter "format" which signifies the format of
the dump to be created.

* Fix typos and errors in docs

* plugin: Mention all the register macros available

Document the register macros currently available to the plugin
writers so they know what exactly can be extended using hooks.

* sparce_page_source: Use same arg name in .h and .cc

* gbm: Add JSON dump

The dump_format argument can be used to specify what type
of dump file should be created. Add functionality to dump
gblinear and gbtree into a JSON file.

The JSON file has an array, each item is a JSON object for the tree.
For gblinear:
 - The item is the bias and weights vectors
For gbtree:
 - The item is the root node. The root node has a attribute "children"
   which holds the children nodes. This happens recursively.

* core.py: Add arg dump_format for get_dump()
2016-11-04 09:55:25 -07:00
Alireza Bagheri Garakani
9c693f0f5f scale_pos_weight default value (#1712)
Should say 1 (not 0)
2016-11-03 12:52:26 -07:00
David Lichtenberg
8156b71912 Typo is OSX installation instructions (#1718)
The `cd ..;` in the one liner takes you up a directory instead of into the xgboost directory. This will cause that step of the installation to fail. It seems like you are meant to enter the xgboost directory as you did in the instructions for installing xgboost without openmp.
2016-11-03 12:52:16 -07:00
AbdealiJK
378eb7d7c8 Fix typos and messages in docs (#1723) 2016-10-30 22:52:19 -07:00
Nan Zhu
6082184cd1 [jvm-packages] update API docs (#1713)
* add back train method but mark as deprecated

* fix scalastyle error

* update java doc

* update
2016-10-27 18:53:22 -07:00
Nan Zhu
d321375df5 [jvm-packages] Fix mis configure of nthread (#1709)
* add back train method but mark as deprecated

* fix scalastyle error

* change class to object in examples

* fix compilation error

* fix mis configuration
2016-10-27 12:10:35 -04:00
Nan Zhu
f12074d355 [jvm-packages] release blog (#1706) 2016-10-26 21:35:42 -04:00
Nan Zhu
f801c22710 [jvm-packages] change class to object in examples (#1703)
* change class to object in examples

* fix compilation error
2016-10-26 14:54:56 -04:00
Nan Zhu
016ab89484 [jvm-packages] Parameter tuning tool for XGBoost (#1664) 2016-10-23 16:58:18 -04:00
RAMitchell
ac41845d4b Add GPU accelerated tree construction plugin (#1679) 2016-10-20 20:14:47 -07:00
Eric Liu
9b2e41340b make DMatrix._init_from_npy2d only copy data when necessary (#1637)
* make DMatrix._init_from_npy2d only copy data when necessary

When creating DMatrix from a 2d ndarray, it can unnecessarily copy the input data. This can be problematic when the data is already very large--running out of memory. The copy is temporary (going out of scope at the end of this function) but it still adds to peak memory usage.

``numpy.array`` copies its input no matter what by default. By adding ``copy=False``, it will only do so when necessary. Since XGDMatrixCreateFromMat is readonly on the input buffer, this copy is not needed.

Also added comments explaining when a copy can happen (if data ordering/layout is wrong or if type is not 32-bit float).

* remove whitespace
2016-10-20 09:30:52 -07:00
Jan Gorecki
e79a803a30 simplify installation of R pkg devel version (#1653) 2016-10-18 10:24:01 -07:00
Liam Huang
001d8c4023 correct CalcDCG in rank_metric.cc and rank_obj.cc (#1642)
* correct CalcDCG in rank_metric.cc

DCG use log base-2, however `std::log` returns log base-e.

* correct CalcDCG in rank_obj.cc

DCG use log base-2, however `std::log` returns log base-e.

* use std::log2 instead of std::log

 make it more elegant

* use std::log2 instead of std::log

make it more elegant
2016-10-18 10:23:41 -07:00
ziguang1216
94a9e3222e [python-package] Fix the issue #1439 (#1666)
*Fix 1439
        *Fix python_wrapper when eval set name contain '-' will cause early_stop maximize variable con't set to True propely

Change-Id: Ib0595afd4ae7b445a84c00a3a8faeccc506c6d13
2016-10-18 10:22:51 -07:00
EQGM
d3fc815b45 fix the problem that there is no libxgboost.dll (#1674)
fix the problem that there is no libxgboost.dll built with Visual Studio.
2016-10-18 09:56:48 -07:00
saihttam
4b9d488387 Add option on OSX to use macports (#1675) 2016-10-18 09:56:00 -07:00
Adam Pocock
445029bb82 [jvm-packages] XGBoost4j Windows fixes (#1639)
* Changes for Mingw64 compilation to ensure long is a consistent size.

Mainly impacts the Java API which would not compile, but there may be
silent errors on Windows with large datasets before this patch (as long
is 32-bits when compiled with mingw64 even in 64-bit mode).

* Adding ifdefs to ensure it still compiles on MacOS

* Makefile and create_jni.bat changes for Windows.

* Switching XGDMatrixCreateFromCSREx JNI call to use size_t cast

* Fixing lint error, adding profile switching to jvm-packages build to make create-jni.bat get called, adding myself to Contributors.Md
2016-10-18 08:35:25 -04:00
Jiading Gai
be90deb9b6 Fix a bug to handle Executable and Library with same name (xgboost) correctly. (#1669)
add_library(libxgboost SHARED ${SOURCES}) builds a library named
liblibxgboost.so; However, simply changing it to add_library(xgboost ...)
won't work, as add_executable(xgboost ...) and add_library(xgbboost ...)
will then have the same target name. This patch correctly handles the
same-name situation through SET_TARGET_PROPERTIES.
2016-10-15 18:29:40 -07:00
Nan Zhu
f5c776f64f [jvm-packages] add apache maven repo url and bump up default spark version to 2.0.1 (#1650)
* add apache maven repo url and bump up default spark version to 2.0.1
2016-10-13 08:55:03 -04:00
Nan Zhu
813a53882a [jvm-packages] deprecate Flaky test (#1662)
* deprecate flaky test
2016-10-13 07:21:24 -04:00