577 Commits

Author SHA1 Message Date
AbdealiJK
6f16f0ef58 Use bst_float consistently throughout (#1824)
* Fix various typos

* Add override to functions that are overridden

gcc warns about functions that override a base-class function but are
not marked as override. This fixes it.

* Use bst_float consistently

Use bst_float for all the variables that involve weight,
leaf value, gradient, hessian, gain, loss_chg, predictions,
base_margin, feature values.

In some cases, where additions and similar operations can make the
value grow larger, double is used.

This ensures that type conversions are minimal and reduces loss of
precision.
2016-11-30 10:02:10 -08:00
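The note in commit 6f16f0ef58 about using double where additions can make a value grow can be illustrated with a small sketch. This is not code from the commit: the helper names and the `gradients` list are hypothetical, and 32-bit arithmetic is emulated in Python by rounding through `struct` after every addition.

```python
import struct

def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_in_float32(values):
    """Accumulate in single precision: round after every addition."""
    total = 0.0
    for v in values:
        total = to_float32(total + v)
    return total

def sum_in_double(values):
    """Accumulate in double precision, as a wider accumulator would."""
    total = 0.0
    for v in values:
        total += v
    return total

# Hypothetical stand-in for per-row values stored as 32-bit floats.
gradients = [to_float32(0.1)] * 100_000
exact = gradients[0] * len(gradients)

err_float32 = abs(sum_in_float32(gradients) - exact)
err_double = abs(sum_in_double(gradients) - exact)
print(f"float32 accumulator error: {err_float32:.6f}")
print(f"double accumulator error:  {err_double:.6f}")
```

The individual values stay in single precision in both cases; only the accumulator type changes, which is the trade-off the commit message describes.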
RAMitchell
be2f28ec08 Update build instructions, improve memory usage (#1811) 2016-11-25 09:43:22 -08:00
AbdealiJK
97371ff7e5 c_api.cc: Bring back silent argument (#1794)
In ecb3a271bed151252fb048528ce5a90ad75bb68f the silent argument
in XGDMatrixCreateFromFile of c_api.cc was always overridden to
be false. This disabled the functionality to hide log messages.

This commit reverts that part to enable the hiding of log messages.
2016-11-20 22:04:36 -08:00
Tony DiFranco
f11f2bd5fd add default to poisson -> max_delta_step to enable loading/saving/dumping of model (#1781) 2016-11-16 14:25:00 -08:00
Simon DENEL
58aa1129ea Fixing a few typos (#1771)
* Fixing a few typos
2016-11-13 15:47:52 -08:00
Morten Hustveit
8b9d9669bb Have ConsoleLogger log to stderr instead of stdout (#1714)
On Unix systems, it's common for programs to read their input from stdin, and
write their output to stdout.  Messages should be written to stderr, where they
won't corrupt a program's output, and where they can be seen by the user even
if the output is being redirected.

This is mostly a problem when XGBoost is being used from Python or from another
program.
2016-11-10 12:39:52 -08:00
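The reasoning in commit 8b9d9669bb can be sketched in a few lines. This is an illustration of the stdout/stderr split in general, not XGBoost's ConsoleLogger; `predict_and_report` and its parameters are hypothetical names.

```python
import io
import sys

def predict_and_report(values, out=None, log=None):
    """Write one result per line to `out` and diagnostics to `log`."""
    out = sys.stdout if out is None else out
    log = sys.stderr if log is None else log
    print(f"[info] scoring {len(values)} rows", file=log)
    for v in values:
        print(v * 2, file=out)

# Capture both streams to show that they stay separate.
captured_out, captured_log = io.StringIO(), io.StringIO()
predict_and_report([1, 2, 3], out=captured_out, log=captured_log)
```

With the default streams, redirecting stdout to a file or a pipe would carry only the results, while the `[info]` line would still reach the terminal through stderr.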
wl2776
6b5a23ccd5 fix build in MSVC 2013 (#1757) 2016-11-10 12:34:30 -08:00
Tony DiFranco
2ad0948444 Tweedie Regression Post-Rebase (#1737)
* add support for tweedie regression

* added back readme line that was accidentally deleted

* fixed linting errors

* rebased with upstream master and added R example

* changed parameter name to tweedie_variance_power

* linting error fix

* refactored tweedie-nloglik metric to be more like the other parameterized metrics

* added upper and lower bound check to tweedie metric

* rebased again on top of upstream master

* rebased with master

* lint fix

* removed whitespace at end of line 186 - elementwise_metric.cc
2016-11-05 17:02:32 -07:00
AbdealiJK
b94fcab4dc Add dump_format=json option (#1726)
* Add format to the params accepted by DumpModel

Currently, only the text format is supported when trying to dump
a model. The plan is to add more formats, such as JSON, which are
easy to read and/or parse by machines, and to make the interface
generic enough to allow other formats to be added.

Hence, we modify these functions to make them generic
and accept a new parameter "format" which signifies the format of
the dump to be created.

* Fix typos and errors in docs

* plugin: Mention all the register macros available

Document the register macros currently available to the plugin
writers so they know what exactly can be extended using hooks.

* sparse_page_source: Use same arg name in .h and .cc

* gbm: Add JSON dump

The dump_format argument can be used to specify what type
of dump file should be created. Add functionality to dump
gblinear and gbtree into a JSON file.

The JSON file contains an array; each item is a JSON object for the tree.
For gblinear:
 - The item is the bias and weights vectors
For gbtree:
 - The item is the root node. The root node has an attribute "children"
   which holds the child nodes. This happens recursively.

* core.py: Add arg dump_format for get_dump()
2016-11-04 09:55:25 -07:00
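The recursive gbtree layout described in commit b94fcab4dc can be walked with a short recursive function. The sketch below uses a hand-written JSON string rather than a real dump: only the "children" attribute comes from the commit message, while "split", "threshold" and "leaf" are hypothetical field names chosen for illustration.

```python
import json

# Hand-written example of a dump holding one tree. Only "children" is taken
# from the commit message; the other field names are assumptions.
dump = """
[
  {"split": "f0", "threshold": 0.5, "children": [
    {"leaf": -0.2},
    {"split": "f1", "threshold": 1.5, "children": [
      {"leaf": 0.1},
      {"leaf": 0.4}
    ]}
  ]}
]
"""

def count_leaves(node):
    """Recursively count nodes that have no "children" attribute."""
    children = node.get("children")
    if not children:
        return 1
    return sum(count_leaves(child) for child in children)

def depth(node):
    """Return the number of edges on the longest root-to-leaf path."""
    children = node.get("children")
    if not children:
        return 0
    return 1 + max(depth(child) for child in children)

trees = json.loads(dump)  # the top level is an array, one item per tree
leaf_counts = [count_leaves(root) for root in trees]
depths = [depth(root) for root in trees]
```

Because every node carries its own children, a consumer needs no node-id lookups to traverse a tree, which is what makes this format easy to parse by machines.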
AbdealiJK
378eb7d7c8 Fix typos and messages in docs (#1723) 2016-10-30 22:52:19 -07:00
RAMitchell
ac41845d4b Add GPU accelerated tree construction plugin (#1679) 2016-10-20 20:14:47 -07:00
Liam Huang
001d8c4023 correct CalcDCG in rank_metric.cc and rank_obj.cc (#1642)
* correct CalcDCG in rank_metric.cc

DCG uses log base 2, but `std::log` returns the natural logarithm (base e).

* correct CalcDCG in rank_obj.cc

DCG uses log base 2, but `std::log` returns the natural logarithm (base e).

* use std::log2 instead of std::log

make it more elegant
2016-10-18 10:23:41 -07:00
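The effect of the logarithm base in commit 001d8c4023 can be checked numerically. The sketch below uses the common DCG form with gain 2^rel - 1 and a log discount; that exact form is an assumption, since the commit message only states that DCG uses log base 2. Python's `math.log2` and `math.log` stand in for `std::log2` and `std::log`.

```python
import math

def dcg(relevances, log=math.log2):
    """Discounted cumulative gain with gain 2**rel - 1 and a log discount."""
    return sum(
        (2 ** rel - 1) / log(rank + 2)
        for rank, rel in enumerate(relevances)
    )

labels = [3, 2, 3, 0, 1, 2]            # hypothetical relevance labels
dcg_base2 = dcg(labels)                # discount uses log base 2
dcg_natural = dcg(labels, math.log)    # discount uses the natural log

print(f"DCG with log2: {dcg_base2:.4f}")
print(f"DCG with ln:   {dcg_natural:.4f}")
```

Since log2(x) = ln(x) / ln(2), the natural-log variant is larger than the base-2 value by a constant factor of 1 / ln(2), so raw DCG scores computed with the wrong base are inflated.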
Shengwen Yang
3b9987ca9c Fix the issue 1474 (#1615)
* Fix 1474

* Fix crash issue when saving and loading poisson model

* Rollback the wrong fix
2016-09-29 19:29:47 -07:00
Vadim Khotilovich
3efff6d052 fix for VX (#1614) 2016-09-27 15:19:20 -07:00
phoenixbai
915ac0b8fe Fix the missing value assignment for the name_ variable in the EvalRankList method (#1558) 2016-09-26 08:57:17 -05:00
Vadim Khotilovich
693ddb860e More robust DMatrix creation from a sparse matrix (#1606)
* [CORE] DMatrix from sparse w/ explicit #col #row; safer arg types

* [python-package] c-api change for _init_from_csr _init_from_csc

* fix spaces

* [R-package] adopt the new XGDMatrixCreateFromCSCEx interface

* [CORE] redirect old sparse creators to new ones
2016-09-25 10:01:22 -07:00
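Commit 693ddb860e does not say why explicit row and column counts make sparse creation more robust. One plausible reason, offered here as an assumption, is that a column count inferred from the stored indices undercounts when trailing columns are empty. The sketch below shows this with plain CSR arrays; `infer_num_col` and `csr_to_dense` are hypothetical helpers, not XGBoost functions.

```python
def infer_num_col(indices):
    """Guess the column count from the largest column index seen."""
    return max(indices) + 1 if indices else 0

def csr_to_dense(indptr, indices, data, num_col=None):
    """Expand CSR arrays into a list of dense rows.

    When num_col is None the width is inferred from the data, which
    undercounts if the trailing columns hold no stored entries.
    """
    if num_col is None:
        num_col = infer_num_col(indices)
    rows = []
    for r in range(len(indptr) - 1):
        row = [0.0] * num_col
        for k in range(indptr[r], indptr[r + 1]):
            row[indices[k]] = data[k]
        rows.append(row)
    return rows

# A 2 x 5 matrix whose last two columns are entirely empty.
indptr = [0, 2, 3]
indices = [0, 2, 1]
data = [1.0, 2.0, 3.0]

inferred = csr_to_dense(indptr, indices, data)             # width guessed as 3
explicit = csr_to_dense(indptr, indices, data, num_col=5)  # width stated as 5
```

A model trained on five features would see a shape mismatch with the inferred three-column matrix, whereas the explicit count preserves the intended shape.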
Tianqi Chen
c93c9b7ed6 [TREE] Experimental version of monotone constraint (#1516)
* [TREE] Experimental version of monotone constraint

* Allow default detection of monotone option

* loosen the condition of strict check

* Update gbtree.cc
2016-09-07 21:28:43 -07:00
Tianqi Chen
ecec5f7959 [CORE] Refactor cache mechanism (#1540) 2016-09-02 20:39:07 -07:00
Tianqi Chen
df38f251be Fix warnings from g++5 or higher (#1510) 2016-08-26 16:14:10 -07:00
Vadim Khotilovich
75f401481f no exception throwing within omp parallel; set nthread in Learner (#1421) 2016-07-29 10:08:03 -07:00
Shengwen Yang
7089301b62 Metrics for gamma regression (#1369)
* Add deviance metric for gamma regression

* Simplify the computation of nloglik for gamma regression

* Add a description for gamma-deviance

* Minor fix
2016-07-18 09:10:44 -05:00
anpark
0e61c514a7 fix duplicate loop over output_group when predict (#1342)
* fix sparse page source meta info empty when load from dmatrix

* fix duplicate loop over output_group when predict
2016-07-13 10:03:10 -07:00
anpark
3f32b3f0eb fix sparse page source meta info empty when load from dmatrix (#1336) 2016-07-07 21:17:35 -07:00
Shengwen Yang
77d17f6264 Add support for Gamma regression (#1258)
* Add support for Gamma regression

* Use base_score to replace the lp_bias

* Remove the lp_bias config block

* Add a demo for running gamma regression in Python

* Typo fix

* Revise the description for objective

* Add a script to generate the autoclaims dataset
2016-07-06 10:22:46 -07:00
RAMitchell
93196eb811 cmake build system (#1314)
* Changed c api to compile under MSVC

* Include functional.h header for MSVC

* Add cmake build
2016-07-02 19:07:35 -07:00
Frank
3b73824842 Fix ambiguous call to abs (C or C++). (#1308) 2016-06-29 14:28:28 -07:00
Yoshinori Nakano
7cfeb5f012 fix Dart::NormalizeTrees (#1265) 2016-06-09 15:28:24 -07:00
Yoshinori Nakano
949d1e3027 add Dart booster (#1220) 2016-06-08 14:04:01 -07:00
Shengwen Yang
e034fdf74c Fix issue #1236: cli_main crashes when dumping count:poisson model (#1253) 2016-06-07 21:52:47 -07:00
Vadim Khotilovich
9a48a40cf1 Fixes for multiple and default metric (#1239)
* fix multiple evaluation metrics

* create DefaultEvalMetric only when really necessary

* py test for #1239

* make travis happy
2016-06-04 22:17:35 -07:00
Zhongliang Li
1dde863c98 fix cli_main crashes when using count:poisson regression 2016-05-26 10:03:29 -07:00
yuanbowen
5898f1c59e [DATA] fix instance weights loading 2016-05-23 18:40:41 +08:00
Nan Zhu
c85b9012c6 [jvm-packages] xgboost4j-spark external memory (#1219)
* implement external memory support for XGBoost4J

* remove extra space

* enable external memory for prediction

* update doc
2016-05-22 14:01:28 -04:00
tqchen
d816208797 [DATA] fix async data writing 2016-05-21 18:46:36 -07:00
Vadim Khotilovich
185fef3fce fixes for lint 2016-05-15 02:35:37 -05:00
Vadim Khotilovich
ea9285dd4f methods to delete an attribute and get names of available attributes 2016-05-14 18:19:18 -05:00
Vadim Khotilovich
24e3c5773e Merge branch 'master' into seed_in_configure 2016-04-26 22:47:01 -05:00
Vadim Khotilovich
811c6ef58b obey the lint 2016-04-26 22:11:19 -05:00
Vadim Khotilovich
3e0732dea9 in Configure, set random seed only for uninitialized model 2016-04-26 02:03:22 -05:00
Vadim Khotilovich
0527b17c9d avoid collecting duplicate parameters in Booster::cfg_ 2016-04-25 22:08:53 -05:00
Vadim Khotilovich
1160d0bf25 ability to specify threshold for the error metric 2016-04-25 01:29:04 -05:00
Wojciech Migda
6a5eb47789 XGBoosterCreate api unified to use const DMatrix[] argument 2016-03-26 19:42:58 +01:00
tqchen
a2714fe052 [METHOD], add tree method option to prefer faster algo 2016-03-13 12:24:47 -07:00
tqchen
59d59a968d Fix continue training in CLI 2016-03-10 19:39:09 -08:00
tqchen
ec2fb5bc48 Fix multi-class loading 2016-03-10 19:22:26 -08:00
tqchen
96b17971ac Fix continue training in CLI 2016-03-10 12:43:25 -08:00
tqchen
86871d4be9 [JVM] Add Iterator loading API 2016-03-04 17:37:46 -08:00
tqchen
0df2ed80c8 [JVM] Make JVM Serializable 2016-03-03 21:04:02 -08:00
tqchen
e80d3db64b [DIST] Enable multiple thread and tracker, make rabit and xgboost more thread-safe by using thread local variables. 2016-03-03 20:36:14 -08:00
tqchen
ecb3a271be [PYTHON-DIST] Distributed xgboost python training API. 2016-02-29 16:54:13 -08:00