147 Commits

Author SHA1 Message Date
Simon DENEL
7078c41dad Changing omp_get_num_threads to omp_get_max_threads (#1831)
* Updating dmlc-core

* Changing omp_get_num_threads to omp_get_max_threads
2016-12-04 11:26:45 -08:00
Vadim Khotilovich
a44032d095 [CORE] The update process for a tree model, and its application to feature importance (#1670)
* [CORE] allow updating trees in an existing model

* [CORE] in refresh updater, allow keeping old leaf values and update stats only

* [R-package] xgb.train mod to allow updating trees in an existing model

* [R-package] added check for nrounds when is_update

* [CORE] merge parameter declaration changes; unify their code style

* [CORE] move the update-process trees initialization to Configure; rename default process_type to 'default'; fix the trees and trees_to_update sizes comparison check

* [R-package] unit tests for the update process type

* [DOC] documentation for process_type parameter; improved docs for updater, Gamma and Tweedie; added some parameter aliases; fixed metrics indentation and documented some previously undocumented metrics

* fix my sloppy merge conflict resolutions

* [CORE] add a TreeProcessType enum

* whitespace fix
2016-12-04 09:33:52 -08:00
AbdealiJK
6f16f0ef58 Use bst_float consistently throughout (#1824)
* Fix various typos

* Add override to functions that are overridden

gcc warns about functions that override a base-class function but are
not marked as override. This fixes it.

* Use bst_float consistently

Use bst_float for all the variables that involve weight,
leaf value, gradient, hessian, gain, loss_chg, predictions,
base_margin, feature values.

In some cases, where additions and similar operations can make the
value grow larger, double is used.

This ensures that type conversions are minimal and reduces loss of
precision.
2016-11-30 10:02:10 -08:00
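The reasoning in the commit above (single precision for stored values, double where sums can grow) can be illustrated with a minimal sketch. This is not code from the commit: the helper `to_float32`, the value 0.1, and the iteration count are assumptions chosen for illustration, and Python's `struct` module is used only to simulate 32-bit rounding.

```python
import struct

def to_float32(x):
    """Round a Python float (a double) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

N = 100_000                  # illustrative number of additions (assumption)
value = to_float32(0.1)      # a single-precision input, e.g. a gradient

acc32 = 0.0                  # accumulator rounded to float32 after every add
acc64 = 0.0                  # accumulator kept in double precision
for _ in range(N):
    acc32 = to_float32(acc32 + value)
    acc64 += value

# The product below is exactly representable in a double, so it serves
# as the reference for the true sum of the float32 inputs.
reference = value * N
err32 = abs(acc32 - reference)
err64 = abs(acc64 - reference)

print("float32 accumulator error:", err32)
print("double accumulator error: ", err64)
```

The single-precision accumulator drifts visibly from the reference, while the double accumulator does not, which is consistent with keeping per-value storage in a 32-bit float and widening only the running totals.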
RAMitchell
be2f28ec08 Update build instructions, improve memory usage (#1811) 2016-11-25 09:43:22 -08:00
Simon DENEL
58aa1129ea Fixing a few typos (#1771)
* Fixing a few typos

* Fixing a few typos
2016-11-13 15:47:52 -08:00
AbdealiJK
b94fcab4dc Add dump_format=json option (#1726)
* Add format to the params accepted by DumpModel

Currently, only the text format is supported when trying to dump
a model. The plan is to add more formats, such as JSON, that are
easy for machines to read and/or parse, and to make the interface
generic enough to allow other formats to be added.

Hence, we make some modifications to make these functions generic
and accept a new parameter "format" which signifies the format of
the dump to be created.

* Fix typos and errors in docs

* plugin: Mention all the register macros available

Document the register macros currently available to the plugin
writers so they know what exactly can be extended using hooks.

* sparse_page_source: Use same arg name in .h and .cc

* gbm: Add JSON dump

The dump_format argument can be used to specify what type
of dump file should be created. Add functionality to dump
gblinear and gbtree into a JSON file.

The JSON file has an array, each item is a JSON object for the tree.
For gblinear:
 - The item is the bias and weights vectors
For gbtree:
 - The item is the root node. The root node has an attribute "children"
   which holds the child nodes. This happens recursively.

* core.py: Add arg dump_format for get_dump()
2016-11-04 09:55:25 -07:00
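The JSON layout described in the commit above (an array with one object per tree, where a gbtree item is the root node and nests its child nodes under "children") can be walked recursively. The following is a minimal sketch against a hand-written sample, not output from the library: only the "children" key comes from the commit message, while the "nodeid" and "leaf" keys and the sample values are assumptions made for illustration.

```python
import json

# Hypothetical dump: a JSON array with one root-node object per tree.
# Only "children" is taken from the commit message; "nodeid" and "leaf"
# are illustrative assumptions.
SAMPLE_DUMP = """
[
  {"nodeid": 0, "children": [
    {"nodeid": 1, "leaf": 0.5},
    {"nodeid": 2, "children": [
      {"nodeid": 3, "leaf": -0.25},
      {"nodeid": 4, "leaf": 0.75}
    ]}
  ]},
  {"nodeid": 0, "leaf": 0.1}
]
"""

def count_leaves(node):
    """Count nodes that have no "children" entry."""
    children = node.get("children")
    if not children:
        return 1
    return sum(count_leaves(child) for child in children)

def tree_depth(node):
    """Number of edges on the longest path from this node to a leaf."""
    children = node.get("children")
    if not children:
        return 0
    return 1 + max(tree_depth(child) for child in children)

trees = json.loads(SAMPLE_DUMP)
leaf_counts = [count_leaves(tree) for tree in trees]
depths = [tree_depth(tree) for tree in trees]

for index, (leaves, depth) in enumerate(zip(leaf_counts, depths)):
    print(f"tree {index}: {leaves} leaves, depth {depth}")
```

Because every node uses the same "children" convention, one recursive function handles trees of any depth without needing a separate node index.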
RAMitchell
ac41845d4b Add GPU accelerated tree construction plugin (#1679) 2016-10-20 20:14:47 -07:00
Tianqi Chen
c93c9b7ed6 [TREE] Experimental version of monotone constraint (#1516)
* [TREE] Experimental version of monotone constraint

* Allow default detection of monotone option

* loosen the condition of strict check

* Update gbtree.cc
2016-09-07 21:28:43 -07:00
Vadim Khotilovich
75f401481f no exception throwing within omp parallel; set nthread in Learner (#1421) 2016-07-29 10:08:03 -07:00
Frank
3b73824842 Fix ambiguous call to abs (C or C++). (#1308) 2016-06-29 14:28:28 -07:00
tqchen
ecb3a271be [PYTHON-DIST] Distributed xgboost python training API. 2016-02-29 16:54:13 -08:00
tqchen
413f119c7e Update dmlc-core 2016-02-10 13:11:21 -08:00
tqchen
63c4ad7617 [APPROX] Make global proposal default, add group ptr solution 2016-02-10 11:19:10 -08:00
tqchen
ce4d59ed69 [TREE] Enable global proposal for faster speed 2016-02-10 11:19:10 -08:00
tqchen
2f2080a337 [TREE] Remove gap constraint, make tree construction more robust 2016-02-10 11:17:54 -08:00
tqchen
a500fbc9b0 [TREE] switch to two pass 2016-02-10 11:17:17 -08:00
tqchen
523afcbcd2 [TREE] Cleanup some functions, add utility function for two pass 2016-02-10 11:17:17 -08:00
tqchen
52227a8920 [TREE] Refactor histmaker 2016-02-10 11:17:17 -08:00
tqchen
88447ca32e [MEM] Add rowset struct to save memory with billion level rows 2016-02-10 11:17:17 -08:00
samuel-liyi
d3540aacc5 change the formula of fsplit value 2016-02-08 15:00:04 +08:00
tqchen
1495a43cea [R] make all customizations to meet strict standard of cran 2016-01-16 10:25:12 -08:00
tqchen
d75e3ed05d [LIBXGBOOST] pass demo running. 2016-01-16 10:24:01 -08:00
tqchen
4b4b36d047 [GBM] remove need to explicit InitModel, rename save/load 2016-01-16 10:24:01 -08:00
tqchen
e4567bbc47 [REFACTOR] Add alias, allow missing variables, init gbm interface 2016-01-16 10:24:01 -08:00
tqchen
d4677b6561 [TREE] finish move of updater 2016-01-16 10:24:01 -08:00
tqchen
4adc4cf0b9 [TREE] Move the files to target refactor location 2016-01-16 10:24:01 -08:00
tqchen
3128e1705b [TREE] Refactor colmaker 2016-01-16 10:24:01 -08:00
tqchen
20043f63a6 [TREE] Move colmaker 2016-01-16 10:24:01 -08:00
tqchen
c8ccb61b9e [TREE] Enable updater registry 2016-01-16 10:24:01 -08:00
tqchen
a62a66d545 [TREE] Finalize regression tree refactor 2016-01-16 10:24:01 -08:00
tqchen
d530e0c14f [REFACTOR] cleanup structure 2016-01-16 10:24:00 -08:00
Julian Quick
f51e1893fe fix minor typo 2016-01-01 20:03:45 -08:00
Vadim Khotilovich
c70022e6c4 spelling, wording, and doc fixes in c++ code
I was reading through the code and fixing some things in the comments.
Only a few trivial actual code changes were made to make things more
readable.
2015-12-12 21:40:12 -06:00
Tianqi Chen
fd8439ffbc Update param.h
enforce parallel option to 0 for now for stable result
2015-10-19 08:59:06 -07:00
tqchen
0162bb7034 lint half way 2015-07-03 18:31:52 -07:00
tqchen
e5dd894960 add an indicator opt 2015-06-02 11:38:06 -07:00
tqchen
09a841f810 auto turn on optimization 2015-05-15 23:54:34 -07:00
tqchen
792cff5abc checkin some micro optimization 2015-05-15 23:54:03 -07:00
tqchen
e63faf0e85 minor shadow fix 2015-04-27 22:52:19 -07:00
Tianqi Chen
84515cd2a8 fix python windows installation problem, enable mingw compile, but the mingw dll seems slow to load 2015-04-25 15:30:42 -07:00
Jianfeng Zhu
78907ca08d Update updater.h
Fix minor typo
2015-04-23 11:44:47 +08:00
tqchen
0461231d3d more capacity for base 2015-04-20 16:21:55 +00:00
tqchen
22abf4e295 need more check 2015-04-16 12:34:39 -07:00
tqchen
91a7a5f2e2 add small boundary checking 2015-04-10 10:55:42 -07:00
tqchen
36dcb061a8 larger boundary in edge case 2015-04-06 13:42:43 -07:00
tqchen
23e46b7fa5 add max_delta_step 2015-03-26 09:47:16 -07:00
Tianqi Chen
2159d18f0b Update param.h 2015-03-13 23:23:23 -07:00
tqchen
12528c535a fix 2015-03-11 11:22:51 -07:00
Tianqi Chen
d5303af068 fix vs warnings 2015-03-09 22:37:08 -07:00
tqchen
e79840e620 fix wrapper checkNAN 2015-03-08 09:52:59 -07:00