42 Commits

Author SHA1 Message Date
Jason E. Aten, Ph.D
afa6e086cc Clarify meaning of training parameter in XGBoosterPredict() (#5604)
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2020-05-04 09:08:57 -07:00
Philip Hyunsu Cho
ef26bc45bf Hide C++ symbols in libxgboost.so when building Python wheel (#5590)
* Hide C++ symbols in libxgboost.so when building Python wheel

* Update Jenkinsfile

* Add test

* Upgrade rabit

* Add setup.py option.

Co-authored-by: fis <jm.yuan@outlook.com>
2020-04-24 13:32:05 -07:00
Jiaming Yuan
f2b8cd2922 Add number of columns to native data iterator. (#5202)
* Change native data iter into an adapter.
2020-02-25 23:42:01 +08:00
Jiaming Yuan
0110754a76 Remove update prediction cache from predictors. (#5312)
Move this function into gbtree and use only the updater to do so. Since the predictor now knows exactly how many trees to predict, it no longer needs to update the prediction cache.
2020-02-17 11:35:47 +08:00
Kodi Arfer
f100b8d878 [Breaking] Don't drop trees during DART prediction by default (#5115)
* Simplify DropTrees calling logic

* Add `training` parameter for prediction method.

* [Breaking]: Add `training` to C API.

* Change for R and Python custom objective.

* Correct comment.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2020-01-13 21:48:30 +08:00
Jiaming Yuan
27b3646d29 Tests and documents for new JSON routines. (#5120) 2019-12-18 08:44:27 +08:00
Jiaming Yuan
3136185bc5 JSON configuration IO. (#5111)
* Add saving/loading JSON configuration.
* Implement Python pickle interface with new IO routines.
* Basic tests for training continuation.
2019-12-15 17:31:53 +08:00
Jiaming Yuan
208ab3b1ff Model IO in JSON. (#5110) 2019-12-11 11:20:40 +08:00
Jiaming Yuan
5620322a48 [Breaking] Add global versioning. (#4936)
* Use CMake config file for representing version.

* Generate c and Python version file with CMake.

The generated file is written into source tree.  But unless XGBoost upgrades
its version, there will be no actual modification.  This retains compatibility
with Makefiles for R.

* Add XGBoost version to the DMatrix binaries.
* Simplify prefetch detection in CMakeLists.txt
2019-10-22 23:27:26 -04:00
Jiaming Yuan
d669ea1eaa Deprecate set group (#4864)
* Convert jvm package and R package.

* Restore for compatibility.
2019-09-17 21:26:54 -04:00
Bryan Woods
278562db13 Add support for cross-validation using query ID (#4474)
* adding support for matrix slicing with query ID for cross-validation

* hail mary test of unrar installation for windows tests

* trying to modify tests to run in Github CI

* Remove dependency on wget and unrar

* Save error log from R test

* Relax assertion in test_training

* Use int instead of bool in C function interface

* Revise R interface

* Add XGDMatrixSliceDMatrixEx and keep old XGDMatrixSliceDMatrix for API compatibility
2019-05-23 10:45:02 -07:00
Jiaming Yuan
207f058711 Refactor CMake scripts. (#4323)
* Refactor CMake scripts.

* Remove CMake CUDA wrapper.
* Bump CMake version for CUDA.
* Use CMake to handle Doxygen.
* Split up CMakeList.
* Export install target.
* Use modern CMake.
* Remove build.sh
* Workaround for gpu_hist test.
* Use cmake 3.12.

* Revert machine.conf.

* Move CLI test to gpu.

* Small cleanup.

* Support using XGBoost as submodule.

* Fix windows

* Fix cpp tests on Windows

* Remove duplicated find_package.
2019-04-15 10:08:12 -07:00
Jiaming Yuan
fdcae024e7 Remove deprecated C APIs. (#4266) 2019-03-17 16:42:44 +08:00
Philip Hyunsu Cho
2aaae2e7bb Fix #4163: always copy sliced data (#4165)
* Revert "Accept numpy array view. (#4147)"

This reverts commit a985a99cf0.

* Fix #4163: always copy sliced data

* Remove print() from the test; check shape equality

* Check if 'base' attribute exists

* Fix lint

* Address reviewer comment

* Fix lint
2019-02-20 14:46:34 -08:00
Jiaming Yuan
a985a99cf0 Accept numpy array view. (#4147)
* Accept array view (slice) in metainfo.
2019-02-18 22:21:34 +08:00
Jiaming Yuan
2e618af743 Fix cpplint. (#4157)
* Add comment after #endif.
* Add missing headers.
2019-02-18 00:16:29 +08:00
Philip Hyunsu Cho
48d6e68690 Add callback interface to re-direct console output (#3438)
* Add callback interface to re-direct console output

* Exempt TrackerLogger from custom logging

* Fix lint
2018-07-05 11:32:30 -07:00
PSEUDOTENSOR / Jonathan McKinney
9ac163d0bb Allow import via python datatable. (#3272)
* Allow import via python datatable.

* Write unit tests

* Refactor dt API functions

* Refactor python code

* Lint fixes

* Address review comments
2018-06-20 13:16:18 -07:00
Rory Mitchell
ccf80703ef Clang-tidy static analysis (#3222)
* Clang-tidy static analysis

* Modernise checks

* Google coding standard checks

* Identifier renaming according to Google style
2018-04-19 18:57:13 +12:00
Will Storey
c85995952f Allow compilation with -Werror=strict-prototypes (#3183) 2018-03-18 12:25:42 +13:00
yskn67
3dcf966bc3 Fix XGDMatrixFree argument type (#2898) 2017-11-23 10:49:05 -08:00
PSEUDOTENSOR / Jonathan McKinney
6b375f6ad8 Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation (#2530)
* Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation from numpy arrays for python interface.
2017-07-21 14:43:17 +12:00
Artem Krylysov
ed8da45f9d Fix C API header compatibility with C compilers (#2369) 2017-06-02 10:14:30 -07:00
Vadim Khotilovich
b52db87d5c adding feature contributions to R and gblinear (#2295)
* [gblinear] add features contribution prediction; fix DumpModel bug

* [gbtree] minor changes to PredContrib

* [R] add feature contribution prediction to R

* [R] bump up version; update NEWS

* [gblinear] fix the base_margin issue; fixes #1969

* [R] list of matrices as output of multiclass feature contributions

* [gblinear] make order of DumpModel coefficients consistent: group index changes the fastest
2017-05-21 07:41:51 -04:00
Vadim Khotilovich
c66ca79221 [R] native routines registration (#2290)
* [R] add native routines registration

* c_api.h needs to include <cstdint> since it uses fixed width integer types

* [R] use registered native routines from R code

* [R] bump version; add info on native routine registration to the contributors guide

* make lint happy
2017-05-14 11:00:46 -07:00
Maurus Cuelenaere
6bd1869026 Add prediction of feature contributions (#2003)
* Add prediction of feature contributions

This implements the idea described at http://blog.datadive.net/interpreting-random-forests/
which tries to give insight into how a prediction is composed of its feature contributions
and a bias.

* Support multi-class models

* Calculate learning_rate per-tree instead of using the one from the first tree

* Do not rely on node.base_weight * learning_rate having the same value as the node mean value (aka leaf value, if it were a leaf); instead calculate them (lazily) on-the-fly

* Add simple test for contributions feature

* Check against param.num_nodes instead of checking for non-zero length

* Loop over all roots instead of only the first
2017-05-14 00:58:10 -05:00
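The decomposition referred to in the commit above (a prediction as a bias plus per-feature contributions) can be illustrated on a single hand-built tree. This is a minimal sketch of the general idea, not the code from the commit: the tree layout and the names `TREE`, `value`, `feature`, `threshold`, `left`, `right`, and `decompose` are all hypothetical, and each node's `value` is assumed to be the mean value at that node.

```python
# Hypothetical tree. Internal nodes carry a split; every node carries the
# (assumed) mean value of the training rows that reach it.
TREE = {
    "value": 0.5, "feature": 0, "threshold": 1.0,
    "left": {"value": 0.25},
    "right": {
        "value": 0.75, "feature": 1, "threshold": 2.0,
        "left": {"value": 0.5},
        "right": {"value": 1.0},
    },
}

def decompose(tree, x, n_features):
    """Return (bias, contributions, leaf_value) for one input row x.

    The bias is the root's value. Each step down the decision path credits
    the change in node value to the feature that was split on.
    """
    contribs = [0.0] * n_features
    bias = tree["value"]
    node = tree
    while "feature" in node:  # stop once a leaf is reached
        go_left = x[node["feature"]] < node["threshold"]
        child = node["left"] if go_left else node["right"]
        contribs[node["feature"]] += child["value"] - node["value"]
        node = child
    return bias, contribs, node["value"]

bias, contribs, leaf_value = decompose(TREE, [1.5, 3.0], n_features=2)
```

In this sketch the bias plus the sum of the contributions reproduces the leaf value reached by the row, which is the property that makes the decomposition readable.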
AbdealiJK
6f16f0ef58 Use bst_float consistently throughout (#1824)
* Fix various typos

* Add override to functions that are overridden

gcc warns about functions that override a base-class function without
being marked as override. This fixes it.

* Use bst_float consistently

Use bst_float for all the variables that involve weight,
leaf value, gradient, hessian, gain, loss_chg, predictions,
base_margin, feature values.

In some cases, where additions and similar operations can make the
value larger, double is used.

This ensures that type conversions are minimal and reduces loss of
precision.
2016-11-30 10:02:10 -08:00
AbdealiJK
b94fcab4dc Add dump_format=json option (#1726)
* Add format to the params accepted by DumpModel

Currently, only the text format is supported when dumping a model.
The plan is to add more formats, such as JSON, that are easy for
machines to read and parse, and to make the interface generic
enough to allow other formats to be added.

Hence, we make some modifications to make these functions generic
and accept a new parameter "format" which signifies the format of
the dump to be created.

* Fix typos and errors in docs

* plugin: Mention all the register macros available

Document the register macros currently available to the plugin
writers so they know what exactly can be extended using hooks.

* sparse_page_source: Use same arg name in .h and .cc

* gbm: Add JSON dump

The dump_format argument can be used to specify what type
of dump file should be created. Add functionality to dump
gblinear and gbtree into a JSON file.

The JSON file contains an array; each item is a JSON object for one tree.
For gblinear:
 - The item is the bias and weights vectors
For gbtree:
 - The item is the root node. The root node has an attribute "children"
   which holds the child nodes. This happens recursively.

* core.py: Add arg dump_format for get_dump()
2016-11-04 09:55:25 -07:00
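The gbtree layout described in the commit above (an array with one object per tree, each root holding its descendants under a recursive "children" attribute) can be walked as sketched here. This is a minimal sketch under stated assumptions, not XGBoost code: the sample document is made up, and every key other than "children" ("nodeid", "leaf") is a hypothetical placeholder.

```python
import json

# Hypothetical dump: an array with one JSON object per tree.
# Only "children" comes from the commit message; "nodeid" and "leaf"
# are illustrative assumptions.
SAMPLE_DUMP = """
[
  {"nodeid": 0, "children": [
    {"nodeid": 1, "leaf": 0.5},
    {"nodeid": 2, "children": [
      {"nodeid": 3, "leaf": -0.25},
      {"nodeid": 4, "leaf": 0.75}
    ]}
  ]}
]
"""

def count_nodes(node):
    """Count a node plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node.get("children", []))

def tree_depth(node):
    """Number of edges on the longest path from this node to a leaf."""
    children = node.get("children", [])
    if not children:
        return 0
    return 1 + max(tree_depth(child) for child in children)

trees = json.loads(SAMPLE_DUMP)
node_counts = [count_nodes(root) for root in trees]
depths = [tree_depth(root) for root in trees]
```

Because the nesting is recursive, any per-tree statistic can be written as a small recursive function over "children" in the same way.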
Adam Pocock
445029bb82 [jvm-packages] XGBoost4j Windows fixes (#1639)
* Changes for Mingw64 compilation to ensure long is a consistent size.

Mainly impacts the Java API which would not compile, but there may be
silent errors on Windows with large datasets before this patch (as long
is 32-bits when compiled with mingw64 even in 64-bit mode).

* Adding ifdefs to ensure it still compiles on MacOS

* Makefile and create_jni.bat changes for Windows.

* Switching XGDMatrixCreateFromCSREx JNI call to use size_t cast

* Fixing lint error, adding profile switching to jvm-packages build to make create-jni.bat get called, adding myself to Contributors.Md
2016-10-18 08:35:25 -04:00
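The size issue mentioned in the commit above (long being 32 bits under mingw64 even in 64-bit mode) can be observed from Python with the standard ctypes module. This is only an illustrative sketch, not part of the patch: `big_count` is a hypothetical element count chosen to exceed the signed 32-bit range, and the reported size of long depends on the platform the script runs on.

```python
import ctypes

# Size in bytes of the C `long` type on the platform running this script.
# It differs between platforms, which is the portability hazard at issue.
long_bytes = ctypes.sizeof(ctypes.c_long)

# Fixed-width types have the same size everywhere.
int64_bytes = ctypes.sizeof(ctypes.c_int64)

# Hypothetical element count larger than the signed 32-bit maximum.
big_count = 3_000_000_000

# ctypes integer constructors do no overflow checking, so the 32-bit value
# silently wraps around while the 64-bit value is preserved.
as_32bit = ctypes.c_int32(big_count).value
as_64bit = ctypes.c_int64(big_count).value

print(f"long is {long_bytes} bytes here; int64 is {int64_bytes} bytes")
print(f"{big_count} stored in 32 bits reads back as {as_32bit}")
```

The wrapped 32-bit value shows the kind of silent error with large datasets that the commit message warns about.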
Vadim Khotilovich
693ddb860e More robust DMatrix creation from a sparse matrix (#1606)
* [CORE] DMatrix from sparse w/ explicit #col #row; safer arg types

* [python-package] c-api change for _init_from_csr _init_from_csc

* fix spaces

* [R-package] adopt the new XGDMatrixCreateFromCSCEx interface

* [CORE] redirect old sparse creators to new ones
2016-09-25 10:01:22 -07:00
RAMitchell
93196eb811 cmake build system (#1314)
* Changed c api to compile under MSVC

* Include functional.h header for MSVC

* Add cmake build
2016-07-02 19:07:35 -07:00
Vadim Khotilovich
26b36714ea doxygen suggested fix 2016-05-15 03:05:19 -05:00
Vadim Khotilovich
ea9285dd4f methods to delete an attribute and get names of available attributes 2016-05-14 18:19:18 -05:00
Wojciech Migda
6a5eb47789 XGBoosterCreate api unified to use const DMatrix[] argument 2016-03-26 19:42:58 +01:00
tqchen
86871d4be9 [JVM] Add Iterator loading API 2016-03-04 17:37:46 -08:00
tqchen
ecb3a271be [PYTHON-DIST] Distributed xgboost python training API. 2016-02-29 16:54:13 -08:00
tqchen
4a16b729fc [PYTHON] Simplify training logic, update rabit lib 2016-02-28 13:20:55 -08:00
tqchen
634db18a0f [TRAVIS] cleanup travis script 2016-01-16 10:25:12 -08:00
tqchen
d75e3ed05d [LIBXGBOOST] pass demo running. 2016-01-16 10:24:01 -08:00
tqchen
cee148ed64 [CLI] initial refactor of CLI 2016-01-16 10:24:01 -08:00
tqchen
9042b9e2c7 [GBM] Finish migrate all gbms 2016-01-16 10:24:01 -08:00
tqchen
d530e0c14f [REFACTOR] cleanup structure 2016-01-16 10:24:00 -08:00