xgboost

Author	SHA1	Message	Date
Jason E. Aten, Ph.D	afa6e086cc	Clarify meaning of `training` parameter in XGBoosterPredict() (#5604) Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu> Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-05-04 09:08:57 -07:00
Philip Hyunsu Cho	ef26bc45bf	Hide C++ symbols in libxgboost.so when building Python wheel (#5590) * Hide C++ symbols in libxgboost.so when building Python wheel * Update Jenkinsfile * Add test * Upgrade rabit * Add setup.py option. Co-authored-by: fis <jm.yuan@outlook.com>	2020-04-24 13:32:05 -07:00
Jiaming Yuan	f2b8cd2922	Add number of columns to native data iterator. (#5202) * Change native data iter into an adapter.	2020-02-25 23:42:01 +08:00
Jiaming Yuan	0110754a76	Remove update prediction cache from predictors. (#5312) Move this function into gbtree, and use only the updater for doing so. As now the predictor knows exactly how many trees to predict, there's no need for it to update the prediction cache.	2020-02-17 11:35:47 +08:00
Kodi Arfer	f100b8d878	[Breaking] Don't drop trees during DART prediction by default (#5115) * Simplify DropTrees calling logic * Add `training` parameter for prediction method. * [Breaking]: Add `training` to C API. * Change for R and Python custom objective. * Correct comment. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu> Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-01-13 21:48:30 +08:00
Jiaming Yuan	27b3646d29	Tests and documents for new JSON routines. (#5120)	2019-12-18 08:44:27 +08:00
Jiaming Yuan	3136185bc5	JSON configuration IO. (#5111) * Add saving/loading JSON configuration. * Implement Python pickle interface with new IO routines. * Basic tests for training continuation.	2019-12-15 17:31:53 +08:00
Jiaming Yuan	208ab3b1ff	Model IO in JSON. (#5110)	2019-12-11 11:20:40 +08:00
Jiaming Yuan	5620322a48	[Breaking] Add global versioning. (#4936) * Use CMake config file for representing version. * Generate C and Python version file with CMake. The generated file is written into source tree. But unless XGBoost upgrades its version, there will be no actual modification. This retains compatibility with Makefiles for R. * Add XGBoost version to the DMatrix binaries. * Simplify prefetch detection in CMakeLists.txt	2019-10-22 23:27:26 -04:00
Jiaming Yuan	d669ea1eaa	Deprecate set group (#4864) * Convert jvm package and R package. * Restore for compatibility.	2019-09-17 21:26:54 -04:00
Bryan Woods	278562db13	Add support for cross-validation using query ID (#4474) * adding support for matrix slicing with query ID for cross-validation * hail mary test of unrar installation for windows tests * trying to modify tests to run in Github CI * Remove dependency on wget and unrar * Save error log from R test * Relax assertion in test_training * Use int instead of bool in C function interface * Revise R interface * Add XGDMatrixSliceDMatrixEx and keep old XGDMatrixSliceDMatrix for API compatibility	2019-05-23 10:45:02 -07:00
Jiaming Yuan	207f058711	Refactor CMake scripts. (#4323) * Refactor CMake scripts. * Remove CMake CUDA wrapper. * Bump CMake version for CUDA. * Use CMake to handle Doxygen. * Split up CMakeList. * Export install target. * Use modern CMake. * Remove build.sh * Workaround for gpu_hist test. * Use cmake 3.12. * Revert machine.conf. * Move CLI test to gpu. * Small cleanup. * Support using XGBoost as submodule. * Fix windows * Fix cpp tests on Windows * Remove duplicated find_package.	2019-04-15 10:08:12 -07:00
Jiaming Yuan	fdcae024e7	Remove deprecated C APIs. (#4266)	2019-03-17 16:42:44 +08:00
Philip Hyunsu Cho	2aaae2e7bb	Fix #4163: always copy sliced data (#4165) * Revert "Accept numpy array view. (#4147)" This reverts commit `a985a99cf0`. * Fix #4163: always copy sliced data * Remove print() from the test; check shape equality * Check if 'base' attribute exists * Fix lint * Address reviewer comment * Fix lint	2019-02-20 14:46:34 -08:00
Jiaming Yuan	a985a99cf0	Accept numpy array view. (#4147) * Accept array view (slice) in metainfo.	2019-02-18 22:21:34 +08:00
Jiaming Yuan	2e618af743	Fix cpplint. (#4157) * Add comment after #endif. * Add missing headers.	2019-02-18 00:16:29 +08:00
Philip Hyunsu Cho	48d6e68690	Add callback interface to re-direct console output (#3438) * Add callback interface to re-direct console output * Exempt TrackerLogger from custom logging * Fix lint	2018-07-05 11:32:30 -07:00
PSEUDOTENSOR / Jonathan McKinney	9ac163d0bb	Allow import via python datatable. (#3272) * Allow import via python datatable. * Write unit tests * Refactor dt API functions * Refactor python code * Lint fixes * Address review comments	2018-06-20 13:16:18 -07:00
Rory Mitchell	ccf80703ef	Clang-tidy static analysis (#3222) * Clang-tidy static analysis * Modernise checks * Google coding standard checks * Identifier renaming according to Google style	2018-04-19 18:57:13 +12:00
Will Storey	c85995952f	Allow compilation with -Werror=strict-prototypes (#3183)	2018-03-18 12:25:42 +13:00
yskn67	3dcf966bc3	Fix XGDMatrixFree argument type (#2898)	2017-11-23 10:49:05 -08:00
PSEUDOTENSOR / Jonathan McKinney	6b375f6ad8	Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation (#2530) * Multi-threaded XGDMatrixCreateFromMat for faster DMatrix creation from numpy arrays for python interface.	2017-07-21 14:43:17 +12:00
Artem Krylysov	ed8da45f9d	Fix C API header compatibility with C compilers (#2369)	2017-06-02 10:14:30 -07:00
Vadim Khotilovich	b52db87d5c	adding feature contributions to R and gblinear (#2295) * [gblinear] add features contribution prediction; fix DumpModel bug * [gbtree] minor changes to PredContrib * [R] add feature contribution prediction to R * [R] bump up version; update NEWS * [gblinear] fix the base_margin issue; fixes #1969 * [R] list of matrices as output of multiclass feature contributions * [gblinear] make order of DumpModel coefficients consistent: group index changes the fastest	2017-05-21 07:41:51 -04:00
Vadim Khotilovich	c66ca79221	[R] native routines registration (#2290) * [R] add native routines registration * c_api.h needs to include <cstdint> since it uses fixed width integer types * [R] use registered native routines from R code * [R] bump version; add info on native routine registration to the contributors guide * make lint happy	2017-05-14 11:00:46 -07:00
Maurus Cuelenaere	6bd1869026	Add prediction of feature contributions (#2003) * Add prediction of feature contributions This implements the idea described at http://blog.datadive.net/interpreting-random-forests/ which tries to give insight in how a prediction is composed of its feature contributions and a bias. * Support multi-class models * Calculate learning_rate per-tree instead of using the one from the first tree * Do not rely on node.base_weight * learning_rate having the same value as the node mean value (aka leaf value, if it were a leaf); instead calculate them (lazily) on-the-fly * Add simple test for contributions feature * Check against param.num_nodes instead of checking for non-zero length * Loop over all roots instead of only the first	2017-05-14 00:58:10 -05:00
AbdealiJK	6f16f0ef58	Use bst_float consistently throughout (#1824) * Fix various typos * Add override to functions that are overridden. gcc gives warnings about functions that are being overridden but are not marked as overridden. This fixes it. * Use bst_float consistently. Use bst_float for all the variables that involve weight, leaf value, gradient, hessian, gain, loss_chg, predictions, base_margin, feature values. In some cases, when due to additions and so on the value can take a larger value, double is used. This ensures that type conversions are minimal and reduces loss of precision.	2016-11-30 10:02:10 -08:00
AbdealiJK	b94fcab4dc	Add dump_format=json option (#1726) * Add format to the params accepted by DumpModel. Currently, only the text format is supported when trying to dump a model. The plan is to add more such formats like JSON which are easy to read and/or parse by machines. And to make the interface for this even more generic to allow other formats to be added. Hence, we make some modifications to make these functions generic and accept a new parameter "format" which signifies the format of the dump to be created. * Fix typos and errors in docs * plugin: Mention all the register macros available. Document the register macros currently available to the plugin writers so they know what exactly can be extended using hooks. * sparce_page_source: Use same arg name in .h and .cc * gbm: Add JSON dump. The dump_format argument can be used to specify what type of dump file should be created. Add functionality to dump gblinear and gbtree into a JSON file. The JSON file has an array, each item is a JSON object for the tree. For gblinear: - The item is the bias and weights vectors For gbtree: - The item is the root node. The root node has an attribute "children" which holds the children nodes. This happens recursively. * core.py: Add arg dump_format for get_dump()	2016-11-04 09:55:25 -07:00
Adam Pocock	445029bb82	[jvm-packages] XGBoost4j Windows fixes (#1639) * Changes for Mingw64 compilation to ensure long is a consistent size. Mainly impacts the Java API which would not compile, but there may be silent errors on Windows with large datasets before this patch (as long is 32-bits when compiled with mingw64 even in 64-bit mode). * Adding ifdefs to ensure it still compiles on MacOS * Makefile and create_jni.bat changes for Windows. * Switching XGDMatrixCreateFromCSREx JNI call to use size_t cast * Fixing lint error, adding profile switching to jvm-packages build to make create-jni.bat get called, adding myself to Contributors.Md	2016-10-18 08:35:25 -04:00
Vadim Khotilovich	693ddb860e	More robust DMatrix creation from a sparse matrix (#1606) * [CORE] DMatrix from sparse w/ explicit #col #row; safer arg types * [python-package] c-api change for _init_from_csr _init_from_csc * fix spaces * [R-package] adopt the new XGDMatrixCreateFromCSCEx interface * [CORE] redirect old sparse creators to new ones	2016-09-25 10:01:22 -07:00
RAMitchell	93196eb811	cmake build system (#1314) * Changed c api to compile under MSVC * Include functional.h header for MSVC * Add cmake build	2016-07-02 19:07:35 -07:00
Vadim Khotilovich	26b36714ea	doxygen suggested fix	2016-05-15 03:05:19 -05:00
Vadim Khotilovich	ea9285dd4f	methods to delete an attribute and get names of available attributes	2016-05-14 18:19:18 -05:00
Wojciech Migda	6a5eb47789	XGBoosterCreate api unified to use const DMatrix[] argument	2016-03-26 19:42:58 +01:00
tqchen	86871d4be9	[JVM] Add Iterator loading API	2016-03-04 17:37:46 -08:00
tqchen	ecb3a271be	[PYTHON-DIST] Distributed xgboost python training API.	2016-02-29 16:54:13 -08:00
tqchen	4a16b729fc	[PYTHON] Simplify training logic, update rabit lib	2016-02-28 13:20:55 -08:00
tqchen	634db18a0f	[TRAVIS] cleanup travis script	2016-01-16 10:25:12 -08:00
tqchen	d75e3ed05d	[LIBXGBOOST] pass demo running.	2016-01-16 10:24:01 -08:00
tqchen	cee148ed64	[CLI] initial refactor of CLI	2016-01-16 10:24:01 -08:00
tqchen	9042b9e2c7	[GBM] Finish migrate all gbms	2016-01-16 10:24:01 -08:00
tqchen	d530e0c14f	[REFACTOR] cleanup structure	2016-01-16 10:24:00 -08:00

42 Commits