xgboost

Author	SHA1	Message	Date
Nan Zhu	6cf97b4eae	[jvm-packages] consider spark.task.cpus when controlling parallelism (#3530 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * consider spark.task.cpus when controlling parallelism * fix bug * fix conf setup * calculate requestedCores within ParallelismController * enforce spark.task.cpus = 1 * unify unit test case framework * enable spark ui	2018-07-31 06:19:45 -07:00
trivialfis	860263f814	Enable building with sanitizers. (#3525 )	2018-07-31 17:25:47 +12:00
Nan Zhu	b546321c83	[jvm-packages] the current version of xgboost does not consider missing value in prediction (#3529 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * consider missing value in prediction * handle single prediction instance * fix type conversion	2018-07-30 14:16:24 -07:00
wenduowang	3b62e75f2e	Fix bug of using list(x) function when x is string (#3432 ) * Fix bug of using list(x) function when x is string list('abcdcba') = ['a', 'b', 'c', 'd', 'c', 'b', 'a'] * Allow feature_names/feature_types to be of any type If feature_names/feature_types is iterable, e.g. tuple, list, then convert the value to list, except for string; otherwise construct a list with a single value * Delete excess whitespace * Fix whitespace to pass lint	2018-07-30 07:36:34 -07:00
jqmp	dd07c25d12	Fix typo in ElasticNet threshold function (#3527 )	2018-07-30 14:08:14 +12:00
Philip Hyunsu Cho	2bb9b9d3db	Fix typo in parameter.rst, gblinear section (#3518 )	2018-07-28 18:58:15 -07:00
Nan Zhu	b5178d3d99	[jvm-packages] a better explanation about the inconsistent issue (#3524 )	2018-07-28 17:34:39 -07:00
hlsc	5850a2558a	fix DMatrix load_row_split bug (#3431 )	2018-07-28 17:21:30 -07:00
trivialfis	8973f2cb0e	Fix building dmlc-core from xgboost. (#3522 ) Move building dmlc-core before adding DMLC_LOG_CUSTOMIZE. Fix #3520.	2018-07-28 10:35:11 -07:00
Uddeshya Singh	3363b9142e	Update faq.rst (#3521 ) Just fixing a minor typo	2018-07-28 10:34:14 -07:00
Rory Mitchell	07ff52d54c	Dynamically allocate GPU histogram memory (#3519 ) * Expand histogram memory dynamically to prevent large allocations for large tree depths (e.g. > 15) * Remove GPU memory allocation messages. These are misleading as a large number of allocations are now dynamic. * Fix appveyor R test	2018-07-28 21:22:41 +12:00
Brandon Greenwell	b5fad42da2	Issue warning when requesting bivariate plotting (#3516 )	2018-07-27 16:15:37 -07:00
Philip Hyunsu Cho	8a5209c55e	Fix model saving for 'count:poisson': max_delta_step as Booster attribute (#3515 ) * Save max_delta_step as an extra attribute of Booster Fixes #3509 and #3026, where `max_delta_step` parameter gets lost during serialization. * fix lint * Use camel case for global constant * disable local variable case in clang-tidy	2018-07-27 09:55:54 -07:00
Andy Adinets	cc6a5a3666	Added finding quantiles on GPU. (#3393 ) * Added finding quantiles on GPU. - this includes datasets where weights are assigned to data rows - as the quantiles found by the new algorithm are not the same as those found by the old one, test thresholds in tests/python-gpu/test_gpu_updaters.py have been adjusted. * Adjustments and improved testing for finding quantiles on the GPU. - added C++ tests for the DeviceSketch() function - reduced one of the thresholds in test_gpu_updaters.py - adjusted the cuts found by the find_cuts_k kernel	2018-07-27 14:03:16 +12:00
Nan Zhu	e2f09db77a	[jvm-packages] minor fix for parameter name in example (#3507 )	2018-07-25 19:57:40 -07:00
Rory Mitchell	a725272e19	Correct mistake from dmatrix refactor (#3408 )	2018-07-24 15:03:36 +12:00
jqmp	e9a97e0d88	Add total_gain and total_cover importance measures (#3498 ) Add `'total_gain'` and `'total_cover'` as possible `importance_type` arguments to `Booster.get_score` in the Python package. `get_score` already accepts a `'gain'` argument, which returns each feature's average gain over all of its splits. `'total_gain'` does the same, but returns a total rather than an average. This seems more intuitively meaningful, and also matches the behavior of the R package's `xgb.importance` function. I also added an analogous `'total_cover'` command for consistency. This should resolve #3484.	2018-07-23 00:30:55 -07:00
KOLANICH	a1505de631	Added configuration for python into .editorconfig (#3494 ) * Added configuration for python into .editorconfig * Fixed forgotten change in the number of spaces	2018-07-23 00:24:10 -07:00
KOLANICH	a393d44c5d	Improved library loading a bit (#3481 ) * Improved library loading a bit * Fixed indentation. * Fixes according to the discussion * Moved the comment to a separate line. * specified exception type	2018-07-20 16:03:44 -07:00
Philip Hyunsu Cho	8e90b60c4d	Fix relpath in setup.py on Windows (#3493 ) * Fix relpath in setup.py on Windows Fixes #3480. * Use only one lib file; use 4 space indent	2018-07-20 12:28:08 -07:00
Philip Hyunsu Cho	05b089405d	Doc modernization (#3474 ) * Change doc build to reST exclusively * Rewrite Intro doc in reST; create toctree * Update parameter and contribute * Convert tutorials to reST * Convert Python tutorials to reST * Convert CLI and Julia docs to reST * Enable markdown for R vignettes * Done migrating to reST * Add guzzle_sphinx_theme to requirements * Add breathe to requirements * Fix search bar * Add link to user forum	2018-07-19 14:22:16 -07:00
Yanbo Liang	c004cea788	Expose setCustomObj & setCustomEval for XGBoostClassifier & XGBoostRegressor. (#3486 )	2018-07-17 21:16:51 -07:00
KOLANICH	b6dcbf0e07	Added .editorconfig (#3478 )	2018-07-17 20:05:55 -07:00
Rory Mitchell	0f145a0365	Resolve GPU bug on large files (#3472 ) Remove calls to thrust copy, fix indexing bug	2018-07-16 20:43:45 +12:00
Rory Mitchell	1b59316444	Updates for GPU CI tests (#3467 ) * Fail GPU CI after test failure * Fix GPU linear tests * Reduced number of GPU tests to speed up CI * Remove static allocations of device memory * Resolve illegal memory access for updater_fast_hist.cc * Fix broken r tests dependency * Update python install documentation for GPU	2018-07-16 18:05:53 +12:00
Henry Gouk	a13e29ece1	Add LASSO (#3429 ) * Allow multiple split constraints * Replace RidgePenalty with ElasticNet * Add test for checking Ridge, LASSO, and Elastic Net are implemented	2018-07-15 16:38:26 +12:00
Yanbo Liang	2f8764955c	[JVM-packages] Support single instance prediction. (#3464 ) * Support single instance prediction. * Address comments.	2018-07-12 14:17:53 -07:00
Thejaswi	2200939416	Upgrading to NCCL2 (#3404 ) * Upgrading to NCCL2 * Part - II of NCCL2 upgradation - Doc updates to build with nccl2 - Dockerfile.gpu update for a correct CI build with nccl2 - Updated FindNccl package to have env-var NCCL_ROOT to take precedence * Upgrading to v9.2 for CI workflow, since it has the nccl2 binaries available * Added NCCL2 license + copy the nccl binaries into /usr location for the FindNccl module to find * Set LD_LIBRARY_PATH variable to pick nccl2 binary at runtime * Need the nccl2 library download instructions inside Dockerfile.release as well * Use NCCL2 as a static library	2018-07-10 00:42:15 -07:00
Thejaswi	a6331925d2	Upgrade cuda version to 9.2 for CI workflows (#3460 ) - Needed by the issue #3404 - as v9.1 doesn't have a nccl2 release	2018-07-08 23:04:51 -07:00
Philip Hyunsu Cho	b40959042c	Document 0.72.1 version (#3458 )	2018-07-08 15:42:09 -07:00
kodonnell	6bed54ac39	python sklearn api: defaulting to best_ntree_limit if defined, otherwise current behaviour (#3445 ) * python sklearn api: defaulting to best_ntree_limit if defined, otherwise current behaviour * Fix whitespace	2018-07-08 14:35:52 -07:00
ngoyal2707	cb017d0c9a	[jvm-packages] removed old group_data from spark api (#3451 )	2018-07-07 22:21:01 -07:00
Nan Zhu	aa90e5c6ce	[jvm-packages] disable booster setup for xgboost4j-spark (#3456 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * disable booster setup in spark * check in parameter conversion * fix compilation issue * update exception type	2018-07-07 21:57:24 -07:00
Philip Hyunsu Cho	66e74d2223	Fix get_uint_info() (#3442 ) * Add regression test	2018-07-05 20:06:59 -07:00
Philip Hyunsu Cho	48d6e68690	Add callback interface to re-direct console output (#3438 ) * Add callback interface to re-direct console output * Exempt TrackerLogger from custom logging * Fix lint	2018-07-05 11:32:30 -07:00
Philip Hyunsu Cho	45bf4fbffb	Add a notice for binary PyPI wheel (#3443 )	2018-07-05 08:28:43 -07:00
Tianqi Chen	01aff45f26	Update README.md	2018-07-04 13:09:32 -07:00
Tianqi Chen	e62639c59b	[DOCS] Update link to readme (#3437 )	2018-07-04 12:24:33 -07:00
Yanbo Liang	aec6299c49	[jvm-packages] Expose nativeBooster for XGBoostClassificationModel and XGBoostRegressionModel. (#3428 )	2018-07-01 15:06:16 -07:00
Nikita Titov	295252249e	fixed MinGW missed dll (#3430 )	2018-07-01 16:43:33 +00:00
liuliang01	0cf88d036f	Add qid like ranklib format (#2749 ) * add qid for https://github.com/dmlc/xgboost/issues/2748 * change names * change spaces * change qid to bst_uint type * change qid type to size_t * change qid first to SIZE_MAX * change qid type from size_t to uint64_t * update dmlc-core * fix qids name error * fix group_ptr_ error * Style fix * Add qid handling logic to SparsePage * New MetaInfo format + backward compatibility fix Old MetaInfo format (1.0) doesn't contain qid field. We still want to be able to read from MetaInfo files saved in old format. Also, define a new format (2.0) that contains the qid field. This way, we can distinguish files that contain qid and those that do not. * Update MetaInfo test * Simply group assignment logic * Explicitly set qid=nullptr in NativeDataIter NativeDataIter's callback does not support qid field. Users of NativeDataIter will need to call setGroup() function separately to set group information. * Save qids_ in SaveBinary() * Upgrade dmlc-core submodule * Add a test for reading qid * Add contributor * Check the size of qids_ * Document qid format	2018-06-30 20:24:03 +00:00
Oliver Laslett	18813a26ab	allow arbitrary cross validation fold indices (#3353 ) * allow arbitrary cross validation fold indices - use training indices passed to `folds` parameter in `training.cv` - update doc string * add tests for arbitrary fold indices	2018-06-30 19:23:49 +00:00
Mike Liu	594bcea83e	Save and load model in sklearn API (#3192 ) * Add (load\|save)_model to XGBModel * Add docstring * Fix docstring * Fix mixed use of space and tab * Add a test * Fix Flake8 style errors	2018-06-30 19:21:49 +00:00
Rory Mitchell	24fde92660	Build universal wheels using GPU CI (#3424 )	2018-06-29 13:45:24 +00:00
Yun Ni	30d10ab035	Convert handle == nullptr from SegFault to user-friendly error. (#3021 ) * Convert SegFault to user-friendly error. * Apply the change to DMatrix API as well	2018-06-29 06:30:26 +00:00
cinqS	8bec8d5e9a	Better doc for save_model() / load_model() (#3143 ) Be clear that they do not save Python-specific attributes	2018-06-29 04:24:33 +00:00
pdesahb	12e34f32e2	Fix tweedie handling of base_score (#3295 ) * fix tweedie margin calculations * add entry to contributors	2018-06-28 15:43:05 +00:00
Henry Gouk	64b8cffde3	Refactor of FastHistMaker to allow for custom regularisation methods (#3335 ) * Refactor to allow for custom regularisation methods * Implement compositional SplitEvaluator framework * Fixed segfault when no monotone_constraints are supplied. * Change pid to parentID * test_monotone_constraints.py now passes * Refactor ColMaker and DistColMaker to use SplitEvaluator * Performance optimisation when no monotone_constraints specified * Fix linter messages * Fix a few more linter errors * Update the amalgamation * Add bounds check * Add check for leaf node * Fix linter error in param.h * Fix clang-tidy errors on CI * Fix incorrect function name * Fix clang-tidy error in updater_fast_hist.cc * Enable SSE2 for Win32 R MinGW Addresses https://github.com/dmlc/xgboost/pull/3335#issuecomment-400535752 * Add contributor	2018-06-28 07:37:25 +00:00
Philip Hyunsu Cho	cafc621914	Do not unzip google test archive if exists (#3416 )	2018-06-28 04:10:39 +00:00
Philip Hyunsu Cho	e2743548ed	Fix wget for google tests in tests (#3414 ) CI tests were failing because wget prompts "the user" for a response whenever the google test archive is already on the disk. Fix: Use `-nc` option to skip download when the archive already exists	2018-06-27 22:12:56 +00:00
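
Note on 3b62e75f2e (#3432): the commit message describes the rule for normalizing feature_names/feature_types: iterables such as tuples and lists are converted to a list, a string is kept whole, and any other value is wrapped in a single-element list. The following is a minimal sketch of that rule only. The helper name to_name_list is hypothetical and is not taken from the xgboost source; the actual patch may be structured differently.

```python
def to_name_list(value):
    """Hypothetical helper illustrating the rule described in #3432.

    This is an illustrative sketch, not xgboost's actual code.
    """
    # A string is iterable, so list(value) would split it into characters.
    # Treat it as a single name instead.
    if isinstance(value, str):
        return [value]
    try:
        # Tuples, lists and other iterables become a plain list.
        return list(value)
    except TypeError:
        # Non-iterable values are wrapped in a single-element list.
        return [value]


# The behaviour the commit message warns about:
split_chars = list('abcdcba')

# The behaviour the rule is meant to give:
single_name = to_name_list('abcdcba')
from_tuple = to_name_list(('f0', 'f1'))
from_scalar = to_name_list(7)
```

Here split_chars holds seven one-character strings, while single_name holds the original string as one entry.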

3506 Commits