928 Commits

Author SHA1 Message Date
Philip Hyunsu Cho
109473dae2
Fix #3545: XGDMatrixCreateFromCSCEx silently discards empty trailing rows (#3553)
* Fix #3545: XGDMatrixCreateFromCSCEx silently discards empty trailing rows

Description: The bug is triggered when

1. The data matrix has empty rows at the bottom. More precisely, the rows
   `n-k+1`, `n-k+2`, ..., `n` of the matrix have missing values in all
   dimensions (`n` is the number of instances, `k` the number of trailing
   empty rows).
2. The data matrix is given in Compressed Sparse Column (CSC) format.

Diagnosis: When the CSC matrix is converted to Compressed Sparse Row (CSR)
format (the common format used for DMatrix), the trailing empty rows
are silently ignored. More specifically, the row pointer (`offset`) of the
newly created CSR matrix does not account for these rows.

Fix: Modify the row pointer so that it also covers the trailing empty rows.

* Add regression test
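
The diagnosis in this commit can be illustrated with a small sketch. This is not the XGBoost implementation: `csc_to_csr` and its `num_row` argument are hypothetical names chosen for illustration, and the code works on plain Python lists. It shows that a CSR row pointer built only from the row indices present in the data stops at the last non-empty row, and that padding it to the declared row count restores the trailing empty rows.

```python
def csc_to_csr(col_ptr, row_idx, values, num_row=None):
    """Convert CSC arrays to CSR arrays (offset, col_idx, values).

    Hypothetical helper for illustration only. `num_row` is the declared
    number of rows; when omitted, the row count is inferred from the data,
    which is exactly what loses trailing empty rows.
    """
    # Rows that actually appear in the data; empty trailing rows do not.
    inferred = max(row_idx) + 1 if row_idx else 0
    n_rows = max(inferred, num_row or 0)

    # Count entries per row, then prefix-sum into the CSR row pointer.
    counts = [0] * n_rows
    for r in row_idx:
        counts[r] += 1
    offset = [0]
    for c in counts:
        offset.append(offset[-1] + c)

    # Scatter each CSC entry into its row's slot.
    next_slot = list(offset[:-1])
    col_out = [0] * len(values)
    val_out = [0.0] * len(values)
    for c in range(len(col_ptr) - 1):
        for k in range(col_ptr[c], col_ptr[c + 1]):
            r = row_idx[k]
            col_out[next_slot[r]] = c
            val_out[next_slot[r]] = values[k]
            next_slot[r] += 1
    return offset, col_out, val_out


# A 4 x 2 matrix whose last two rows are entirely missing.
col_ptr, row_idx, values = [0, 1, 2], [0, 1], [1.0, 2.0]

offset_inferred, _, _ = csc_to_csr(col_ptr, row_idx, values)
offset_padded, _, _ = csc_to_csr(col_ptr, row_idx, values, num_row=4)
print(len(offset_inferred) - 1)  # 2 rows: the empty tail is dropped
print(len(offset_padded) - 1)    # 4 rows: the row pointer covers the tail
```

A regression test in this spirit only needs to check that the number of rows reported after conversion equals the declared row count.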
2018-08-05 10:15:42 -07:00
Brandon Greenwell
b5fad42da2 Issue warning when requesting bivariate plotting (#3516) 2018-07-27 16:15:37 -07:00
Henry Gouk
64b8cffde3 Refactor of FastHistMaker to allow for custom regularisation methods (#3335)
* Refactor to allow for custom regularisation methods

* Implement compositional SplitEvaluator framework

* Fixed segfault when no monotone_constraints are supplied.

* Change pid to parentID

* test_monotone_constraints.py now passes

* Refactor ColMaker and DistColMaker to use SplitEvaluator

* Performance optimisation when no monotone_constraints specified

* Fix linter messages

* Fix a few more linter errors

* Update the amalgamation

* Add bounds check

* Add check for leaf node

* Fix linter error in param.h

* Fix clang-tidy errors on CI

* Fix incorrect function name

* Fix clang-tidy error in updater_fast_hist.cc

* Enable SSE2 for Win32 R MinGW

Addresses https://github.com/dmlc/xgboost/pull/3335#issuecomment-400535752

* Add contributor
2018-06-28 07:37:25 +00:00
Tong He
e6696337e4 Fix CRAN check for lintr (#3372)
* fix CRAN check

* Update submodules dmlc-core and rabit

* Add kintr to rmingw test
2018-06-18 12:53:52 -07:00
Ryota Suzuki
b7cbec4d4b Fix print.xgb.Booster for R (#3338)
* Fix print.xgb.Booster

valid_handle should be TRUE when x$handle is NOT null

* Update xgb.Booster.R

Modify is.null.handle to return TRUE for NULL handle
2018-05-29 11:44:55 -07:00
Philip Hyunsu Cho
71e226120a
For CRAN submission, remove all #pragma's that suppress compiler warnings (#3329)
* For CRAN submission, remove all #pragma's that suppress compiler warnings

A few headers in dmlc-core contain #pragma's that disable compiler warnings,
which is against the CRAN submission policy. Fix the problem by removing
the offending #pragma's as part of the command `make Rbuild`.

This addresses issue #3322.

* Fix script to improve Cygwin/MSYS compatibility

We need this to pass rmingw CI test

* Remove remove_warning_suppression_pragma.sh from packaged tarball
2018-05-23 09:58:39 -07:00
Tong He
098075b81b
CRAN Submission for 0.71.1 (#3311)
* fix for CRAN manual checks

* fix for CRAN manual checks

* pass local check

* fix variable naming style

* Adding Philip's record
2018-05-14 17:32:39 -07:00
Brandon Greenwell
d13f1a0f16 Fix typo (#3305) 2018-05-09 10:18:36 -07:00
Thomas J. Leeper
c2b647f26e fix typo in README (#3263) 2018-04-22 09:24:38 -04:00
Philip Hyunsu Cho
230cb9b787
Release version 0.71 (#3200) 2018-04-11 21:43:32 +09:00
Tong He
ace4016c36
Replace cBind by cbind (#3203)
* modify test_helper.R

* fix noLD

* update desc

* fix solaris test

* fix desc

* improve fix

* fix url

* change Matrix cBind to cbind

* fix

* fix error in demo

* fix examples
2018-03-28 10:05:47 -07:00
Yuan (Terry) Tang
92782a8406
Change DESCRIPTION to more modern look (#3179)
So other things can be added in comment field, such as ORCID.
2018-03-23 10:45:10 -04:00
Arjan van der Velde
04221a7469 rank_metric: add AUC-PR (#3172)
* rank_metric: add AUC-PR

Implementation of the AUC-PR calculation for weighted data, proposed by Keilwagen, Grosse and Grau (https://doi.org/10.1371/journal.pone.0092209)

* rank_metric: fix lint warnings

* Implement tests for AUC-PR and fix implementation

* add aucpr to documentation for other languages
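
As a rough illustration of what an area under the precision-recall curve means for weighted data, the sketch below computes a step-wise (average-precision style) area. It is deliberately simpler than the interpolation proposed by Keilwagen, Grosse and Grau, and it is not the code added in this commit. The function name is hypothetical, scores are assumed to be distinct, and weights are assumed to be strictly positive.

```python
def weighted_average_precision(scores, labels, weights):
    """Step-wise area under the PR curve for weighted binary labels.

    Illustrative only: assumes distinct scores and positive weights,
    and does not implement the interpolation from the cited paper.
    """
    total_pos = sum(w for y, w in zip(labels, weights) if y == 1)
    if total_pos == 0:
        raise ValueError("need at least one positive instance")

    # Visit instances from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0.0
    area = 0.0
    for i in order:
        if labels[i] == 1:
            tp += weights[i]
            # Recall only increases on positives, so only they add area:
            # (recall step) * (precision at this threshold).
            area += (weights[i] / total_pos) * (tp / (tp + fp))
        else:
            fp += weights[i]
    return area


scores = [0.9, 0.8, 0.7, 0.6]
unit = [1.0, 1.0, 1.0, 1.0]
print(weighted_average_precision(scores, [1, 1, 0, 0], unit))  # 1.0
```

With unit weights this reduces to ordinary average precision; larger weights make an instance count proportionally more in both precision and recall.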
2018-03-23 10:43:47 -04:00
Vadim Khotilovich
706be4e5d4
Additional improvements for gblinear (#3134)
* fix rebase conflict

* [core] additional gblinear improvements

* [R] callback for gblinear coefficients history

* force eta=1 for gblinear python tests

* add top_k to GreedyFeatureSelector

* set eta=1 in shotgun test

* [core] fix SparsePage processing in gblinear; col-wise multithreading in greedy updater

* set sorted flag within TryInitColData

* gblinear tests: use scale, add external memory test

* fix multiclass for greedy updater

* fix whitespace

* fix typo
2018-03-13 01:27:13 -05:00
Vadim Khotilovich
9ffe8596f2
[core] fix slow predict-caching with many classes (#3109)
* fix prediction caching inefficiency for multiclass

* silence some warnings

* redundant if

* workaround for R v3.4.3 bug; fixes #3081
2018-02-15 18:31:42 -06:00
Tong He
98be9aef9a
A fix for CRAN submission of version 0.7-0 (#3061)
* modify test_helper.R

* fix noLD

* update desc

* fix solaris test

* fix desc

* improve fix

* fix url
2018-01-27 17:06:28 -08:00
Vadim Khotilovich
526801cdb3 [R] fix for the 32 bit windows issue (#2994)
* [R] disable thread_local for 32bit windows

* [R] require C++11 and GNU make in DESCRIPTION

* [R] enable 32+64 build and check in appveyor
2017-12-31 14:18:50 -08:00
Vadim Khotilovich
76f8f51438
[R] AppVeyor CI for R package (#2954)
* [R] fix finding R.exe with cmake on WIN when it is in PATH

* [R] appveyor config for R package

* [R] wrap the lines to make R check happier

* [R] install only binary dep-packages in appveyor

* [R] for MSVC appveyor, also build a binary for R package and keep as an artifact
2017-12-17 16:37:45 -06:00
Vadim Khotilovich
e8a6597957 [R] maintenance Nov 2017; SHAP plots (#2888)
* [R] fix predict contributions for data with no colnames

* [R] add a render parameter for xgb.plot.multi.trees; fixes #2628

* [R] update Rd's

* [R] remove unnecessary dep-package from R cmake install

* silence type warnings; readability

* [R] silence complaint about incomplete line at the end

* [R] initial version of xgb.plot.shap()

* [R] more work on xgb.plot.shap

* [R] enforce black font in xgb.plot.tree; fixes #2640

* [R] if feature names are available, check in predict that they are the same; fixes #2857

* [R] cran check and lint fixes

* remove tabs

* [R] add references; a test for plot.shap
2017-12-05 09:45:34 -08:00
Scott Lundberg
78c4188cec SHAP values for feature contributions (#2438)
* SHAP values for feature contributions

* Fix commenting error

* New polynomial time SHAP value estimation algorithm

* Update API to support SHAP values

* Fix merge conflicts with updates in master

* Correct submodule hashes

* Fix variable sized stack allocation

* Make lint happy

* Add docs

* Fix typo

* Adjust tolerances

* Remove unneeded def

* Fixed cpp test setup

* Updated R API and cleaned up

* Fixed test typo
2017-10-12 12:35:51 -07:00
Vadim Khotilovich
74db9757b3 [R package] GPU support (#2732)
* [R] MSVC compatibility

* [GPU] allow seed in BernoulliRng up to size_t and scale to uint32_t

* R package build with cmake and CUDA

* R package CUDA build fixes and cleanups

* always export the R package native initialization routine on windows

* update the install instructions doc

* fix lint

* use static_cast directly to set BernoulliRng seed

* [R] demo for GPU accelerated algorithm

* tidy up the R package cmake stuff

* R pack cmake: installs main dependency packages if needed

* [R] version bump in DESCRIPTION

* update NEWS

* added short missing/sparse values explanations to FAQ
2017-09-28 18:15:28 -05:00
Bernie Gray
cd7659937b [R] many minor changes to increase the robustness of the R code (#2404)
* many minor changes to increase robustness of R code

* fixing which mistake in xgb.model.dt.tree.R and a few cosmetics
2017-06-15 22:56:23 -05:00
Vadim Khotilovich
c82276386d [R] xgb.importance: fix for multiclass gblinear, new 'trees' parameter (#2388) 2017-06-07 13:13:21 -05:00
Michaël Benesty
8e2a1ff2bf Improve setinfo documentation on R package (#2357) 2017-05-30 20:08:31 +02:00
davidt0x
b29b7d1d76 Fixed loop bound in create.new.tree.features (#2328)
for loop in create.new.tree.features was referencing length(trees) as the upper bound of the loop. trees is a base R dataset and not the model that the code is generating. Changed loop boundary to model$niter which should be the number of trees.
2017-05-30 17:50:33 +02:00
Vadim Khotilovich
da1629e848 [gbtree] fix update process to work with multiclass and multitree; fixes #2315 (#2332) 2017-05-21 23:47:57 -05:00
Vadim Khotilovich
b52db87d5c adding feature contributions to R and gblinear (#2295)
* [gblinear] add features contribution prediction; fix DumpModel bug

* [gbtree] minor changes to PredContrib

* [R] add feature contribution prediction to R

* [R] bump up version; update NEWS

* [gblinear] fix the base_margin issue; fixes #1969

* [R] list of matrices as output of multiclass feature contributions

* [gblinear] make order of DumpModel coefficients consistent: group index changes the fastest
2017-05-21 07:41:51 -04:00
Vadim Khotilovich
c66ca79221 [R] native routines registration (#2290)
* [R] add native routines registration

* c_api.h needs to include <cstdint> since it uses fixed width integer types

* [R] use registered native routines from R code

* [R] bump version; add info on native routine registration to the contributors guide

* make lint happy
2017-05-14 11:00:46 -07:00
Dmitry Nikulin
98ea461532 Fix typo (#2264) 2017-05-07 16:54:48 -07:00
Vadim Khotilovich
a375ad2822 [R] maintenance Apr 2017 (#2237)
* [R] make sure things work for a single split model; fixes #2191

* [R] add option use_int_id to xgb.model.dt.tree

* [R] add example of exporting tree plot to a file

* [R] set save_period = NULL as default in xgboost() to be the same as in xgb.train; fixes #2182

* [R] it's a good practice after CRAN releases to bump up package version in dev

* [R] allow xgb.DMatrix construction from integer dense matrices

* [R] xgb.DMatrix: silent parameter; improve documentation

* [R] xgb.model.dt.tree code style changes

* [R] update NEWS with parameter changes

* [R] code safety & style; handle non-strict matrix and inherited classes of input and model; fixes #2242

* [R] change to x.y.z.p R-package versioning scheme and set version to 0.6.4.3

* [R] add an R package versioning section to the contributors guide

* [R] R-package/README.md: clean up the redundant old installation instructions, link the contributors guide
2017-05-01 22:51:34 -07:00
Qiang Kou (KK)
c441d0916e fix #2228 (#2238) 2017-04-29 18:44:08 -07:00
Seong-Jin Kim
8222755564 Fix typo in R-package README.md (#2190) 2017-04-13 20:22:23 +02:00
Luckick
b0c972aa4d Typo Issue (#2100)
Contruct to Construct
2017-03-16 10:38:25 -07:00
moqiguzhu
5d093a7f4c in caret settings, if you want to do 10*10 cross validation, you need to set repeats=10, number=10 and method=repeatedcv (#2061)
if you set method=cv, only a single 10-fold cross validation is run; fixes #2055
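
The difference described in this commit message comes down to how many train/test splits are produced. The sketch below is plain Python, not caret: `kfold_indices` and `repeated_kfold` are hypothetical helpers that only mimic the idea of `number` (folds) and `repeats`. One pass of 10-fold cross validation yields 10 splits, while 10 repeats yield 100.

```python
import random


def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_kfold(n, k, repeats, seed=0):
    """Return (train, test) index pairs for `repeats` rounds of k-fold CV."""
    rng = random.Random(seed)
    splits = []
    for _ in range(repeats):
        # Each repeat reshuffles, so the folds differ between rounds.
        for test in kfold_indices(n, k, rng):
            held_out = set(test)
            train = [i for i in range(n) if i not in held_out]
            splits.append((train, test))
    return splits


single = repeated_kfold(n=100, k=10, repeats=1)     # analogous to method=cv
repeated = repeated_kfold(n=100, k=10, repeats=10)  # analogous to repeatedcv
print(len(single), len(repeated))  # 10 100
```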
2017-02-25 09:16:19 -05:00
Vadim Khotilovich
b4d97d3cb8 R maintenance Feb2017 (#2045)
* [R] better argument check in xgb.DMatrix; fixes #1480

* [R] showsd was a dummy; fixes #2044

* [R] better categorical encoding explanation in vignette; fixes #1989

* [R] new roxygen version docs update
2017-02-20 10:02:40 -08:00
Vadim Khotilovich
2b5b96d760 [R] various R code maintenance (#1964)
* [R] xgb.save must work when handle is nil but raw exists

* [R] print.xgb.Booster should still print other info when handle is nil

* [R] rename internal function xgb.Booster to xgb.Booster.handle to make its intent clear

* [R] rename xgb.Booster.check to xgb.Booster.complete and make it visible; more docs

* [R] storing evaluation_log should depend only on watchlist, not on verbose

* [R] reduce the excessive chattiness of unit tests

* [R] only disable some tests in windows when it's not 64-bit

* [R] clean-up xgb.DMatrix

* [R] test xgb.DMatrix loading from libsvm text file

* [R] store feature_names in xgb.Booster, use them from utility functions

* [R] remove non-functional co-occurrence computation from xgb.importance

* [R] verbose=0 is enough without a callback

* [R] added forgotten xgb.Booster.complete.Rd; cran check fixes

* [R] update installation instructions
2017-01-21 11:22:46 -08:00
Vadim Khotilovich
87e897f428 [R] fix #1903 (#1929) 2017-01-06 13:16:37 -08:00
Vadim Khotilovich
d7406e07f3 [R] xgb.plot.tree fixes (#1939)
* [R] a few fixes and improvements to xgb.plot.tree

* [R] deprecate n_first_tree replace with trees; fix types in xgb.model.dt.tree
2017-01-06 11:09:51 -08:00
Tong He
ce84af7923 0.6-4 submission (#1935) 2017-01-04 23:31:05 -08:00
Tong He
f5c85836bf [R] Increase the version number, date and required R version (#1920)
* remove unnecessary line
2016-12-30 21:29:26 -08:00
Qiang Kou (KK)
7948d1c799 disable openmp on solaris (#1912) 2016-12-28 11:32:56 -08:00
Tong He
fa97259d66 Bump up version number, add cleanup script (#1886)
* fix cran check

* change required R version because of utils::globalVariables

* temporary commit, monotone not working

* fix test

* fix doc

* fix doc

* fix cran note and warning

* improve checks

* fix urls

* fix cran check

* add cleanup and bump up version number

* use clean in build

* Update Makefile
2016-12-18 15:11:43 -08:00
Yixuan Qiu
b14994aeff [R Package] Use the C++ 11 compiler to test OpenMP flags (#1881)
* fix segfault when gctorture() is enabled

* use the C++ 11 compiler to test OpenMP flags

* auto-generated configure script
2016-12-16 15:11:06 -08:00
Qiang Kou (KK)
5ebd8fb809 autoconf for solaris (#1880) 2016-12-16 21:56:10 +01:00
Tong He
674024c53a [R] Fix for cran submission of xgboost 0.6 (#1875)
fix cran check
2016-12-15 12:04:54 -08:00
Vadim Khotilovich
b21e658a02 [R-package] JSON dump format and a couple of bugfixes (#1855)
* [R-package] JSON tree dump interface

* [R-package] precision bugfix in xgb.attributes

* [R-package] bugfix for cb.early.stop called from xgb.cv

* [R-package] a bit more clarity on labels checking in xgb.cv

* [R-package] test JSON dump for gblinear as well

* whitespace lint
2016-12-11 19:48:39 +01:00
Vadim Khotilovich
a44032d095 [CORE] The update process for a tree model, and its application to feature importance (#1670)
* [CORE] allow updating trees in an existing model

* [CORE] in refresh updater, allow keeping old leaf values and update stats only

* [R-package] xgb.train mod to allow updating trees in an existing model

* [R-package] added check for nrounds when is_update

* [CORE] merge parameter declaration changes; unify their code style

* [CORE] move the update-process trees initialization to Configure; rename default process_type to 'default'; fix the trees and trees_to_update sizes comparison check

* [R-package] unit tests for the update process type

* [DOC] documentation for process_type parameter; improved docs for updater, Gamma and Tweedie; added some parameter aliases; metrics indentation and some were non-documented

* fix my sloppy merge conflict resolutions

* [CORE] add a TreeProcessType enum

* whitespace fix
2016-12-04 09:33:52 -08:00
Tong He
2f3958a455 Fix for CRAN Submission (#1826)
* fix cran check

* change required R version because of utils::globalVariables

* temporary commit, monotone not working

* fix test

* fix doc

* fix doc

* fix cran note and warning

* improve checks

* fix urls
2016-12-02 20:19:03 -08:00
Yuan (Terry) Tang
80c8515457 Bump up the date of R package (#1813) 2016-11-25 03:20:18 -05:00
Simon DENEL
58aa1129ea Fixing a few typos (#1771)
* Fixing a few typos

* Fixing a few typos
2016-11-13 15:47:52 -08:00