63 Commits

Author SHA1 Message Date
Philip Hyunsu Cho
b38c636d05
Fix #3523: Fix CustomGlobalRandomEngine for R (#3781)
**Symptom** Apple Clang's implementation of `std::shuffle` doesn't work
correctly when it is run with the random bit generator for the R package:
```cpp
CustomGlobalRandomEngine::result_type
CustomGlobalRandomEngine::operator()() {
  return static_cast<result_type>(
      std::floor(unif_rand() * CustomGlobalRandomEngine::max()));
}
```

Minimal reproduction of failure (compile using Apple Clang 10.0):
```cpp
std::vector<int> feature_set(100);
std::iota(feature_set.begin(), feature_set.end(), 0);
    // initialize with 0, 1, 2, 3, ..., 99
std::shuffle(feature_set.begin(), feature_set.end(), common::GlobalRandom());
    // This returns 0, 1, 2, ..., 99, so content didn't get shuffled at all!!!
```

Note that this bug is platform-dependent; it does not appear when GCC or
upstream LLVM Clang is used.

**Diagnosis** Apple Clang's `std::shuffle` expects 32-bit integer
inputs, whereas `CustomGlobalRandomEngine::operator()` produces 64-bit
integers.

**Fix** Have `CustomGlobalRandomEngine::operator()` produce 32-bit integers.
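
A minimal sketch of what a 32-bit generator of this kind might look like, under stated assumptions: `StandInUnifRand` is a hypothetical replacement for R's `unif_rand()` (so the example compiles without the R headers), and `Sketch32BitEngine` and `ShuffledFeatureSet` are illustrative names, not the code from this commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical stand-in for R's unif_rand(): returns a double in [0, 1).
// The R package would call unif_rand() from the R API instead.
inline double StandInUnifRand() {
  static std::mt19937 gen(42);  // fixed seed keeps the sketch reproducible
  return static_cast<double>(gen()) / 4294967296.0;  // divide by 2^32
}

// Illustrative engine whose result_type is 32 bits wide, so the values it
// produces match what a 32-bit-oriented std::shuffle expects.
class Sketch32BitEngine {
 public:
  using result_type = std::uint32_t;
  static constexpr result_type min() { return 0; }
  static constexpr result_type max() {
    return std::numeric_limits<result_type>::max();
  }
  result_type operator()() {
    const double u = StandInUnifRand();
    assert(u >= 0.0 && u < 1.0);
    // Same scaling as the snippet in the commit message, with a 32-bit max().
    return static_cast<result_type>(std::floor(u * max()));
  }
};

// Builds 0, 1, ..., n-1 and shuffles it with the 32-bit engine.
std::vector<int> ShuffledFeatureSet(int n) {
  std::vector<int> feature_set(n);
  std::iota(feature_set.begin(), feature_set.end(), 0);
  Sketch32BitEngine engine;
  std::shuffle(feature_set.begin(), feature_set.end(), engine);
  return feature_set;
}
```

Calling `ShuffledFeatureSet(100)` should return a permutation of 0 to 99; the exact order depends on the standard library's `std::shuffle` implementation.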

Closes #3523.
2018-10-15 09:39:13 -07:00
Philip Hyunsu Cho
fbe9d41dd0 Disable flaky tests in R-package/tests/testthat/test_update.R (#3723) 2018-09-26 14:21:41 -07:00
Vadim Khotilovich
ad3a0bbab8
Add the missing max_delta_step (#3668)
* add max_delta_step to SplitEvaluator

* test for max_delta_step

* missing x2 factor for L1 term

* remove gamma from ElasticNet
2018-09-12 08:43:41 -05:00
Philip Hyunsu Cho
c87153ed32
Fix CRAN check by removing reference to std::cerr (#3660)
* Fix CRAN check by removing reference to std::cerr

* Mask tests that fail on 32-bit Windows R
2018-09-05 11:44:00 -07:00
Andrew Thia
9254c58e4d [TREE] add interaction constraints (#3466)
* add interaction constraints

* enable both interaction and monotonic constraints at the same time

* fix lint

* add R test, fix lint, update demo

* Use dmlc::JSONReader to express interaction constraints as nested lists; Use sparse arrays for bookkeeping

* Add Python test for interaction constraints

* make R interaction constraints parameter based on feature index instead of column names, fix R coding style

* Fix lint

* Add BlueTea88 to CONTRIBUTORS.md

* Short circuit when no constraint is specified; address review comments

* Add tutorial for feature interaction constraints

* allow interaction constraints to be passed as string, remove redundant column_names argument

* Fix typo

* Address review comments

* Add comments to Python test
2018-09-04 09:35:39 -07:00
Vadim Khotilovich
5b662cbe1c
[R] R-interface for SHAP interactions (#3636)
* add R-interface for SHAP interactions

* update docs for new roxygen version
2018-08-30 19:06:21 -05:00
Jakob Richter
725f4c36f2 replace nround with nrounds to match actual parameter (#3592) 2018-08-15 11:13:53 -07:00
Philip Hyunsu Cho
109473dae2
Fix #3545: XGDMatrixCreateFromCSCEx silently discards empty trailing rows (#3553)
* Fix #3545: XGDMatrixCreateFromCSCEx silently discards empty trailing rows

Description: The bug is triggered when

1. The data matrix has empty rows at the bottom. More precisely, the rows
   `n-k+1`, `n-k+2`, ..., `n` of the matrix have missing values in all
   dimensions (`n` is the number of instances, `k` is the number of trailing empty rows).
2. The data matrix is given as Compressed Sparse Column (CSC) format.

Diagnosis: When the CSC matrix is converted to Compressed Sparse Row (CSR)
format (the common format used for DMatrix), the trailing empty rows
are silently ignored. More specifically, the row pointer (`offset`) of the
newly created CSR matrix does not account for these rows.

Fix: Modify the row pointer.

* Add regression test
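
The diagnosis above can be illustrated with a small, generic CSC-to-CSR conversion. This is a sketch under assumptions, not the XGBoost implementation: `CsrSketch` and `CscToCsrSketch` are hypothetical names, and the caller is assumed to pass the row count explicitly. Sizing the row pointer from that explicit count, rather than from the largest row index seen, is what keeps trailing empty rows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container; field names are illustrative only.
struct CsrSketch {
  std::vector<std::size_t> offset;   // row pointer, length num_row + 1
  std::vector<unsigned> col_index;   // column index of each stored entry
  std::vector<float> value;          // stored entries, grouped by row
};

// Converts a CSC matrix to CSR. num_row is given explicitly so that rows
// with no stored entries (including trailing ones) still get an offset slot.
CsrSketch CscToCsrSketch(const std::vector<std::size_t>& col_ptr,
                         const std::vector<unsigned>& row_index,
                         const std::vector<float>& value,
                         std::size_t num_row) {
  CsrSketch out;
  out.offset.assign(num_row + 1, 0);

  // Count the entries in each row.
  for (unsigned r : row_index) {
    assert(r < num_row);
    ++out.offset[r + 1];
  }
  // A prefix sum turns the counts into row start positions.
  for (std::size_t i = 0; i < num_row; ++i) {
    out.offset[i + 1] += out.offset[i];
  }

  // Scatter each CSC entry into its row's next free slot.
  out.col_index.resize(value.size());
  out.value.resize(value.size());
  std::vector<std::size_t> next(out.offset.begin(), out.offset.end() - 1);
  for (std::size_t c = 0; c + 1 < col_ptr.size(); ++c) {
    for (std::size_t k = col_ptr[c]; k < col_ptr[c + 1]; ++k) {
      const std::size_t dst = next[row_index[k]]++;
      out.col_index[dst] = static_cast<unsigned>(c);
      out.value[dst] = value[k];
    }
  }
  return out;
}
```

For a 4-row, 2-column matrix whose last two rows are empty, the resulting `offset` has five entries, and the final entries repeat the total entry count to mark the empty rows.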
2018-08-05 10:15:42 -07:00
Tong He
098075b81b
CRAN Submission for 0.71.1 (#3311)
* fix for CRAN manual checks

* fix for CRAN manual checks

* pass local check

* fix variable naming style

* Adding Philip's record
2018-05-14 17:32:39 -07:00
Vadim Khotilovich
706be4e5d4
Additional improvements for gblinear (#3134)
* fix rebase conflict

* [core] additional gblinear improvements

* [R] callback for gblinear coefficients history

* force eta=1 for gblinear python tests

* add top_k to GreedyFeatureSelector

* set eta=1 in shotgun test

* [core] fix SparsePage processing in gblinear; col-wise multithreading in greedy updater

* set sorted flag within TryInitColData

* gblinear tests: use scale, add external memory test

* fix multiclass for greedy updater

* fix whitespace

* fix typo
2018-03-13 01:27:13 -05:00
Tong He
98be9aef9a
A fix for CRAN submission of version 0.7-0 (#3061)
* modify test_helper.R

* fix noLD

* update desc

* fix solaris test

* fix desc

* improve fix

* fix url
2018-01-27 17:06:28 -08:00
Vadim Khotilovich
e8a6597957 [R] maintenance Nov 2017; SHAP plots (#2888)
* [R] fix predict contributions for data with no colnames

* [R] add a render parameter for xgb.plot.multi.trees; fixes #2628

* [R] update Rd's

* [R] remove unnecessary dep-package from R cmake install

* silence type warnings; readability

* [R] silence complaint about incomplete line at the end

* [R] initial version of xgb.plot.shap()

* [R] more work on xgb.plot.shap

* [R] enforce black font in xgb.plot.tree; fixes #2640

* [R] if feature names are available, check in predict that they are the same; fixes #2857

* [R] cran check and lint fixes

* remove tabs

* [R] add references; a test for plot.shap
2017-12-05 09:45:34 -08:00
Scott Lundberg
78c4188cec SHAP values for feature contributions (#2438)
* SHAP values for feature contributions

* Fix commenting error

* New polynomial time SHAP value estimation algorithm

* Update API to support SHAP values

* Fix merge conflicts with updates in master

* Correct submodule hashes

* Fix variable sized stack allocation

* Make lint happy

* Add docs

* Fix typo

* Adjust tolerances

* Remove unneeded def

* Fixed cpp test setup

* Updated R API and cleaned up

* Fixed test typo
2017-10-12 12:35:51 -07:00
Vadim Khotilovich
c82276386d [R] xgb.importance: fix for multiclass gblinear, new 'trees' parameter (#2388) 2017-06-07 13:13:21 -05:00
Vadim Khotilovich
da1629e848 [gbtree] fix update process to work with multiclass and multitree; fixes #2315 (#2332) 2017-05-21 23:47:57 -05:00
Vadim Khotilovich
b52db87d5c adding feature contributions to R and gblinear (#2295)
* [gblinear] add features contribution prediction; fix DumpModel bug

* [gbtree] minor changes to PredContrib

* [R] add feature contribution prediction to R

* [R] bump up version; update NEWS

* [gblinear] fix the base_margin issue; fixes #1969

* [R] list of matrices as output of multiclass feature contributions

* [gblinear] make order of DumpModel coefficients consistent: group index changes the fastest
2017-05-21 07:41:51 -04:00
Vadim Khotilovich
a375ad2822 [R] maintenance Apr 2017 (#2237)
* [R] make sure things work for a single split model; fixes #2191

* [R] add option use_int_id to xgb.model.dt.tree

* [R] add example of exporting tree plot to a file

* [R] set save_period = NULL as default in xgboost() to be the same as in xgb.train; fixes #2182

* [R] it's a good practice after CRAN releases to bump up package version in dev

* [R] allow xgb.DMatrix construction from integer dense matrices

* [R] xgb.DMatrix: silent parameter; improve documentation

* [R] xgb.model.dt.tree code style changes

* [R] update NEWS with parameter changes

* [R] code safety & style; handle non-strict matrix and inherited classes of input and model; fixes #2242

* [R] change to x.y.z.p R-package versioning scheme and set version to 0.6.4.3

* [R] add an R package versioning section to the contributors guide

* [R] R-package/README.md: clean up the redundant old installation instructions, link the contributors guide
2017-05-01 22:51:34 -07:00
Vadim Khotilovich
2b5b96d760 [R] various R code maintenance (#1964)
* [R] xgb.save must work when handle is nil but raw exists

* [R] print.xgb.Booster should still print other info when handle is nil

* [R] rename internal function xgb.Booster to xgb.Booster.handle to make its intent clear

* [R] rename xgb.Booster.check to xgb.Booster.complete and make it visible; more docs

* [R] storing evaluation_log should depend only on watchlist, not on verbose

* [R] reduce the excessive chattiness of unit tests

* [R] only disable some tests in windows when it's not 64-bit

* [R] clean-up xgb.DMatrix

* [R] test xgb.DMatrix loading from libsvm text file

* [R] store feature_names in xgb.Booster, use them from utility functions

* [R] remove non-functional co-occurrence computation from xgb.importance

* [R] verbose=0 is enough without a callback

* [R] added forgotten xgb.Booster.complete.Rd; cran check fixes

* [R] update installation instructions
2017-01-21 11:22:46 -08:00
Tong He
ce84af7923 0.6-4 submission (#1935) 2017-01-04 23:31:05 -08:00
Vadim Khotilovich
b21e658a02 [R-package] JSON dump format and a couple of bugfixes (#1855)
* [R-package] JSON tree dump interface

* [R-package] precision bugfix in xgb.attributes

* [R-package] bugfix for cb.early.stop called from xgb.cv

* [R-package] a bit more clarity on labels checking in xgb.cv

* [R-package] test JSON dump for gblinear as well

* whitespace lint
2016-12-11 19:48:39 +01:00
Vadim Khotilovich
a44032d095 [CORE] The update process for a tree model, and its application to feature importance (#1670)
* [CORE] allow updating trees in an existing model

* [CORE] in refresh updater, allow keeping old leaf values and update stats only

* [R-package] xgb.train mod to allow updating trees in an existing model

* [R-package] added check for nrounds when is_update

* [CORE] merge parameter declaration changes; unify their code style

* [CORE] move the update-process trees initialization to Configure; rename default process_type to 'default'; fix the trees and trees_to_update sizes comparison check

* [R-package] unit tests for the update process type

* [DOC] documentation for process_type parameter; improved docs for updater, Gamma and Tweedie; added some parameter aliases; metrics indentation and some were non-documented

* fix my sloppy merge conflict resolutions

* [CORE] add a TreeProcessType enum

* whitespace fix
2016-12-04 09:33:52 -08:00
Vadim Khotilovich
f9648ac320 [R-package] store numeric attributes with higher precision (#1628) 2016-10-03 11:01:17 -07:00
Vadim Khotilovich
693ddb860e More robust DMatrix creation from a sparse matrix (#1606)
* [CORE] DMatrix from sparse w/ explicit #col #row; safer arg types

* [python-package] c-api change for _init_from_csr _init_from_csc

* fix spaces

* [R-package] adopt the new XGDMatrixCreateFromCSCEx interface

* [CORE] redirect old sparse creators to new ones
2016-09-25 10:01:22 -07:00
Tong He
4733357278 [R] Monotonic Constraints in Tree Construction (#1557)
* fix cran check

* change required R version because of utils::globalVariables

* temporary commit, monotone not working

* fix test

* fix doc

* fix doc
2016-09-11 22:16:33 -07:00
Vadim Khotilovich
bdfa8c0e09 [R-package] a few fixes for R (#1485)
* [R] fix #1465

* [R] add sanity check to fix #1434

* [R] some clean-ups for custom obj&eval; require maximize only for early stopping
2016-08-20 05:09:03 -05:00
Tong He
b8e6551734 Add unittest for garbage collection's safety in R (#1490)
* Add test for garbage collection safety
2016-08-19 16:55:03 -07:00
Vadim Khotilovich
d5c143367d [R-package] GPL2 dependency reduction and some fixes (#1401)
* [R] do not remove zero coefficients from gblinear dump

* [R] switch from stringr to stringi

* fix #1399

* [R] separate ggplot backend, add base r graphics, cleanup, more plots, tests

* add missing include in amalgamation - fixes building R package in linux

* add forgotten file

* [R] fix DESCRIPTION

* [R] fix travis check issue and some cleanup
2016-07-27 00:05:04 -07:00
Vadim Khotilovich
344d7b4699 [R] disable for now some of the RF tests that fail in travis 2016-06-27 02:49:23 -05:00
Vadim Khotilovich
fd4300b95a [R] additional and modified tests 2016-06-27 02:00:46 -05:00
Vadim Khotilovich
f34f9fb9f7 R-callbacks tests + other tests brushup 2016-06-09 02:53:37 -05:00
Vadim Khotilovich
be65949ba2 xgb.model.dt.tree up to x100 faster 2016-05-17 00:24:06 -05:00
Vadim Khotilovich
8664217a5a [R] more attribute handling functionality 2016-05-14 18:19:18 -05:00
Vadim Khotilovich
79c7c9e5bb R accessors for model attributes 2016-05-02 00:20:44 -05:00
Vadim Khotilovich
4b760762f9 added unit tests for xgb.DMatrix 2016-03-27 19:23:08 -05:00
tqchen
634db18a0f [TRAVIS] cleanup travis script 2016-01-16 10:25:12 -08:00
Groves
cd57ea2784 Add test that model parameters are accessible within R 2015-12-16 10:24:16 -06:00
pommedeterresautee
1678a6fbdb Increase cover of tests #Rstat 2015-12-02 10:40:15 +01:00
pommedeterresautee
b05d5d3f24 Improve feature importance on GLM model 2015-12-01 18:44:25 +01:00
pommedeterresautee
6ce57d9cf8 Add new tests for helper functions 2015-12-01 15:44:27 +01:00
pommedeterresautee
730bd72056 some fixes for Travis #Rstat 2015-11-30 15:47:10 +01:00
pommedeterresautee
c09c02300a Add new tests for new functions 2015-11-30 15:04:17 +01:00
pommedeterresautee
376ba6912e Update test to take care of API change 2015-11-30 14:08:27 +01:00
terrytangyuan
51ee382517 Frequence to Frequency 2015-11-20 20:25:29 -06:00
terrytangyuan
15a0d27eed Fixed bug in eta decay (+2 squashed commits)
Squashed commits:
[b67caf2] Fix build
[365ceaa] Fixed bug in eta decay
2015-10-31 12:54:27 -04:00
terrytangyuan
888edba03f Added test for eta decay (+3 squashed commits)
Squashed commits:
[9109887] Added test for eta decay(+1 squashed commit)
Squashed commits:
[1336bd4] Added tests for eta decay (+2 squashed commit)
Squashed commits:
[91aac2d] Added tests for eta decay (+1 squashed commit)
Squashed commits:
[3ff48e7] Added test for eta decay
[6bb1eed] Rewrote Rd files
[bf0dec4] Added learning_rates for diff eta in each boosting round
2015-10-31 12:36:29 -04:00
terrytangyuan
c817efbd8a Fix Travis build 2015-10-30 23:41:24 -04:00
terrytangyuan
e23f4ec3db Minor addition to R unit tests 2015-10-30 19:48:00 -05:00
terrytangyuan
5b9e071c18 Fix travis build (+1 squashed commit)
Squashed commits:
[9240d5f] Fix Travis build
2015-10-29 00:28:53 -04:00
terrytangyuan
6024480400 Fixed most of the lint issues 2015-10-28 23:24:17 -04:00
terrytangyuan
8bae715994 Lint fix on infix operators 2015-10-28 23:04:45 -04:00