98 Commits

Author SHA1 Message Date
Christian Lorentzen
cf4f019ed6
[Breaking] Change default evaluation metric for classification to logloss / mlogloss (#6183)
* Change DefaultEvalMetric of classification from error to logloss

* Change default binary metric in plugin/example/custom_obj.cc

* Set old error metric in python tests

* Set old error metric in R tests

* Fix missed eval metrics and typos in R tests

* Fix setting eval_metric twice in R tests

* Add warning for empty eval_metric for classification

* Fix Dask tests

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-10-02 12:06:47 -07:00
Philip Hyunsu Cho
9e955fb9b0
[R] Check warnings explicitly for model compatibility tests (#6114)
* [R] Check warnings explicitly for model compatibility tests

* Address reviewer's feedback
2020-09-15 10:49:48 -07:00
Jiaming Yuan
b5f52f0b1b
Validate weights are positive values. (#6115) 2020-09-15 09:03:55 +08:00
Vitalie Spinu
1453bee3e7
[R] Remove stringi dependency (#6109)
* [R] Fix empty empty tests and a test warnings

* [R] Remove stringi dependency (fix #5905)

* Fix R lint check

* [R] Fix automatic conversion to factor in R < 4.0.0 in xgb.model.dt.tree

* Add `R` Makefile variable

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-09-12 13:18:08 -07:00
Tong He
3912f3de06
Updates from 1.2.0 cran submission (#6077)
* update for 1.2.0 cran submission

* recover cmakelists

* fix unittest from the shap PR

* trigger CI
2020-09-02 20:50:23 +08:00
Jiaming Yuan
20c95be625
Expand categorical node. (#6028)
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-25 18:53:57 +08:00
Cuong Duong
e51cba6195
Add SHAP summary plot using ggplot2 (#5882)
* add SHAP summary plot using ggplot2

* Update xgb.plot.shap

* Update example in xgb.plot.shap documentation

* update logic, add tests

* whitespace fixes

* whitespace fixes for test_helpers

* namespace for sd function

* explicitly declare variables that are automatically evaluated by data.table

* Fix R lint

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-18 18:04:09 -07:00
James Lamb
589b385ec6
[R] fix uses of 1:length(x) and other small things (#5992) 2020-08-09 03:31:33 +08:00
Jiaming Yuan
18349a7ccf
[Breaking] Fix custom metric for multi output. (#5954)
* Set output margin to true for custom metric.  This fixes only R and Python.
2020-07-29 19:25:27 +08:00
Philip Hyunsu Cho
5879acde9a
[CI] Improve R linter script (#5944)
* [CI] Move lint to a separate script

* [CI] Improved lintr launcher

* Add lintr as a separate action

* Add custom parsing logic to print out logs

* Fix lintr issues in demos

* Run R demos

* Fix CRAN checks

* Install XGBoost into R env before running lintr

* Install devtools (needed to run demos)
2020-07-27 00:55:35 -07:00
Philip Hyunsu Cho
6347fa1c2e
[R] Enable weighted learning to rank (#5945)
* [R] enable weighted learning to rank

* Add R unit test for ranking

* Fix lint
2020-07-26 21:10:36 -07:00
Philip Hyunsu Cho
ace7fd328b
[R] Add a compatibility layer to load Booster object from an old RDS file (#5940)
* [R] Add a compatibility layer to load Booster from an old RDS
* Modify QuantileHistMaker::LoadConfig() to be backward compatible with 1.1.x
* Add a big warning about compatibility in QuantileHistMaker::LoadConfig()
* Add testing suite
* Discourage use of saveRDS() in CRAN doc
2020-07-26 00:06:49 -07:00
Jiaming Yuan
bc1d3ee230
Fix r early stop with custom objective. (#5923)
* Specify `ntreelimit`.
2020-07-23 03:28:17 +08:00
Jiaming Yuan
8b1afce316
Add Github Action for R. (#5911)
* Fix lintr errors.
2020-07-20 19:23:36 +08:00
James Lamb
c35be9dc40
[R] replace uses of T and F with TRUE and FALSE (#5778)
* [R-package] replace uses of T and F with TRUE and FALSE

* enable linting

* Remove skip

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-06-11 06:08:02 -04:00
Philip Hyunsu Cho
e4f5b6c84f
Port R compatibility patches from 1.0.0 release branch (#5577)
* Don't use memset to set struct when compiling for R

* Support 32-bit Solaris target for R package
2020-04-21 22:51:18 -07:00
Jiaming Yuan
c355ab65ed
Enable parameter validation for R. (#5569)
* Enable parameter validation for R.

* Add test.
2020-04-21 11:19:09 -07:00
Jiaming Yuan
564b22cee5
Restore attributes in complete. (#5573) 2020-04-21 11:06:55 -07:00
Jiaming Yuan
9c1103e06c
[Breaking] Set output margin to True for custom objective. (#5564)
* Set output margin to True for custom objective in Python and R.

* Add a demo for writing multi-class custom objective function.

* Run tests on selected demos.
2020-04-20 20:44:12 +08:00
Jiaming Yuan
e1f22baf8c
Fix slice and get info. (#5552) 2020-04-18 18:00:13 +08:00
Jiaming Yuan
c245eb8755
Fix r interaction constraints (#5543)
* Unify the parsing code.

* Cleanup.
2020-04-18 06:53:51 +08:00
Jiaming Yuan
b56c902841
[R] R raw serialization. (#5123)
* Add bindings for serialization.
* Change `xgb.save.raw' into full serialization instead of simple model.
* Add `xgb.load.raw' for unserialization.
* Run devtools.
2020-04-11 17:16:54 +08:00
Jiaming Yuan
d0b86c75d9
Remove silent parameter. (#5476) 2020-04-03 08:03:26 +08:00
Jiaming Yuan
ed2465cce4
Add configuration to R interface. (#5217)
* Save and load internal parameter configuration as JSON.
2020-02-16 03:01:58 +08:00
Jiaming Yuan
911a902835
Merge model compatibility fixes from 1.0rc branch. (#5305)
* Port test model compatibility.
* Port logit model fix.

https://github.com/dmlc/xgboost/pull/5248
https://github.com/dmlc/xgboost/pull/5281
2020-02-13 20:41:58 +08:00
Jiaming Yuan
5199b86126
Fix R dart prediction. (#5204)
* Fix R dart prediction and add test.
2020-01-16 12:11:04 +08:00
Kodi Arfer
f100b8d878 [Breaking] Don't drop trees during DART prediction by default (#5115)
* Simplify DropTrees calling logic

* Add `training` parameter for prediction method.

* [Breaking]: Add `training` to C API.

* Change for R and Python custom objective.

* Correct comment.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2020-01-13 21:48:30 +08:00
Kodi Arfer
f2277e7106 Use DART tree weights when computing SHAPs (#5050)
This PR fixes tree weights in dart being ignored when computing contributions.

* Fix ellpack page source link.
* Add tree weights to compute contribution.
2019-12-03 19:55:53 +08:00
Tong He
7b74b1b64d fix additional files note (#4699)
* fix additional files note

* Trigger CI

* Trigger CI
2019-07-25 00:37:38 -07:00
Tong He
8ac8fbef29
[R] Fix CRAN error for Mac OS X (#4672)
* fix cran error for mac os x

* ignore float on windows check for now
2019-07-17 17:55:52 -07:00
Philip Hyunsu Cho
4e9fad74eb
[R] Use built-in label when xgb.DMatrix is given to xgb.cv() (#4631)
* Use built-in label when xgb.DMatrix is given to xgb.cv()

* Add a test

* Fix test

* Bump version number
2019-07-03 01:32:40 -07:00
Jiaming Yuan
5de7e12704
Change obj name to reg:squarederror in learner. (#4427)
* Change memory dump size in R test.
2019-05-06 21:35:35 +08:00
Jiaming Yuan
29a1356669
Deprecate reg:linear' in favor of reg:squarederror'. (#4267)
* Deprecate `reg:linear' in favor of `reg:squarederror'.
* Replace the use of `reg:linear'.
* Replace the use of `silent`.
2019-03-17 17:55:04 +08:00
Tong He
259fb809e9
fix R-devel errors (#4251) 2019-03-12 10:06:54 -07:00
Kodi Arfer
6a569b8cd9 Avoid generating NaNs in UnwoundPathSum (#3943)
* Avoid generating NaNs in UnwoundPathSum.

Kudos to Jakub Zakrzewski for tracking down the bug.

* Add a test
2019-01-03 15:04:46 -08:00
Philip Hyunsu Cho
b38c636d05
Fix #3523: Fix CustomGlobalRandomEngine for R (#3781)
**Symptom** Apple Clang's implementation of `std::shuffle` expects doesn't work
correctly when it is run with the random bit generator for R package:
```cpp
CustomGlobalRandomEngine::result_type
CustomGlobalRandomEngine::operator()() {
  return static_cast<result_type>(
      std::floor(unif_rand() * CustomGlobalRandomEngine::max()));
}
```

Minimial reproduction of failure (compile using Apple Clang 10.0):
```cpp
std::vector<int> feature_set(100);
std::iota(feature_set.begin(), feature_set.end(), 0);
    // initialize with 0, 1, 2, 3, ..., 99
std::shuffle(feature_set.begin(), feature_set.end(), common::GlobalRandom());
    // This returns 0, 1, 2, ..., 99, so content didn't get shuffled at all!!!
```

Note that this bug is platform-dependent; it does not appear when GCC or
upstream LLVM Clang is used.

**Diagnosis** Apple Clang's `std::shuffle` expects 32-bit integer
inputs, whereas `CustomGlobalRandomEngine::operator()` produces 64-bit
integers.

**Fix** Have `CustomGlobalRandomEngine::operator()` produce 32-bit integers.

Closes #3523.
2018-10-15 09:39:13 -07:00
Philip Hyunsu Cho
fbe9d41dd0 Disable flaky tests in R-package/tests/testthat/test_update.R (#3723) 2018-09-26 14:21:41 -07:00
Vadim Khotilovich
ad3a0bbab8
Add the missing max_delta_step (#3668)
* add max_delta_step to SplitEvaluator

* test for max_delta_step

* missing x2 factor for L1 term

* remove gamma from ElasticNet
2018-09-12 08:43:41 -05:00
Philip Hyunsu Cho
c87153ed32
Fix CRAN check by removing reference to std::cerr (#3660)
* Fix CRAN check by removing reference to std::cerr

* Mask tests that fail on 32-bit Windows R
2018-09-05 11:44:00 -07:00
Andrew Thia
9254c58e4d [TREE] add interaction constraints (#3466)
* add interaction constraints

* enable both interaction and monotonic constraints at the same time

* fix lint

* add R test, fix lint, update demo

* Use dmlc::JSONReader to express interaction constraints as nested lists; Use sparse arrays for bookkeeping

* Add Python test for interaction constraints

* make R interaction constraints parameter based on feature index instead of column names, fix R coding style

* Fix lint

* Add BlueTea88 to CONTRIBUTORS.md

* Short circuit when no constraint is specified; address review comments

* Add tutorial for feature interaction constraints

* allow interaction constraints to be passed as string, remove redundant column_names argument

* Fix typo

* Address review comments

* Add comments to Python test
2018-09-04 09:35:39 -07:00
Vadim Khotilovich
5b662cbe1c
[R] R-interface for SHAP interactions (#3636)
* add R-interface for SHAP interactions

* update docs for new roxygen version
2018-08-30 19:06:21 -05:00
Jakob Richter
725f4c36f2 replace nround with nrounds to match actual parameter (#3592) 2018-08-15 11:13:53 -07:00
Philip Hyunsu Cho
109473dae2
Fix #3545: XGDMatrixCreateFromCSCEx silently discards empty trailing rows (#3553)
* Fix #3545: XGDMatrixCreateFromCSCEx silently discards empty trailing rows

Description: The bug is triggered when

1. The data matrix has empty rows at the bottom. More precisely, the rows
   `n-k+1`, `n-k+2`, ..., `n` of the matrix have missing values in all
   dimensions (`n` number of instances, `k` number of trailing rows)
2. The data matrix is given as Compressed Sparse Column (CSC) format.

Diagnosis: When the CSC matrix is converted to Compressed Sparse Row (CSR)
format (this is common format used for DMatrix), the trailing empty rows
are silently ignored. More specifically, the row pointer (`offset`) of the
newly created CSR matrix does not take account of these rows.

Fix: Modify the row pointer.

* Add regression test
2018-08-05 10:15:42 -07:00
Tong He
098075b81b
CRAN Submission for 0.71.1 (#3311)
* fix for CRAN manual checks

* fix for CRAN manual checks

* pass local check

* fix variable naming style

* Adding Philip's record
2018-05-14 17:32:39 -07:00
Vadim Khotilovich
706be4e5d4
Additional improvements for gblinear (#3134)
* fix rebase conflict

* [core] additional gblinear improvements

* [R] callback for gblinear coefficients history

* force eta=1 for gblinear python tests

* add top_k to GreedyFeatureSelector

* set eta=1 in shotgun test

* [core] fix SparsePage processing in gblinear; col-wise multithreading in greedy updater

* set sorted flag within TryInitColData

* gblinear tests: use scale, add external memory test

* fix multiclass for greedy updater

* fix whitespace

* fix typo
2018-03-13 01:27:13 -05:00
Tong He
98be9aef9a
A fix for CRAN submission of version 0.7-0 (#3061)
* modify test_helper.R

* fix noLD

* update desc

* fix solaris test

* fix desc

* improve fix

* fix url
2018-01-27 17:06:28 -08:00
Vadim Khotilovich
e8a6597957 [R] maintenance Nov 2017; SHAP plots (#2888)
* [R] fix predict contributions for data with no colnames

* [R] add a render parameter for xgb.plot.multi.trees; fixes #2628

* [R] update Rd's

* [R] remove unnecessary dep-package from R cmake install

* silence type warnings; readability

* [R] silence complaint about incomplete line at the end

* [R] initial version of xgb.plot.shap()

* [R] more work on xgb.plot.shap

* [R] enforce black font in xgb.plot.tree; fixes #2640

* [R] if feature names are available, check in predict that they are the same; fixes #2857

* [R] cran check and lint fixes

* remove tabs

* [R] add references; a test for plot.shap
2017-12-05 09:45:34 -08:00
Scott Lundberg
78c4188cec SHAP values for feature contributions (#2438)
* SHAP values for feature contributions

* Fix commenting error

* New polynomial time SHAP value estimation algorithm

* Update API to support SHAP values

* Fix merge conflicts with updates in master

* Correct submodule hashes

* Fix variable sized stack allocation

* Make lint happy

* Add docs

* Fix typo

* Adjust tolerances

* Remove unneeded def

* Fixed cpp test setup

* Updated R API and cleaned up

* Fixed test typo
2017-10-12 12:35:51 -07:00
Vadim Khotilovich
c82276386d [R] xgb.importance: fix for multiclass gblinear, new 'trees' parameter (#2388) 2017-06-07 13:13:21 -05:00
Vadim Khotilovich
da1629e848 [gbtree] fix update process to work with multiclass and multitree; fixes #2315 (#2332) 2017-05-21 23:47:57 -05:00