* Fix loading old logit model.
* Add a helper script for converting old pickle file.
* Add version as a model parameter.
* Remove the size check in R test to relax the size constraint.
* Add missing R doc for passing linting. Run devtools.
* Cleanup old model IO logic.
* Test compatibility on CI.
* Make the argument required.
* Simplify DropTrees calling logic
* Add `training` parameter for prediction method.
* [Breaking]: Add `training` to C API.
* Change for R and Python custom objective.
* Correct comment.
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
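
The `training` flag added to the prediction method above lets the caller say whether a prediction happens during training (where a DART-style booster may drop trees) or at inference time (where every tree should be used). The toy class below is only a sketch of that idea: `ToyDartBooster`, `tree_outputs` and `rate_drop` are hypothetical names for illustration, each "tree" is reduced to a constant, and the weight rescaling a real DART booster performs is left out.

```python
import random


class ToyDartBooster:
    """Toy additive ensemble; each 'tree' is just a constant contribution."""

    def __init__(self, tree_outputs, rate_drop, seed=0):
        self.tree_outputs = list(tree_outputs)
        self.rate_drop = rate_drop
        self.rng = random.Random(seed)

    def predict(self, training=False):
        # Trees are dropped only when the caller marks this as a
        # training-time prediction; inference always uses every tree.
        if training:
            kept = [w for w in self.tree_outputs
                    if self.rng.random() >= self.rate_drop]
        else:
            kept = self.tree_outputs
        return sum(kept)


booster = ToyDartBooster([0.5, 0.25, 0.125], rate_drop=0.5)
inference_pred = booster.predict()               # deterministic, all trees
training_pred = booster.predict(training=True)   # may omit some trees
```

Making the flag explicit keeps the decision with the caller instead of having the booster guess from the data it is given.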
* Fix syncing DMatrix columns.
* notes for tree method.
* Enable feature validation for all interfaces except for jvm.
* Better tests for boosting from predictions.
* Disable validation on JVM.
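
Feature validation, as enabled above, amounts to checking that the data handed to prediction describes the same features the model was trained on. A minimal sketch, assuming features are identified by an ordered list of names; `validate_features` is a hypothetical helper written for illustration, not the function used in XGBoost:

```python
def validate_features(trained_names, incoming_names):
    """Raise ValueError unless both name lists match, including order."""
    trained = list(trained_names)
    incoming = list(incoming_names)
    if trained != incoming:
        missing = [n for n in trained if n not in incoming]
        unexpected = [n for n in incoming if n not in trained]
        raise ValueError(
            "feature names mismatch (order matters): "
            "missing=%s unexpected=%s" % (missing, unexpected)
        )


# Matching names pass silently.
validate_features(["f0", "f1"], ["f0", "f1"])
```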
- Install wget explicitly to match openssl.
- Install CMake explicitly.
- Use newer miniconda link.
- Reenable unittests.
- Use gcc@9 + xcode@10 on OSX because of a missing <_stdio.h>. Other versions of gcc should also work, but Homebrew pours gcc@9 by default after an update, so I am sticking with the latest version.
- Disabled one external memory test on OSX. I am not sure about the thread implementation there, and fixing external memory is beyond the scope of this PR.
- Use Python3 with conda in jvm package.
* Extract interaction constraints from split evaluator.
The main reason is model IO: num_feature and interaction_constraints are copied into the split evaluator. An interaction constraint is also a feature selector in its own right, acting like the column sampler, so burying it deep in the evaluator chain is inefficient. Lastly, removing one more copied parameter is a win.
* Enable interaction constraints for approx tree method.
Now that the implementation is split out from the evaluator class, it is also enabled for the approx method.
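
To illustrate why an interaction constraint behaves like a feature selector, the sketch below computes which features may still be split on at a node, given the features already used on the path from the root. It is a simplified model of the semantics and not the extracted XGBoost code: `allowed_features` is a hypothetical helper, and features that appear in no constraint group are simply ignored here.

```python
def allowed_features(constraints, used_on_path, n_features):
    """Return the set of feature indices a node may split on."""
    used = set(used_on_path)
    if not constraints:
        # No constraint given: every feature is allowed.
        return set(range(n_features))
    allowed = set()
    for group in constraints:
        # A group stays usable only if it contains every feature
        # already used on the path to this node.
        if used <= set(group):
            allowed |= set(group)
    return allowed


constraints = [[0, 1], [2, 3, 4]]
at_root = allowed_features(constraints, [], 5)
after_split_on_0 = allowed_features(constraints, [0], 5)
```

Because the result is just a set of column indices, it can be applied before split evaluation in the same way a column sample would be.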
* Remove obsolete code in colmaker.
These code paths were never documented or actually used in the real world, and no test covers them.
* Unifying the types used for row and column.
As input dataset sizes approach a billion, incorrect use of int is prone to overflow, and signed integer overflow is undefined behaviour. This PR starts unifying the index types on unsigned integers. Some compiler optimizations can exploit this undefined behaviour, but after some testing I don't see that they benefit XGBoost.
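
The overflow concern can be made concrete with a small sketch. The matrix shape below is an arbitrary assumption chosen so that the entry count exceeds the 32-bit signed range. Python's `ctypes` truncates silently, which stands in for one possible outcome only: in C++ the signed case is undefined behaviour, so no particular result is guaranteed.

```python
import ctypes

n_rows, n_cols = 50_000, 50_000      # hypothetical dataset shape
n_entries = n_rows * n_cols          # 2_500_000_000; Python ints do not overflow

INT32_MAX = 2**31 - 1

# Storing the count in a 32-bit signed integer silently corrupts it.
as_int32 = ctypes.c_int32(n_entries).value

# A 64-bit unsigned integer holds the same count without loss.
as_uint64 = ctypes.c_uint64(n_entries).value

print(n_entries > INT32_MAX, as_int32, as_uint64)
```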
* provide the readme
* update for format
* reformat
* reformat -2
* update again
* update format
* update w.r.t yinlou's comments
* Add kubernetes tutorial to Table of Contents
* Style edit
* add interaction constraints
* enable both interaction and monotonic constraints at the same time
* fix lint
* add R test, fix lint, update demo
* Use dmlc::JSONReader to express interaction constraints as nested lists; Use sparse arrays for bookkeeping
* Add Python test for interaction constraints
* make R interaction constraints parameter based on feature index instead of column names, fix R coding style
* Fix lint
* Add BlueTea88 to CONTRIBUTORS.md
* Short circuit when no constraint is specified; address review comments
* Add tutorial for feature interaction constraints
* allow interaction constraints to be passed as string, remove redundant column_names argument
* Fix typo
* Address review comments
* Add comments to Python test
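
The changes above let interaction constraints be given either as nested lists of feature indices or as a string holding the same structure. A minimal sketch of accepting both forms follows. `parse_interaction_constraints` is a hypothetical helper, and Python's `json` module stands in for the `dmlc::JSONReader` used on the C++ side.

```python
import json


def parse_interaction_constraints(spec):
    """Accept '[[0, 1], [2, 3]]' or [[0, 1], [2, 3]]; return nested lists."""
    groups = json.loads(spec) if isinstance(spec, str) else spec
    if not isinstance(groups, list):
        raise ValueError("constraints must be a list of lists")
    parsed = []
    for group in groups:
        if not isinstance(group, list):
            raise ValueError("each constraint group must be a list")
        for feature in group:
            # bool is a subclass of int in Python, so reject it explicitly.
            is_index = isinstance(feature, int) and not isinstance(feature, bool)
            if not is_index or feature < 0:
                raise ValueError("feature indices must be non-negative integers")
        parsed.append(list(group))
    return parsed


from_string = parse_interaction_constraints("[[0, 1], [2, 3, 4]]")
from_lists = parse_interaction_constraints([[0, 1], [2, 3, 4]])
```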
* Revert "Fix #3485, #3540: Don't use dropout for predicting test sets (#3556)"
This reverts commit 44811f233071c5805d70c287abd22b155b732727.
* Document behavior of predict() for DART booster
* Add notice to parameter.rst
* add back train method but mark as deprecated
* fix scalastyle error
* add new
* update doc
* finish Gang Scheduling
* more
* intro
* Add sections: Prediction, Model persistence and ML pipeline.
* Add XGBoost4j-Spark MLlib pipeline example
* partial finished version
* finish the doc
* adjust code
* fix the doc
* use rst
* Convert XGBoost4J-Spark tutorial to reST
* Bring XGBoost4J up to date
* add note about using hdfs
* remove duplicate file
* fix descriptions
* update doc
* Wrap HDFS/S3 export support as a note
* update
* wrap indexing_mode example in code block
* Change doc build to reST exclusively
* Rewrite Intro doc in reST; create toctree
* Update parameter and contribute
* Convert tutorials to reST
* Convert Python tutorials to reST
* Convert CLI and Julia docs to reST
* Enable markdown for R vignettes
* Done migrating to reST
* Add guzzle_sphinx_theme to requirements
* Add breathe to requirements
* Fix search bar
* Add link to user forum
* Extended monotonic constraints support to 'hist' tree method.
* Added monotonic constraints tests.
* Fix the signature of NoConstraint::CalcSplitGain()
* Document monotonic constraint support in 'hist'
* Update signature of Update to account for latest refactor
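
One common way to enforce a monotonic constraint during split search is to reject any candidate split whose child leaf weights are ordered against the constraint. The sketch below shows that idea only: `constrained_gain` is a hypothetical function, not the actual `CalcSplitGain()` signature, and the leaf weight bounds a full implementation would also propagate down the tree are omitted.

```python
import math


def constrained_gain(gain, left_weight, right_weight, constraint):
    """constraint: +1 for increasing, -1 for decreasing, 0 for none."""
    if constraint > 0 and left_weight > right_weight:
        return -math.inf   # violates the increasing constraint
    if constraint < 0 and left_weight < right_weight:
        return -math.inf   # violates the decreasing constraint
    return gain


ok = constrained_gain(1.5, left_weight=-0.2, right_weight=0.3, constraint=1)
rejected = constrained_gain(1.5, left_weight=0.3, right_weight=-0.2, constraint=1)
```

Returning negative infinity means such a split can never be chosen over a valid one, so the tree method itself needs no extra branching.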