362 Commits

Author SHA1 Message Date
Jiaming Yuan
761a5dbdfc
[dask] Honor nthreads from dask worker. (#5414) 2020-03-16 04:51:24 +08:00
Jiaming Yuan
8d06878bf9
Deterministic GPU histogram. (#5361)
* Use pre-rounding based method to obtain reproducible floating point
  summation.
* GPU Hist for regression and classification are bit-by-bit reproducible.
* Add doc.
* Switch to thrust reduce for `node_sum_gradient`.
2020-03-04 15:13:28 +08:00
Philip Hyunsu Cho
9775da02d9
Add release note for 1.0.0 in NEWS.md (#5329)
* Add release note for 1.0.0

* Fix a small bug in the Python script that compiles the list of contributors

* Clarify governance of CI infrastructure; now PMC is formally in charge

* Address reviewer comment

* Fix typo
2020-03-03 21:35:43 -08:00
Samrat Pandiri
2d76d40dfd
Update dask.rst to correct a spelling mistake (#5371)
Change `signle-node` to `single-node`
2020-02-27 20:46:41 +08:00
Rong Ou
d6b31df449
update docs for gpu external memory (#5332)
* update docs for gpu external memory

* add hist limitation
2020-02-22 14:57:40 +08:00
Philip Hyunsu Cho
7ac7e8778f
Port patches from 1.0.0 branch (#5336)
* Remove f-string, since it's not supported by Python 3.5 (#5330)

* Remove f-string, since it's not supported by Python 3.5

* Add Python 3.5 to CI, to ensure compatibility

* Remove duplicated matplotlib

* Show deprecation notice for Python 3.5

* Fix lint

* Fix lint

* Fix a unit test that mistook MINOR ver for PATCH ver

* Enforce only major version in JSON model schema

* Bump version to 1.1.0-SNAPSHOT
2020-02-21 13:13:21 -08:00
Jiaming Yuan
e433a379e4
Fix changing locale. (#5314)
* Fix changing locale.

* Don't use locale guard.

As number parsing is implemented in house, we don't need locale.

* Update doc.
2020-02-17 11:31:13 +08:00
Jiaming Yuan
ed2465cce4
Add configuration to R interface. (#5217)
* Save and load internal parameter configuration as JSON.
2020-02-16 03:01:58 +08:00
Jiaming Yuan
911a902835
Merge model compatibility fixes from 1.0rc branch. (#5305)
* Port test model compatibility.
* Port logit model fix.

https://github.com/dmlc/xgboost/pull/5248
https://github.com/dmlc/xgboost/pull/5281
2020-02-13 20:41:58 +08:00
Andrew Kane
94828a7c0c
Updated Windows build docs (#5283) 2020-02-05 12:19:54 +08:00
Jiaming Yuan
595a00466d
Rewrite setup.py. (#5271)
The setup.py is rewritten.  This new script uses only Python code and provide customized
implementation of setuptools commands.  This way users can run most of setuptools commands
just like any other Python libraries.

* Remove setup_pip.py
* Remove soft links.
* Define customized commands.
* Remove shell script.
* Remove makefile script.
* Update the doc for building from source.
2020-02-04 13:35:42 +08:00
Jiaming Yuan
472ded549d
Save Scikit-Learn attributes into learner attributes. (#5245)
* Remove the recommendation for pickle.

* Save skl attributes in booster.attr

* Test loading scikit-learn model with native booster.
2020-01-30 16:00:18 +08:00
Philip Hyunsu Cho
4240daed4e
Make pip install xgboost*.tar.gz work by fixing build-python.sh (#5241)
* Make pip install xgboost*.tar.gz work by fixing build-python.sh

* Simplify install doc

* Add test

* Install Miniconda for Linux target too

* Build XGBoost only once in sdist

* Try importing xgboost after installation

* Don't set PYTHONPATH env var for sdist test
2020-01-28 23:18:23 -08:00
Philip Hyunsu Cho
cb3ed404cf
[R] Enable OpenMP with AppleClang in XGBoost R package (#5240)
* [R] Enable OpenMP with AppleClang in XGBoost R package

* Dramatically simplify install doc
2020-01-28 12:37:22 -08:00
Jiaming Yuan
ef19480eda
Add dart to JSON schema. (#5218)
* Add dart to JSON schema.

* Use spaces instead of tab.
2020-01-28 13:29:09 +08:00
Jiaming Yuan
40680368cf
Add constraint parameters to Scikit-Learn interface. (#5227)
* Add document for constraints.

* Fix a format error in doc for objective function.
2020-01-25 11:12:02 +08:00
Kodi Arfer
f100b8d878 [Breaking] Don't drop trees during DART prediction by default (#5115)
* Simplify DropTrees calling logic

* Add `training` parameter for prediction method.

* [Breaking]: Add `training` to C API.

* Change for R and Python custom objective.

* Correct comment.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2020-01-13 21:48:30 +08:00
Jiaming Yuan
7b65698187
Enforce correct data shape. (#5191)
* Fix syncing DMatrix columns.
* notes for tree method.
* Enable feature validation for all interfaces except for jvm.
* Better tests for boosting from predictions.
* Disable validation on JVM.
2020-01-13 15:48:17 +08:00
cpfarrell
9049c7c653 Add new lines for Spark XGBoost missing values section (#5180) 2020-01-07 12:14:16 +08:00
Jiaming Yuan
ebc86a3afa
Disable parameter validation for Scikit-Learn interface. (#5167)
* Disable parameter validation for now.

Scikit-Learn passes all parameters down to XGBoost, whether they are used or
not.

* Add option `validate_parameters`.
2020-01-07 11:17:31 +08:00
Tim Gates
2d95b9a4b6 Fix simple typo: utilty -> utility (#5182) 2020-01-04 15:28:26 +08:00
Jiaming Yuan
1d0ca49761
Example JSON model parser and Schema. (#5137) 2019-12-23 19:47:35 +08:00
Jiaming Yuan
a4b929385e
Note for DaskDMatrix. (#5144)
* Brief introduction to `DaskDMatrix`.

* Add xgboost.dask.train to API doc
2019-12-23 18:55:32 +08:00
cpfarrell
bc9d88259f [jvm-packages] Allow for bypassing spark missing value check (#4805)
* Allow for bypassing spark missing value check

* Update documentation for dealing with missing values in spark xgboost
2019-12-18 10:48:20 -08:00
Jiaming Yuan
27b3646d29
Tests and documents for new JSON routines. (#5120) 2019-12-18 08:44:27 +08:00
Jiaming Yuan
63ffd2f686
Check against R seed. (#5125)
* Handle it in R instead.
2019-12-17 19:14:59 +08:00
Jiaming Yuan
38763aa4fa
Update document for tree_method. [skip ci] (#5106) 2019-12-09 22:55:00 +08:00
Jiaming Yuan
608ebbe444
Fix GPU ID and prediction cache from pickle (#5086)
* Hack for saving GPU ID.

* Declare prediction cache on GBTree.

* Add a simple test.

* Add `auto` option for GPU Predictor.
2019-12-07 16:02:06 +08:00
yage
dcde433402 Fix MacOS build error. (#5080)
* Update build script.
* Update build doc.
2019-12-04 19:34:35 +08:00
yage
b9dbfe0931 Update doc for building on OSX (#5074)
Co-Authored-By: Jiaming Yuan <jm.yuan@outlook.com>
2019-11-28 20:14:44 +08:00
Jiaming Yuan
9f52e834dc
[doc] Some notes for external memory. (#5065) 2019-11-26 00:22:02 +08:00
Jiaming Yuan
d667ea9335
[CI] Fix Travis tests. (#5062)
- Install wget explicitly to match openssl.
- Install CMake explicitly.
- Use newer miniconda link.
- Reenable unittests.
- gcc@9 + xcode@10 for osx due to missing <_stdio.h>.  Other versions of gcc should also work.  But as homebrew pour gcc@9 after update by default, so I just stick with latest version.
- Disabled one external memory test for OSX.  Not sure about the thread implementation in there and fixing external memory is beyond the scope of this PR.
- Use Python3 with conda in jvm package.
2019-11-25 03:32:10 +08:00
Jiaming Yuan
04c640f562
Add cuDF DataFrame to doc. (#5053) 2019-11-19 18:29:40 +08:00
Rory Mitchell
e67388fb8f
Some guidelines on device memory usage (#5038)
* Add memory usage demo

* Update documentation
2019-11-17 07:48:24 +13:00
Jiaming Yuan
97abcc7ee2
Extract interaction constraint from split evaluator. (#5034)
*  Extract interaction constraints from split evaluator.

The reason for doing so is mostly for model IO, where num_feature and interaction_constraints are copied in split evaluator. Also interaction constraint by itself is a feature selector, acting like column sampler and it's inefficient to bury it deep in the evaluator chain. Lastly removing one another copied parameter is a win.

*  Enable inc for approx tree method.

As now the implementation is spited up from evaluator class, it's also enabled for approx method.

*  Removing obsoleted code in colmaker.

They are never documented nor actually used in real world. Also there isn't a single test for those code blocks.

*  Unifying the types used for row and column.

As the size of input dataset is marching to billion, incorrect use of int is subject to overflow, also singed integer overflow is undefined behaviour. This PR starts the procedure for unifying used index type to unsigned integers. There's optimization that can utilize this undefined behaviour, but after some testings I don't see the optimization is beneficial to XGBoost.
2019-11-14 20:11:41 +08:00
Philip Hyunsu Cho
a37691428f
Document minimum version required for gtest [skip ci] (#5001) 2019-10-31 15:47:50 -07:00
Philip Hyunsu Cho
da6e74f7bb
[CI] Upload nightly builds to S3 (#4976)
* Do not store built artifacts in the Jenkins master

* Add wheel renaming script

* Upload wheels to S3 bucket

* Use env.GIT_COMMIT

* Capture git hash correctly

* Add missing import in Jenkinsfile

* Address reviewer's comments

* Put artifacts for pull requests in separate directory

* No wildcard expansion in Windows CMD
2019-10-23 21:16:05 -07:00
Jiaming Yuan
7e477a2adb
Fix data loading (#4862)
* Fix loading text data.
* Fix config regex.
* Try to explain the error better in exception.
* Update doc.
2019-10-22 12:33:14 -04:00
Philip Hyunsu Cho
741fbf47c4
[CI] Update lint configuration to support latest pylint convention (#4971)
* Update lint configuration

* Use gcc 8 consistently in build instruction
2019-10-21 16:40:57 -07:00
Jiaming Yuan
9fc681001a
Copy CMake parameter from dmlc-core. (#4948) 2019-10-17 23:46:32 -04:00
Jiaming Yuan
185e3f1916
Update GPU doc. (#4953) 2019-10-16 05:54:09 -04:00
Oleksandr Pryimak
80977182c5 Use bundled gtest (#4900)
* Suggest to use gtest bundled with dmlc

* Use dmlc bundled gtest in all CI scripts

* Make clang-tidy to use dmlc embedded gtest
2019-10-09 16:26:19 -07:00
Jiaming Yuan
095de3bf5f
Export c++ headers in CMake installation. (#4897)
* Move get transpose into cc.

* Clean up headers in host device vector, remove thrust dependency.

* Move span and host device vector into public.

* Install c++ headers.

* Short notes for c and c++.

Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2019-10-06 23:53:09 -04:00
Jiaming Yuan
b8433c455a
Rewrite Dask interface. (#4819) 2019-09-25 01:30:14 -04:00
Philip Hyunsu Cho
0e0955a6d8
Add link to Ruby XGBoost gem (#4856) 2019-09-17 10:40:44 -07:00
Oleksandr Pryimak
516955564b Update xgboost-spark doc (#4804) 2019-08-27 10:51:00 -07:00
Jiaming Yuan
ab357dd41c
Remove plugin, cuda related code in automake & autoconf files (#4789)
* Build plugin example with CMake.

* Remove plugin, cuda related code in automake & autoconf files.

* Fix typo in GPU doc.
2019-08-18 16:54:34 -04:00
sriramch
f22b1c0348 Fix external memory documentation [skip ci] (#4747)
* - fix external memory documentation [skip ci]
   - to state that it is supported now on gpu algorithms
2019-08-08 09:27:02 +08:00
Rong Ou
851b5b3808 Remove gpu_exact tree method (#4742) 2019-08-07 11:43:20 +12:00
Jiaming Yuan
ad1192e8a3
Remove silent in doc. [skip ci] (#4689) 2019-07-20 05:53:42 -04:00