498 Commits

Author SHA1 Message Date
Jiaming Yuan
a4f5c86276
Allow using RandomState object from Numpy in sklearn interface. (#5049) 2019-11-19 10:56:39 +08:00
Rong Ou
0afcc55d98 Support multiple batches in gpu_hist (#5014)
* Initial external memory training support for GPU Hist tree method.
2019-11-16 14:50:20 +08:00
Jiaming Yuan
97abcc7ee2
Extract interaction constraint from split evaluator. (#5034)
*  Extract interaction constraints from split evaluator.

The reason for doing so is mostly model IO: num_feature and interaction_constraints are copied in the split evaluator. Also, an interaction constraint is itself a feature selector that acts like a column sampler, so it's inefficient to bury it deep in the evaluator chain. Lastly, removing yet another copied parameter is a win.

*  Enable inc for approx tree method.

As the implementation is now split out from the evaluator class, it's also enabled for the approx method.

*  Removing obsoleted code in colmaker.

It was never documented nor actually used in the real world. Also, there isn't a single test for those code blocks.

*  Unifying the types used for row and column.

As input dataset sizes approach the billions, incorrect use of int is subject to overflow, and signed integer overflow is undefined behaviour. This PR starts the process of unifying the index type to unsigned integers. Some compiler optimizations can exploit this undefined behaviour, but after some testing I don't see that they are beneficial to XGBoost.
2019-11-14 20:11:41 +08:00
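
The overflow concern in the last bullet of the commit above can be illustrated with a minimal sketch. It is not XGBoost code: the helper names and the dataset shape are hypothetical, and `ctypes` only models the wrap-around that two's-complement hardware typically produces, whereas in C++ the signed case is undefined behaviour.

```python
import ctypes


def flat_index_int32(row: int, col: int, n_cols: int) -> int:
    """Flattened index as a 32-bit signed integer would store it.

    ctypes performs no overflow checking, so the value wraps around.
    (Hypothetical helper for illustration only.)
    """
    return ctypes.c_int32(row * n_cols + col).value


def flat_index_uint64(row: int, col: int, n_cols: int) -> int:
    """Flattened index stored in a 64-bit unsigned integer."""
    return ctypes.c_uint64(row * n_cols + col).value


# Assumed shape: 100 columns, so row 30,000,000 starts at entry 3,000,000,000,
# which is beyond the int32 maximum of 2,147,483,647.
N_COLS = 100
ROW = 30_000_000

wrapped = flat_index_int32(ROW, 0, N_COLS)   # negative after wrap-around
exact = flat_index_uint64(ROW, 0, N_COLS)    # the intended index
print(wrapped, exact)
```

A wide unsigned index type keeps the flattened offset exact at this scale, which is the direction the commit describes.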
sriramch
2abe69d774 - ndcg ltr implementation on gpu (#5004)
* - ndcg ltr implementation on gpu
  - this is a follow-up to the pairwise ltr implementation
2019-11-13 11:21:04 +13:00
Philip Hyunsu Cho
f4e7b707c9
Revert #4529 (#5008)
* Revert " Optimize ‘hist’ for multi-core CPU (#4529)"

This reverts commit 4d6590be3c9a043d44d9e4fe0a456a9f8179ec72.

* Fix build
2019-11-12 09:35:03 -08:00
Jiaming Yuan
7663de956c
Run training with empty DMatrix. (#4990)
This makes GPU Hist robust in a distributed environment, as some workers might not
be associated with any data in either training or evaluation.

* Disable rabit mock test for now: See #5012.

* Disable dask-cudf test at prediction for now: See #5003

* Launch dask job for all workers even though they might not have any data.
* Check 0 rows in elementwise evaluation metrics.

   Using AUC and AUC-PR still throws an error.  See #4663 for a robust fix.

* Add tests for edge cases.
* Add `LaunchKernel` wrapper handling zero sized grid.
* Move some parts of allreducer into a cu file.
* Don't validate feature names when the booster is empty.

* Sync number of columns in DMatrix.

  num_feature is required to be the same across all workers in data split
  mode.

* Filtering in dask interface now by default syncs all boosters that are not
empty, instead of using rank 0.

* Fix Jenkins' GPU tests.

* Install dask-cuda from source in Jenkins' test.

  Now all tests are actually running.

* Restore GPU Hist tree synchronization test.

* Check UUID of running devices.

  The check is only performed on CUDA version >= 10.x, as 9.x doesn't have UUID field.

* Fix CMake policy and project variables.

  Use xgboost_SOURCE_DIR uniformly, add policy for CMake >= 3.13.

* Fix copying data to CPU

* Fix race condition in cpu predictor.

* Fix duplicated DMatrix construction.

* Don't download extra nccl in CI script.
2019-11-06 16:13:13 +08:00
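
The "Check 0 rows in elementwise evaluation metrics" item in the commit above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: the function names are hypothetical, the sum over a Python list stands in for an allreduce across workers, and returning NaN when no worker has data is an assumed convention.

```python
import math
from typing import List, Sequence, Tuple

Shard = Tuple[Sequence[float], Sequence[float]]  # (predictions, labels) on one worker


def local_squared_error(preds: Sequence[float], labels: Sequence[float]) -> Tuple[float, float]:
    """Return (sum of squared errors, row count) for one worker.

    An empty worker contributes (0, 0) instead of raising.
    """
    sse = sum((p - y) ** 2 for p, y in zip(preds, labels))
    return float(sse), float(len(labels))


def distributed_rmse(shards: List[Shard]) -> float:
    """Combine per-worker partial sums, then finish the metric once.

    Summing over the list is a stand-in for a sum allreduce (assumption).
    """
    partials = [local_squared_error(p, y) for p, y in shards]
    total_sse = sum(part[0] for part in partials)
    total_rows = sum(part[1] for part in partials)
    if total_rows == 0:
        # Assumed convention: no rows anywhere means the metric is undefined.
        return float("nan")
    return math.sqrt(total_sse / total_rows)


# Worker 0 holds two rows, worker 1 holds none.
print(distributed_rmse([([1.0, 2.0], [1.0, 4.0]), ([], [])]))
```

Dividing only after the reduction is what lets an empty worker take part without a division by zero of its own.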
Chen Qin
b29b8c2f34 [jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4966)
* [phase 1] expose sets of rabit configurations to spark layer

* add back mutable import

* disable ring_mincount till https://github.com/dmlc/rabit/pull/106d

* Revert "disable ring_mincount till https://github.com/dmlc/rabit/pull/106d"

This reverts commit 65e95a98e24f5eb53c6ba9ef9b2379524258984d.

* apply latest rabit

* fix build error

* apply https://github.com/dmlc/xgboost/pull/4880

* downgrade cmake in rabit

* point to rabit with DMLC_ROOT fix

* relative path of rabit install prefix

* split rabit parameters to another trait

* misc

* misc

* Delete .classpath

* Delete .classpath

* Delete .classpath

* Update XGBoostClassifier.scala

* Update XGBoostRegressor.scala

* Update GeneralParams.scala

* Update GeneralParams.scala

* Update GeneralParams.scala

* Update GeneralParams.scala

* Delete .classpath

* Update RabitParams.scala

* Update .gitignore

* Update .gitignore

* apply rabitParams to training

* use string as rabit parameter value type

* cleanup

* add rabitEnv check

* point to dmlc/rabit

* per feedback

* update private scope

* misc

* update rabit

* add rabit_timeout, fix failing test.

* split tests

* allow build jvm with rabit mock

* pass mock failures to rabit with test

* add mock error and graceful handle rabit assertion error test

* split mvn test

* remove sign for test

* update rabit

* build jvm_packages with rabit mock

* point back to dmlc/rabit

* per feedback, update scala header

* cleanup pom

* per feedback

* try fix lint

* fix lint

* per feedback, remove bootstrap_cache

* per feedback 2

* try replace dev profile with passing mvn property

* fix build error

* remove mvn property and replace with env setting to build test jar

* per feedback

* revert copyright headlines, point to dmlc/rabit

* revert python lint

* remove multiple failure test case as retry is not enabled in spark

* Update core.py

* Update core.py

* per feedback, style fix
2019-11-01 14:21:19 -07:00
Philip Hyunsu Cho
da6e74f7bb
[CI] Upload nightly builds to S3 (#4976)
* Do not store built artifacts in the Jenkins master

* Add wheel renaming script

* Upload wheels to S3 bucket

* Use env.GIT_COMMIT

* Capture git hash correctly

* Add missing import in Jenkinsfile

* Address reviewer's comments

* Put artifacts for pull requests in separate directory

* No wildcard expansion in Windows CMD
2019-10-23 21:16:05 -07:00
Jiaming Yuan
ac457c56a2
Use `UpdateAllowUnknown' for non-model related parameter. (#4961)
* Use `UpdateAllowUnknown' for non-model related parameter.

Model parameters cannot pack an additional boolean value due to the binary IO
format.  This commit deals only with non-model related parameter configuration.

* Add tidy command line arg for use-dmlc-gtest.
2019-10-23 05:50:12 -04:00
Jiaming Yuan
f24be2efb4 Use configure_file() to configure version only (#4974)
* Avoid writing build_config.h

* Remove build_config.h altogether.

* Lint.
2019-10-22 23:47:00 -07:00
Rong Ou
5b1715d97c Write ELLPACK pages to disk (#4879)
* add ellpack source
* add batch param
* extract function to parse cache info
* construct ellpack info separately
* push batch to ellpack page
* write ellpack page.
* make sparse page source reusable
2019-10-22 23:44:32 -04:00
sriramch
310fe60b35 Pairwise ranking objective implementation on gpu (#4873)
* - pairwise ranking objective implementation on gpu
   - there are a couple more algorithms (ndcg and map) for which support will be added
     in follow-up PRs
   - with no label groups defined, get gradient is 90x faster on gpu (120m instance
     mortgage dataset)
   - it can be an order of magnitude faster with ~ 10 groups (and adequate cores
     for the cpu implementation)

* Add JSON config to rank obj.
2019-10-22 23:40:07 -04:00
Jiaming Yuan
5620322a48
[Breaking] Add global versioning. (#4936)
* Use CMake config file for representing version.

* Generate c and Python version file with CMake.

The generated file is written into source tree.  But unless XGBoost upgrades
its version, there will be no actual modification.  This retains compatibility
with Makefiles for R.

* Add XGBoost version to the DMatrix binaries.
* Simplify prefetch detection in CMakeLists.txt
2019-10-22 23:27:26 -04:00
Jiaming Yuan
7e477a2adb
Fix data loading (#4862)
* Fix loading text data.
* Fix config regex.
* Try to explain the error better in exception.
* Update doc.
2019-10-22 12:33:14 -04:00
Philip Hyunsu Cho
95295ce026 [CI] Use latest dask (#4973)
* Remove version spec, to use latest dask always
2019-10-22 07:00:13 -04:00
Jiaming Yuan
4771bb0d41
Catch exception in transform function omp context. (#4960) 2019-10-21 17:03:38 +08:00
Jiaming Yuan
010b8f1428 Revert "[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4876)" (#4965)
This reverts commit 86ed01c4bbecef66e1bc4d02fb13116bd6130fae.
2019-10-18 14:02:35 -07:00
Chen Qin
86ed01c4bb [jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4876)
* Expose sets of rabit configurations to spark layer
2019-10-18 15:07:31 -04:00
Jiaming Yuan
31030a8d3a
Set correct file permission. (#4964) 2019-10-18 12:54:29 -04:00
Jiaming Yuan
ae536756ae
Add Model and Configurable interface. (#4945)
* Apply Configurable to objective functions.
* Apply Model to Learner and Regtree, gbm.
* Add Load/SaveConfig to objs.
* Refactor obj tests to use smart pointer.
* Dummy methods for Save/Load Model.
2019-10-18 01:56:02 -04:00
Rory Mitchell
60748b2071
Use heuristic to select histogram node, avoid rabit call (#4951) 2019-10-18 11:33:54 +13:00
Jiaming Yuan
7e72a12871
Don't set_params at the end of set_state. (#4947)
* Don't set_params at the end of set_state.

* Also fix another issue found in dask prediction.

* Add note about prediction.

Don't support other prediction modes at the moment.
2019-10-15 10:08:26 -04:00
Jiaming Yuan
2ebdec8aa6
Fix dask prediction. (#4941)
* Fix dask prediction.

* Add better error messages for wrong partition.
2019-10-14 23:19:34 -04:00
Jiaming Yuan
b61d534472
Span: use `size_t' for index_type, add `front' and `back'. (#4935)
* Use `size_t' for index_type.  Add `front' and `back'.

* Remove a batch of `static_cast'.
2019-10-14 09:13:33 -04:00
Jiaming Yuan
3d46bd0fa5
Ignore columnar alignment requirement. (#4928)
* Better error message for wrong type.
* Fix stride size.
2019-10-13 06:41:43 -04:00
Philip Hyunsu Cho
f7487e4c2a [CI] Run cuDF tests in Jenkins CI server (#4927) 2019-10-13 00:04:54 -04:00
Jiaming Yuan
4bbf062ed3
[Breaking] Update sklearn interface. (#4929)
* Remove nthread, seed, silent. Add tree_method, gpu_id, num_parallel_tree. Fix #4909.
* Check data shape. Fix #4896.
* Check element of eval_set is tuple. Fix #4875
*  Add doc for random_state with hogwild. Fixes #4919
2019-10-12 02:50:09 -04:00
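
For callers affected by the removal of `nthread`, `seed`, and `silent` in the commit above, a small migration helper might look like the sketch below. The helper is hypothetical. The replacement names (`n_jobs`, `random_state`, `verbosity`) and the `silent` to `verbosity` mapping are assumptions that the commit message does not state, so check them against the installed version's documentation.

```python
from typing import Any, Dict

# Assumed replacements for the removed keyword arguments.
_RENAMED = {"nthread": "n_jobs", "seed": "random_state"}


def migrate_sklearn_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params with removed keyword names translated.

    A value already given under the new name takes precedence.
    """
    migrated = {
        key: value
        for key, value in params.items()
        if key not in _RENAMED and key != "silent"
    }
    for old, new in _RENAMED.items():
        if old in params:
            migrated.setdefault(new, params[old])
    if "silent" in params:
        # Assumed mapping: silent=True -> verbosity 0, silent=False -> verbosity 1.
        migrated.setdefault("verbosity", 0 if params["silent"] else 1)
    return migrated


legacy = {"nthread": 4, "seed": 0, "silent": True, "max_depth": 3}
print(migrate_sklearn_params(legacy))
```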
Jiaming Yuan
6c9b6f11da Use cudf.concat explicitly. (#4918)
* Use `cudf.concat` explicitly.

* Add test.
2019-10-10 16:02:10 +13:00
Oleksandr Pryimak
80977182c5 Use bundled gtest (#4900)
* Suggest to use gtest bundled with dmlc

* Use dmlc bundled gtest in all CI scripts

* Make clang-tidy use dmlc embedded gtest
2019-10-09 16:26:19 -07:00
Jiaming Yuan
095de3bf5f
Export c++ headers in CMake installation. (#4897)
* Move get transpose into cc.

* Clean up headers in host device vector, remove thrust dependency.

* Move span and host device vector into public.

* Install c++ headers.

* Short notes for c and c++.

Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2019-10-06 23:53:09 -04:00
Jiaming Yuan
d30e63a0a5
Support feature names/types for cudf. (#4902)
* Implement most of the pandas procedure for cudf except for type conversion.
* Requires an array of interfaces in metainfo.
2019-09-29 15:07:51 -04:00
Vibhu Jawa
2fa8b359e0 Add support for cudf.Series (#4891) 2019-09-25 23:52:28 -04:00
Jiaming Yuan
b8433c455a
Rewrite Dask interface. (#4819) 2019-09-25 01:30:14 -04:00
Rong Ou
562bb0ae31 remove device shards (#4867) 2019-09-25 13:15:46 +08:00
Jiaming Yuan
0b89cd1dfa
Support gamma in GPU_Hist. (#4874)
* Just prevent building the tree instead of using an explicit pruner.
2019-09-24 10:16:08 +08:00
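
The idea of preventing a split rather than pruning it afterwards can be sketched with the textbook split gain from the XGBoost paper. This is an illustration only: the function names are hypothetical, and the exact scaling and checks used inside GPU Hist are not stated in the commit and may differ.

```python
def split_gain(g_left: float, h_left: float, g_right: float, h_right: float,
               reg_lambda: float = 1.0, gamma: float = 0.0) -> float:
    """Loss reduction of a candidate split minus the gamma penalty.

    g_* and h_* are sums of first- and second-order gradients on each side.
    """
    def score(g: float, h: float) -> float:
        return g * g / (h + reg_lambda)

    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent) - gamma


def should_split(g_left: float, h_left: float, g_right: float, h_right: float,
                 reg_lambda: float = 1.0, gamma: float = 0.0) -> bool:
    """Reject the split up front when the penalised gain is not positive."""
    return split_gain(g_left, h_left, g_right, h_right, reg_lambda, gamma) > 0.0


# The same candidate split is accepted with gamma=0 and rejected with a large gamma.
print(should_split(-4.0, 2.0, 4.0, 2.0, gamma=0.0))
print(should_split(-4.0, 2.0, 4.0, 2.0, gamma=10.0))
```

Rejecting at evaluation time means the node is never expanded, so no separate pruning pass is needed for this check.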
Jiaming Yuan
57106a3459
Fix parsing empty json object. (#4868)
* Fix parsing empty json object.

* Better error message.
2019-09-18 03:31:46 -04:00
Jiaming Yuan
5374f52531
Complete cudf support. (#4850)
* Handles missing value.
* Accept all floating point and integer types.
* Move to cudf 9.0 API.
* Remove requirement on `null_count`.
* Arbitrary column types support.
2019-09-16 23:52:00 -04:00
Rong Ou
125bcec62e Move ellpack page construction into DMatrix (#4833) 2019-09-16 23:50:55 -04:00
Chen Qin
512f037e55 [rabit_bootstrap_cache] failed xgb worker recover from other workers (#4808)
* Better recovery support.  Restarting only the failed workers.
2019-09-16 23:31:52 -04:00
Jiaming Yuan
a5f232feb8
Fix calling GPU predictor (#4836)
* Fix calling GPU predictor
2019-09-05 19:09:38 -04:00
Jiaming Yuan
c0fbeff0ab
Restrict access to cfg_ in gbm. (#4801)
* Restrict access to `cfg_` in gbm.

* Verify having correct updaters.

* Remove `grow_global_histmaker`

This updater is the same as `grow_histmaker`.  The former is not in our
documentation so we just remove it.
2019-09-02 00:43:19 -04:00
Rong Ou
733ed24dd9 further cleanup of single process multi-GPU code (#4810)
* use subspan in gpu predictor instead of copying
* Revise `HostDeviceVector`
2019-08-30 05:27:23 -04:00
Rong Ou
38ab79f889 Make HostDeviceVector single gpu only (#4773)
* Make HostDeviceVector single gpu only
2019-08-26 09:51:13 +12:00
Jiaming Yuan
6e6216ad67
Skip related tests when sklearn is not installed. (#4791) 2019-08-21 00:32:52 -04:00
Jiaming Yuan
3fa2ceb193
Add self. (#4794) 2019-08-20 14:41:30 +08:00
Jiaming Yuan
9700776597 Cudf support. (#4745)
* Initial support for cudf integration.

* Add two C APIs for consuming data and metainfo.

* Add CopyFrom for SimpleCSRSource as a generic function to consume the data.

* Add FromDeviceColumnar for consuming device data.

* Add new MetaInfo::SetInfo for consuming label, weight etc.
2019-08-19 16:51:40 +12:00
Jiaming Yuan
ab357dd41c
Remove plugin, cuda related code in automake & autoconf files (#4789)
* Build plugin example with CMake.

* Remove plugin, cuda related code in automake & autoconf files.

* Fix typo in GPU doc.
2019-08-18 16:54:34 -04:00
Jiaming Yuan
b9b57f2289 Use long key id. (#4783) 2019-08-16 11:19:22 -07:00
Evan Kepner
53d4272c2a add os.PathLike support for file paths to DMatrix and Booster Python classes (#4757) 2019-08-15 04:46:25 -04:00
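
Accepting `os.PathLike` alongside plain strings usually comes down to one call to `os.fspath`. The sketch below uses a hypothetical helper name and is not taken from the XGBoost source. Non-path inputs are passed through unchanged, on the assumption that the caller also accepts in-memory data.

```python
import os
from pathlib import Path
from typing import Any


def as_native_path(data: Any) -> Any:
    """Convert str or os.PathLike inputs to a plain str path.

    Anything else (for example an in-memory array) is returned unchanged.
    """
    if isinstance(data, (str, os.PathLike)):
        return os.fspath(data)
    return data


print(as_native_path(Path("train.libsvm")))  # a pathlib.Path becomes a str
print(as_native_path("model.bin"))           # a str is returned as-is
```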
Xu Xiao
ef9af33a00 [HOTFIX] distributed training with hist method (#4716)
* add parallel test for hist.EvaluateSplit

* update test_openmp.py

* update test_openmp.py

* update test_openmp.py

* update test_openmp.py

* update test_openmp.py

* fix OMP schedule policy

* fix clang-tidy

* add logging: total_num_bins

* fix

* fix

* test

* replace guided OPENMP policy with static in updater_quantile_hist.cc
2019-08-13 11:27:29 -07:00