234 Commits

Author SHA1 Message Date
ShvetsKS
956beead70
Thread local memory allocation for BuildHist (#6358)
* thread mem locality

* fix apply

* cleanup

* fix lint

* fix tests

* simple try

* fix

* fix

* apply comments

* fix comments

* fix

* apply simple comment

Co-authored-by: ShvetsKS <kirill.shvets@intel.com>
2020-11-25 17:50:12 +03:00
Jiaming Yuan
43efadea2e
Deterministic data partitioning for external memory (#6317)
* Make external memory data partitioning deterministic.

* Change the meaning of `page_size` from bytes to number of rows.

* Design a data pool.

* Note for external memory.

* Enable unity build on Windows CI.

* Force garbage collect on test.
2020-11-11 06:11:06 +08:00
Jiaming Yuan
c4da967b5c
Support unity build. (#6295)
* Support unity build.

* Setup on Windows Jenkins.

* Revert "Setup on Windows Jenkins."

This reverts commit 8345cb8d2b009eec8ae9fa6f16412a7c9b6ec12c.
2020-10-28 11:49:28 -07:00
Jiaming Yuan
ddf37cca30
Unify thread configuration. (#6186) 2020-10-19 16:05:42 +08:00
Jiaming Yuan
bed7ae4083
Loop over thrust::reduce. (#6229)
* Check input chunk size of dqdm.
* Add doc for current limitation.
2020-10-14 10:40:56 +13:00
Rory Mitchell
734a911a26
Loop over copy_if (#6201)
* Loop over copy_if

* Catch OOM.

Co-authored-by: fis <jm.yuan@outlook.com>
2020-10-14 10:23:16 +13:00
Jiaming Yuan
2241563f23
Handle duplicated values in sketching. (#6178)
* Accumulate weights in duplicated values.
* Fix device id in iterative dmatrix.
2020-10-10 19:32:44 +08:00
Jiaming Yuan
444131a2e6
Add categorical data support to GPU Hist. (#6164) 2020-09-29 11:27:25 +08:00
Jiaming Yuan
14afdb4d92
Support categorical data in ellpack. (#6140) 2020-09-24 19:28:57 +08:00
Jiaming Yuan
210c131ce7
Support categorical data in GPU sketching. (#6137) 2020-09-21 13:53:06 +08:00
Jiaming Yuan
a069a21e03
Implement intrusive ptr (#6129)
* Use intrusive ptr for JSON.
2020-09-20 20:07:16 +08:00
Jiaming Yuan
e319b63f9e
Merge extract cuts into QuantileContainer. (#6125)
* Use pruning for initial summary construction.
2020-09-18 16:36:39 +08:00
Jiaming Yuan
29b7fea572
Optimize cpu sketch allreduce for sparse data. (#6009)
* Bypass RABIT serialization reducer and use custom allgather based merging.
2020-08-19 10:03:45 +08:00
Qi Zhang
989ddd036f
Swap byte-order in binary serializer to support big-endian arch (#5813)
* fixed some endian issues

* Use dmlc::ByteSwap() to simplify code

* Fix lint check

* [CI] Add test for s390x

* Download latest CMake on s390x

* Fix a bug in my code

* Save magic number in dmatrix with byteswap on big-endian machine

* Save version in binary with byteswap on big-endian machine

* Load scalar with byteswap in MetaInfo

* Add a debugging message

* Handle arrays correctly when byteswapping

* EOF can also be 255

* Handle magic number in MetaInfo carefully

* Skip Tree.Load test for big-endian, since the test manually builds little-endian binary model

* Handle missing packages in Python tests

* Don't use boto3 in model compatibility tests

* Add s390 Docker file for local testing

* Add model compatibility tests

* Add R compatibility test

* Revert "Add R compatibility test"

This reverts commit c2d2bdcb7dbae133cbb927fcd20f7e83ee2b18a8.

Co-authored-by: Qi Zhang <q.zhang@ibm.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-08-18 14:47:17 -07:00
Jiaming Yuan
4d99c58a5f
Feature weights (#5962) 2020-08-18 19:55:41 +08:00
Jiaming Yuan
674c409e9d
Remove rabit dependency on public headers. (#6005) 2020-08-13 08:26:20 +08:00
Philip Hyunsu Cho
9adb812a0a
RMM integration plugin (#5873)
* [CI] Add RMM as an optional dependency

* Replace caching allocator with pool allocator from RMM

* Revert "Replace caching allocator with pool allocator from RMM"

This reverts commit e15845d4e72e890c2babe31a988b26503a7d9038.

* Use rmm::mr::get_default_resource()

* Try setting default resource (doesn't work yet)

* Allocate pool_mr in the heap

* Prevent leaking pool_mr handle

* Separate EXPECT_DEATH() in separate test suite suffixed DeathTest

* Turn off death tests for RMM

* Address reviewer's feedback

* Prevent leaking of cuda_mr

* Fix Jenkinsfile syntax

* Remove unnecessary function in Jenkinsfile

* [CI] Install NCCL into RMM container

* Run Python tests

* Try building with RMM, CUDA 10.0

* Do not use RMM for CUDA 10.0 target

* Actually test for test_rmm flag

* Fix TestPythonGPU

* Use CNMeM allocator, since pool allocator doesn't yet support multiGPU

* Use 10.0 container to build RMM-enabled XGBoost

* Revert "Use 10.0 container to build RMM-enabled XGBoost"

This reverts commit 789021fa31112e25b683aef39fff375403060141.

* Fix Jenkinsfile

* [CI] Assign larger /dev/shm to NCCL

* Use 10.2 artifact to run multi-GPU Python tests

* Add CUDA 10.0 -> 11.0 cross-version test; remove CUDA 10.0 target

* Rename Conda env rmm_test -> gpu_test

* Use env var to opt into CNMeM pool for C++ tests

* Use identical CUDA version for RMM builds and tests

* Use Pytest fixtures to enable RMM pool in Python tests

* Move RMM to plugin/CMakeLists.txt; use PLUGIN_RMM

* Use per-device MR; use command arg in gtest

* Set CMake prefix path to use Conda env

* Use 0.15 nightly version of RMM

* Remove unnecessary header

* Fix a unit test when cudf is missing

* Add RMM demos

* Remove print()

* Use HostDeviceVector in GPU predictor

* Simplify pytest setup; use LocalCUDACluster fixture

* Address reviewers' commments

Co-authored-by: Hyunsu Cho <chohyu01@cs.wasshington.edu>
2020-08-12 01:26:02 -07:00
Jiaming Yuan
ee70a2380b
Unify CPU hist sketching (#5880) 2020-08-12 01:33:06 +08:00
Philip Hyunsu Cho
71b0528a2f
GPU implementation of AFT survival objective and metric (#5714)
* Add interval accuracy

* De-virtualize AFT functions

* Lint

* Refactor AFT metric using GPU-CPU reducer

* Fix R build

* Fix build on Windows

* Fix copyright header

* Clang-tidy

* Fix crashing demo

* Fix typos in comment; explain GPU ID

* Remove unnecessary #include

* Add C++ test for interval accuracy

* Fix a bug in accuracy metric: use log pred

* Refactor AFT objective using GPU-CPU Transform

* Lint

* Fix lint

* Use Ninja to speed up build

* Use time, not /usr/bin/time

* Add cpu_build worker class, with concurrency = 1

* Use concurrency = 1 only for CUDA build

* concurrency = 1 for clang-tidy

* Address reviewer's feedback

* Update link to AFT paper
2020-07-17 01:18:13 -07:00
Jiaming Yuan
e471056ec4
Fix sketch size calculation. (#5898) 2020-07-17 08:33:16 +08:00
Jiaming Yuan
dd445af56e
Cleanup on device sketch. (#5874)
* Remove old functions.

* Merge weighted and un-weighted into a common interface.
2020-07-14 10:15:54 +08:00
Rong Ou
06320729d4
fix device sketch with weights in external memory mode (#5870) 2020-07-08 08:44:07 +08:00
Jiaming Yuan
048d969be4
Implement GK sketching on GPU. (#5846)
* Implement GK sketching on GPU.
* Strong tests on quantile building.
* Handle sparse dataset by binary searching the column index.
* Hypothesis test on dask.
2020-07-07 12:16:21 +08:00
Jiaming Yuan
4b0852ee41
Use dmlc stream when URI protocol is not local file. (#5857) 2020-07-07 03:07:12 +08:00
Philip Hyunsu Cho
a67bc64819
Add an option to run brute-force test for JSON round-trip (#5804)
* Add an option to run brute-force test for JSON round-trip

* Apply reviewer's feedback

* Remove unneeded objects

* Parallel run.

* Max.

* Use signed 64-bit loop var, to support MSVC

* Add exhaustive test to CI

* Run JSON test in Win build worker

* Revert "Run JSON test in Win build worker"

This reverts commit c97b2c7dda37b3585b445d36961605b79552ca89.

* Revert "Add exhaustive test to CI"

This reverts commit c149c2ce9971a07a7289f9b9bc247818afd5a667.

Co-authored-by: fis <jm.yuan@outlook.com>
2020-06-17 23:46:02 -07:00
Jiaming Yuan
38ee514787
Implement fast number serialization routines. (#5772)
* Implement ryu algorithm.
* Implement integer printing.
* Full coverage roundtrip test.
2020-06-17 12:39:23 +08:00
Jiaming Yuan
1fa84b61c1
Implement Empty method for host device vector. (#5781)
* Fix accessing nullptr.
2020-06-13 19:02:26 +08:00
Jiaming Yuan
3028fa6b42
Implement weighted sketching for adapter. (#5760)
* Bounded memory tests.
* Fixed memory estimation.
2020-06-12 06:20:39 +08:00
ShvetsKS
cd3d14ad0e
Add float32 histogram (#5624)
* new single_precision_histogram param was added.

Co-authored-by: SHVETS, KIRILL <kirill.shvets@intel.com>
Co-authored-by: fis <jm.yuan@outlook.com>
2020-06-03 11:24:53 +08:00
Jiaming Yuan
e533908922
Expose device sketching in header. (#5747) 2020-06-02 13:02:53 +08:00
Jiaming Yuan
29a4cfe400
Group aware GPU sketching. (#5551)
* Group aware GPU weighted sketching.

* Distribute group weights to each data point.
* Relax the test.
* Validate input meta info.
* Fix metainfo copy ctor.
2020-04-20 17:18:52 +08:00
Jiaming Yuan
ccd30e4491
Fix non-openmp build. (#5566)
* Add test to Jenkins.
* Fix threading utils tests.
* Require thread library.
2020-04-20 12:16:38 +08:00
Rory Mitchell
d6d1035950
gpu_hist performance fixes (#5558)
* Remove unnecessary cuda API calls

* Fix histogram memory growth
2020-04-19 12:21:13 +12:00
Rory Mitchell
e268fb0093
Use thrust functions instead of custom functions (#5544) 2020-04-16 21:41:16 +12:00
Rory Mitchell
ca4e05660e
Purge device_helpers.cuh (#5534)
* Simplifications with caching_device_vector

* Purge device helpers
2020-04-15 21:51:56 +12:00
Jiaming Yuan
6671b42dd4
Use ellpack for prediction only when sparsepage doesn't exist. (#5504) 2020-04-10 12:15:46 +08:00
Jiaming Yuan
0012f2ef93
Upgrade clang-tidy on CI. (#5469)
* Correct all clang-tidy errors.
* Upgrade clang-tidy to 10 on CI.

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-04-05 04:42:29 +08:00
Philip Hyunsu Cho
5fc5ec539d
Implement robust regularization in 'survival:aft' objective (#5473)
* Robust regularization of AFT gradient and hessian

* Fix AFT doc; expose it to tutorial TOC

* Apply robust regularization to uncensored case too

* Revise unit test slightly

* Fix lint

* Update test_survival.py

* Use GradientPairPrecise

* Remove unused variables
2020-04-04 12:21:24 -07:00
Jiaming Yuan
86beb68ce8
Implement host span. (#5459) 2020-04-03 10:37:51 +08:00
Jiaming Yuan
29c6ad943a
Prevent copying SimpleDMatrix. (#5453)
* Set default dtor for SimpleDMatrix to initialize default copy ctor, which is
deleted due to unique ptr.

* Remove commented code.
* Remove warning for calling host function (std::max).
* Remove warning for initialization order.
* Remove warning for unused variables.
2020-04-02 07:01:49 +08:00
Jiaming Yuan
babcb996e7
Reduce span check overhead. (#5464) 2020-04-01 22:07:24 +08:00
ShvetsKS
27a8e36fc3
Reducing memory consumption for 'hist' method on CPU (#5334) 2020-03-28 14:45:52 +13:00
Rory Mitchell
13b10a6370
Device dmatrix (#5420) 2020-03-28 14:42:21 +13:00
Jiaming Yuan
4942da64ae
Refactor tests with data generator. (#5439) 2020-03-27 06:44:44 +08:00
Avinash Barnwal
dcf439932a
Add Accelerated Failure Time loss for survival analysis task (#4763)
* [WIP] Add lower and upper bounds on the label for survival analysis

* Update test MetaInfo.SaveLoadBinary to account for extra two fields

* Don't clear qids_ for version 2 of MetaInfo

* Add SetInfo() and GetInfo() method for lower and upper bounds

* changes to aft

* Add parameter class for AFT; use enum's to represent distribution and event type

* Add AFT metric

* changes to neg grad to grad

* changes to binomial loss

* changes to overflow

* changes to eps

* changes to code refactoring

* changes to code refactoring

* changes to code refactoring

* Re-factor survival analysis

* Remove aft namespace

* Move function bodies out of AFTNormal and AFTLogistic, to reduce clutter

* Move function bodies out of AFTLoss, to reduce clutter

* Use smart pointer to store AFTDistribution and AFTLoss

* Rename AFTNoiseDistribution enum to AFTDistributionType for clarity

The enum class was not a distribution itself but a distribution type

* Add AFTDistribution::Create() method for convenience

* changes to extreme distribution

* changes to extreme distribution

* changes to extreme

* changes to extreme distribution

* changes to left censored

* deleted cout

* changes to x,mu and sd and code refactoring

* changes to print

* changes to hessian formula in censored and uncensored

* changes to variable names and pow

* changes to Logistic Pdf

* changes to parameter

* Expose lower and upper bound labels to R package

* Use example weights; normalize log likelihood metric

* changes to CHECK

* changes to logistic hessian to standard formula

* changes to logistic formula

* Comply with coding style guideline

* Revert back Rabit submodule

* Revert dmlc-core submodule

* Comply with coding style guideline (clang-tidy)

* Fix an error in AFTLoss::Gradient()

* Add missing files to amalgamation

* Address @RAMitchell's comment: minimize future change in MetaInfo interface

* Fix lint

* Fix compilation error on 32-bit target, when size_t == bst_uint

* Allocate sufficient memory to hold extra label info

* Use OpenMP to speed up

* Fix compilation on Windows

* Address reviewer's feedback

* Add unit tests for probability distributions

* Make Metric subclass of Configurable

* Address reviewer's feedback: Configure() AFT metric

* Add a dummy test for AFT metric configuration

* Complete AFT configuration test; remove debugging print

* Rename AFT parameters

* Clarify test comment

* Add a dummy test for AFT loss for uncensored case

* Fix a bug in AFT loss for uncensored labels

* Complete unit test for AFT loss metric

* Simplify unit tests for AFT metric

* Add unit test to verify aggregate output from AFT metric

* Use EXPECT_* instead of ASSERT_*, so that we run all unit tests

* Use aft_loss_param when serializing AFTObj

This is to be consistent with AFT metric

* Add unit tests for AFT Objective

* Fix OpenMP bug; clarify semantics for shared variables used in OpenMP loops

* Add comments

* Remove AFT prefix from probability distribution; put probability distribution in separate source file

* Add comments

* Define kPI and kEulerMascheroni in probability_distribution.h

* Add probability_distribution.cc to amalgamation

* Remove unnecessary diff

* Address reviewer's feedback: define variables where they're used

* Eliminate all INFs and NANs from AFT loss and gradient

* Add demo

* Add tutorial

* Fix lint

* Use 'survival:aft' to be consistent with 'survival:cox'

* Move sample data to demo/data

* Add visual demo with 1D toy data

* Add Python tests

Co-authored-by: Philip Cho <chohyu01@cs.washington.edu>
2020-03-25 13:52:51 -07:00
Rory Mitchell
b745b7acce
Fix memory usage of device sketching (#5407) 2020-03-14 13:43:24 +13:00
Rory Mitchell
a38e7bd19c
Sketching from adapters (#5365)
* Sketching from adapters

* Add weights test
2020-03-07 21:07:58 +13:00
Jiaming Yuan
8d06878bf9
Deterministic GPU histogram. (#5361)
* Use pre-rounding based method to obtain reproducible floating point
  summation.
* GPU Hist for regression and classification are bit-by-bit reproducible.
* Add doc.
* Switch to thrust reduce for `node_sum_gradient`.
2020-03-04 15:13:28 +08:00
Egor Smirnov
1b97eaf7a7
Optimized ApplySplit, BuildHist and UpdatePredictCache functions on CPU (#5244)
* Split up sparse and dense build hist kernels.
* Add `PartitionBuilder`.
2020-02-29 16:11:42 +08:00
Philip Hyunsu Cho
7ac7e8778f
Port patches from 1.0.0 branch (#5336)
* Remove f-string, since it's not supported by Python 3.5 (#5330)

* Remove f-string, since it's not supported by Python 3.5

* Add Python 3.5 to CI, to ensure compatibility

* Remove duplicated matplotlib

* Show deprecation notice for Python 3.5

* Fix lint

* Fix lint

* Fix a unit test that mistook MINOR ver for PATCH ver

* Enforce only major version in JSON model schema

* Bump version to 1.1.0-SNAPSHOT
2020-02-21 13:13:21 -08:00