* bump scala to 2.12 which requires java 8 and also newer flink and akka
* put scala version in artifactId
* fix appveyor
* fix for scaladoc issue that looks like https://github.com/scala/bug/issues/10509
* fix ci_build
* update versions in generate_pom.py
* fix generate_pom.py
* Apache does not yet have a download for the Spark 2.4.3 distro built with Scala 2.12, so for now I use a tgz I put on S3
* Upload spark-2.4.3-bin-scala2.12-hadoop2.7.tgz to our own S3
* Update Dockerfile.jvm_cross
* Reorganize contributor's doc
* Address comments from @trivialfis
* Address @sriramch's comment: include ABI compatibility guarantee
* Address @rongou's comment
* Postpone ABI compatibility guarantee for now
* provide the readme
* update for format
* reformat
* reformat -2
* update again
* update format
* update w.r.t yinlou's comments
* Add kubernetes tutorial to Table of Contents
* Style edit
* Fix #4630, #4421: Preserve correct ordering between metrics, and always use last metric for early stopping
* Clarify semantics of early stopping in presence of multiple valid sets and metrics
* Add a test
* Fix lint
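
The early-stopping rule described above (the last metric, reported for the last validation set, decides) can be sketched in plain Python. This is a minimal illustration and not the package's implementation: `early_stopping_score` and `should_stop` are hypothetical names, and the scores are made-up values.

```python
def early_stopping_score(evaluation_results):
    """Pick the entry that drives early stopping.

    `evaluation_results` is an ordered list of (dataset, metric, score)
    tuples, kept in the order the validation sets and metrics were given.
    Assumption: the last entry is the one used for early stopping.
    """
    if not evaluation_results:
        raise ValueError("at least one evaluation result is required")
    return evaluation_results[-1]


def should_stop(history, rounds, maximize=False):
    """Return True when the best score is at least `rounds` iterations old."""
    if not history:
        return False
    best = max(history) if maximize else min(history)
    best_iteration = history.index(best)  # first iteration reaching the best score
    return len(history) - 1 - best_iteration >= rounds


# Hypothetical per-iteration results: two validation sets, two metrics each.
results = [
    ("train", "auc", 0.90),
    ("train", "logloss", 0.30),
    ("valid", "auc", 0.80),
    ("valid", "logloss", 0.50),
]
dataset, metric, score = early_stopping_score(results)
```

Keeping the results in an ordered list rather than an unordered mapping is what makes "last metric" well defined, which is the point of preserving the ordering between metrics.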
* `_maybe_pandas_xxx` functions should return their arguments unchanged if pandas is not installed
* Tests should not assume pandas is installed
* Mark tests which require pandas as such
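
The pass-through behavior described above can be sketched as follows. This is a hedged illustration only: `maybe_pandas_data` is a hypothetical stand-in for one of the `_maybe_pandas_xxx` helpers, and its signature and DataFrame handling are assumptions, not the package's code.

```python
try:
    import pandas as pd
    PANDAS_INSTALLED = True
except ImportError:
    pd = None
    PANDAS_INSTALLED = False


def maybe_pandas_data(data, feature_names, feature_types):
    """Hypothetical helper: convert a DataFrame if possible, else pass through."""
    # Without pandas, or for non-DataFrame input, return the arguments untouched.
    if not PANDAS_INSTALLED or not isinstance(data, pd.DataFrame):
        return data, feature_names, feature_types
    # Assumed behavior for DataFrames: derive names from columns when missing.
    if feature_names is None:
        feature_names = [str(column) for column in data.columns]
    return data.to_numpy(), feature_names, feature_types


# Plain Python input goes through unchanged whether or not pandas is present.
rows = [[1.0, 2.0], [3.0, 4.0]]
out_data, out_names, out_types = maybe_pandas_data(rows, None, None)
```

Because the fallback branch returns the same objects it received, callers and tests do not need pandas to exercise the non-DataFrame path.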
* Fix external memory for get column batches.
This fixes two bugs:
* Use PushCSC for get column batches.
* Don't remove the created temporary directory before finishing test.
* Check all pages.
* Add to documentation how to build native unit tests
* Add instructions to run Python tests and to use Docker container [skip ci]
* Fix link to pytest chapter
* Add link to Google Test [skip ci]
* Set PYTHONPATH [skip ci]
* Revise test_python.sh for running tests locally
* Update test_python.sh
* Place Docker recommendation notice in a prominent place [skip ci]
* Initial performance optimizations for xgboost
* remove includes
* revert float->double
* fix for CI
* Check existence of _mm_prefetch and __builtin_prefetch
* Fix lint
* optimizations for CPU
* applying comments in review
* add some comments, code refactoring
* fixing issues in CI
* adding runtime checks
* remove 1 extra check
* remove extra checks in BuildHist
* remove checks
* add debug info
* added debug info
* revert changes
* added comments
* Apply suggestions from code review
Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
* apply review comments
* Remove unused function CreateNewNodes()
* Add descriptive comment on node_idx variable in QuantileHistMaker::Builder::BuildHistsBatch()
* Implement tree model dump with a code generator.
* Split up generators.
* Implement graphviz generator.
* Use pattern matching.
* [Breaking] Return a Source in `to_graphviz` instead of Digraph in Python package.
Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
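
As a rough illustration of the generator idea above, the sketch below walks a tiny tree and emits Graphviz DOT text. It is not the generator from this change: `tree_to_dot` and the nested-dict tree layout are hypothetical, chosen only to show how a dump can be produced as source text rather than as a graph object.

```python
def tree_to_dot(node, lines=None):
    """Emit DOT statements for a small binary tree stored as nested dicts."""
    top_level = lines is None
    if top_level:
        lines = ["digraph {"]
    if "leaf" in node:
        # Leaf nodes carry only a value.
        lines.append('    {} [label="leaf={}"]'.format(node["id"], node["leaf"]))
    else:
        # Split nodes carry a feature name and a threshold.
        lines.append('    {} [label="{}<{}"]'.format(
            node["id"], node["feature"], node["threshold"]))
        for branch in ("yes", "no"):
            child = node[branch]
            lines.append('    {} -> {} [label="{}"]'.format(
                node["id"], child["id"], branch))
            tree_to_dot(child, lines)  # recurse, appending to the shared list
    if top_level:
        lines.append("}")
        return "\n".join(lines)
    return None


# Hypothetical one-split tree.
example_tree = {
    "id": 0, "feature": "f0", "threshold": 0.5,
    "yes": {"id": 1, "leaf": 1.0},
    "no": {"id": 2, "leaf": -1.0},
}
dot_text = tree_to_dot(example_tree)
```

Text produced this way can be handed to a DOT renderer as-is, which matches the direction of returning a Source (wrapped DOT text) instead of a Digraph from `to_graphviz`.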
* - do not create device vectors for the entire sparse page while computing histograms...
- while creating the compressed histogram indices, the row vector is created for the entire
sparse page batch. this is needless as we only process chunks at a time based on a slice
of the total gpu memory
- this pr will allocate only as much as required to store the appropriate row indices and the entries
* - do not dereference row_ptrs once the device_vector has been created to elide host copies of those counts
- instead, grab the entry counts directly from the sparsepage
* - set the appropriate device before freeing device memory...
- pr #4532 added a global memory tracker/logger to keep track of number of (de)allocations
and peak memory usage on a per device basis.
- this pr adds the appropriate check to make sure that the (de)allocation counts and memory usages
make sense for the device, since verbosity is typically increased on debug/non-retail builds.
* - pre-create cub allocators and reuse them
- create them once and do not resize them dynamically. we need to ensure that these allocators
are created and destroyed exactly once so that the appropriate device ids are set