* Work around a compiler bug in MacOS AppleClang
* [CI] Run C++ test with MacOS Catalina + AppleClang 11.0.3
* [CI] Migrate cmake_test on MacOS from Travis CI to GitHub Actions
* Install OpenMP runtime
* [CI] Use CMake to locate lz4 lib
* Add getNumFeature to the Java API
* Add getNumFeature to the Scala API
* Add unit tests for getNumFeature
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
* Fix CMake build with BUILD_STATIC_LIB option
* Disable BUILD_STATIC_LIB option when R/JVM pkg is enabled
* Add objxgboost to install target only when BUILD_STATIC_LIB=ON
* cancel job instead of killing SparkContext
This PR changes the default behavior of killing the SparkContext when a task
fails. Instead, it cancels the affected jobs, so the SparkContext stays alive
even when exceptions occur.
* add a parameter to control whether to kill the SparkContext
* cancel the jobs the failed task belongs to
* remove the jobId from the map when a job fails
* resolve comments
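The control flow described in this change can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the actual JVM-package code: `FakeContext`, `TaskFailureHandler`, and the `kill_context_on_failure` flag are hypothetical names chosen for this example.

```python
class FakeContext:
    """Hypothetical stand-in for a SparkContext; it only records calls."""

    def __init__(self):
        self.cancelled_jobs = []
        self.stopped = False

    def cancel_job(self, job_id):
        self.cancelled_jobs.append(job_id)

    def stop(self):
        self.stopped = True


class TaskFailureHandler:
    """Cancels the job that owns a failed task instead of stopping the context."""

    def __init__(self, context, kill_context_on_failure=False):
        self.context = context
        # Hypothetical flag that restores the old "kill everything" behavior.
        self.kill_context_on_failure = kill_context_on_failure
        self.job_of_stage = {}  # stage id -> job id

    def on_job_start(self, job_id, stage_ids):
        for stage_id in stage_ids:
            self.job_of_stage[stage_id] = job_id

    def on_task_failed(self, stage_id):
        if self.kill_context_on_failure:
            self.context.stop()
            return
        job_id = self.job_of_stage.get(stage_id)
        if job_id is None:
            return  # unknown stage: nothing to cancel
        self.context.cancel_job(job_id)
        # Remove every stage of the failed job from the map.
        self.job_of_stage = {
            s: j for s, j in self.job_of_stage.items() if j != job_id
        }


ctx = FakeContext()
handler = TaskFailureHandler(ctx)
handler.on_job_start(1, [10, 11])
handler.on_job_start(2, [20])
handler.on_task_failed(11)  # cancels job 1 only; the context stays alive
```

With the flag left at its default, only the job that owns the failed task is cancelled and other jobs keep their entries in the map.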
We propose using only the rowHashCode to compute the partitionKey. Adding the FeatureValue hashCode brings no additional value and makes the computation slower. Even though collisions would occur at a rate of 0.2% with MurmurHash3, this is acceptable for partitioning and has no impact on data balancing.
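As a rough illustration of the proposal, the sketch below derives a partition key from a row hash alone. It is written in Python rather than Scala, and it uses CRC32 of the row's `repr` as a stand-in for the row hashCode (an assumption; the source refers to MurmurHash3). The names `row_hash` and `partition_key` are hypothetical.

```python
import zlib
from collections import Counter


def row_hash(row):
    # Assumption: CRC32 of the row's repr stands in for the row hashCode.
    return zlib.crc32(repr(row).encode("utf-8"))


def partition_key(row, num_partitions):
    # The key depends on the row hash only; no feature value is mixed in.
    return row_hash(row) % num_partitions


rows = [(float(i), float(i % 7)) for i in range(1000)]
sizes = Counter(partition_key(row, 4) for row in rows)
```

An occasional hash collision only means that two rows share a partition, which is harmless as long as the keys are spread across partitions.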
* Modin DF support
* mode change
* tests were added, ci env was extended
* mode change
* Remove redundant installation of modin
* Add a pytest skip marker for modin
* Install Modin[ray] from PyPI
* fix interfering
* avoid extra conversion
* delete cv test for modin
* revert cv function
Co-authored-by: ShvetsKS <kirill.shvets@intel.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
* Update GPUTreeShap
* Update src/CMakeLists.txt
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
* [CI] Improve JVM test in GitHub Actions
* Use env var for Wagon options [skip ci]
* Move the retry flag to pom.xml [skip ci]
* Export env var RABIT_MOCK to run Spark tests [skip ci]
* Correct location of env var
* Re-try up to 5 times [skip ci]
* Don't run distributed training test on Windows
* Fix typo
* Update main.yml
* Fix a unit test on CLI, to handle RC versions
* [CI] Use mgpu machine to run gpu hist unit tests
* [CI] Build GPU-enabled JAR artifact and deploy to xgboost-maven-repo
* [CI] Move lint to GitHub Actions
* [CI] Move Doxygen to GitHub Actions
* [CI] Move Sphinx build test to GitHub Actions
* [CI] Reduce workload for Windows R tests
* [CI] Move clang-tidy to Build stage
The functions featureValueOfSparseVector and featureValueOfDenseVector could return Float.NaN if the input vector contained any missing values. This made the partition key computation fail, and most of the vectors ended up in the same partition. We fix this by no longer returning NaN and simply using the row hashCode in this case.
We added a test to ensure that the repartition is now uniform on an input dataset containing missing values, by checking that the variance of the partition sizes is below a certain threshold.
Signed-off-by: Anthony D'Amato <anthony.damato@hotmail.fr>
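The failure mode and the fix described above can be reproduced in a small sketch. This is a Python illustration under assumptions, not the Scala code from the change: CRC32 stands in for the hash functions, `key_mixing_feature_value` is a hypothetical reconstruction of a key that mixes in a feature value, and `NUM_PARTITIONS` is an arbitrary choice.

```python
import zlib
from collections import Counter
from statistics import pvariance

NUM_PARTITIONS = 8  # arbitrary value for this illustration


def row_hash(row):
    # Assumption: CRC32 of the row's repr stands in for the row hashCode.
    return zlib.crc32(repr(row).encode("utf-8"))


def key_mixing_feature_value(row):
    # Hypothetical old-style key: adding a NaN feature value turns the
    # whole sum into NaN, so every such row hashes to the same key.
    mixed = row_hash(row) + row[0]
    return zlib.crc32(str(mixed).encode("utf-8")) % NUM_PARTITIONS


def key_from_row_hash_only(row):
    # Fixed key: ignore the feature value and use the row hash alone.
    return row_hash(row) % NUM_PARTITIONS


def partition_sizes(rows, key_fn):
    counts = Counter(key_fn(row) for row in rows)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]


# Every row has a missing (NaN) first feature.
rows_with_missing = [(float("nan"), float(i)) for i in range(1000)]
sizes_before = partition_sizes(rows_with_missing, key_mixing_feature_value)
sizes_after = partition_sizes(rows_with_missing, key_from_row_hash_only)
variance_before = pvariance(sizes_before)
variance_after = pvariance(sizes_after)
```

A uniformity test in this style compares `variance_after` against a threshold; the threshold value itself is a design choice that the commit message does not state.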
* add SHAP summary plot using ggplot2
* Update xgb.plot.shap
* Update example in xgb.plot.shap documentation
* update logic, add tests
* whitespace fixes
* whitespace fixes for test_helpers
* namespace for sd function
* explicitly declare variables that are automatically evaluated by data.table
* Fix R lint
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
* fixed some endian issues
* Use dmlc::ByteSwap() to simplify code
* Fix lint check
* [CI] Add test for s390x
* Download latest CMake on s390x
* Fix a bug in my code
* Save magic number in dmatrix with byteswap on big-endian machine
* Save version in binary with byteswap on big-endian machine
* Load scalar with byteswap in MetaInfo
* Add a debugging message
* Handle arrays correctly when byteswapping
* EOF can also be 255
* Handle magic number in MetaInfo carefully
* Skip Tree.Load test for big-endian, since the test manually builds little-endian binary model
* Handle missing packages in Python tests
* Don't use boto3 in model compatibility tests
* Add s390 Docker file for local testing
* Add model compatibility tests
* Add R compatibility test
* Revert "Add R compatibility test"
This reverts commit c2d2bdcb7dbae133cbb927fcd20f7e83ee2b18a8.
Co-authored-by: Qi Zhang <q.zhang@ibm.com>
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
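The big-endian fixes listed above (saving scalars such as the magic number and version with a byteswap, and swapping arrays element by element) follow a common pattern, sketched here in Python. The source uses `dmlc::ByteSwap()` in C++; the helper names below are hypothetical, and the sample value `0x01020304` is a placeholder, not the real magic number.

```python
import struct
import sys


def byteswap_scalar(raw: bytes) -> bytes:
    """Reverse the bytes of a single scalar."""
    return raw[::-1]


def byteswap_array(raw: bytes, item_size: int) -> bytes:
    """Reverse bytes within each element, keeping the element order."""
    if len(raw) % item_size != 0:
        raise ValueError("buffer length is not a multiple of item_size")
    return b"".join(
        raw[i:i + item_size][::-1] for i in range(0, len(raw), item_size)
    )


def save_u32_little_endian(value: int) -> bytes:
    """Serialize a 32-bit unsigned scalar as little-endian on any host."""
    native = struct.pack("=I", value)  # host byte order
    if sys.byteorder == "big":
        return byteswap_scalar(native)
    return native


encoded = save_u32_little_endian(0x01020304)  # placeholder value
swapped = byteswap_array(bytes([1, 2, 3, 4, 5, 6, 7, 8]), item_size=4)
```

Swapping a whole array buffer at once would also reverse the element order, which is why arrays need the per-element variant.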