81 Commits

Author SHA1 Message Date
Philip Hyunsu Cho
82d846bbeb
Update change_scala_version.py to also change scala.version property (#9897) 2023-12-18 23:49:41 -08:00
Philip Hyunsu Cho
71d330afdc
Bump version to 2.0.3 (#9895) 2023-12-14 17:54:05 -08:00
Philip Hyunsu Cho
3acbd8692b
[jvm-packages] Fix POM for xgboost-jvm metapackage (#9893)
* [jvm-packages] Fix POM for xgboost-jvm metapackage

* Add script for updating the Scala version
2023-12-14 16:50:34 -08:00
Philip Hyunsu Cho
41ce8f28b2
[jvm-packages] Add Scala version suffix to xgboost-jvm package (#9776)
* Update JVM script (#9714)

* Bump version to 2.0.2; revamp pom.xml

* Update instructions in prepare_jvm_release.py

* Fix formatting
2023-11-08 10:17:26 -08:00
Jiaming Yuan
58aa98a796
Bump version to 2.0.1. (#9660) 2023-10-13 08:47:32 +08:00
Jiaming Yuan
096047c547
Make 2.0 release. (#9567) 2023-09-12 00:20:49 +08:00
Jiaming Yuan
4301558a57
Make 2.0.0 RC1. (#9492) 2023-08-17 16:16:51 +08:00
Jiaming Yuan
f4fb2be101
[jvm-packages] Add the new device parameter. (#9385) 2023-07-17 18:40:39 +08:00
Boris
a01df102c9
Scala 2.13 support. (#9099)
1. Updated the test logic
2. Added smoke tests for Spark examples.
3. Added integration tests for Spark with Scala 2.13
2023-05-27 19:34:02 +08:00
Jiaming Yuan
1f9a57d17b
[Breaking] Require format to be specified in input URI. (#9077)
Previously, we use `libsvm` as default when format is not specified. However, the dmlc
data parser is not particularly robust against errors, and the most common type of error
is undefined format.

Along with which, we will recommend users to use other data loader instead. We will
continue the maintenance of the parsers as it's currently used for many internal tests
including federated learning.
2023-04-28 19:45:15 +08:00
Boris
0e7377ba9c
Updated flink 1.8 -> 1.17. Added smoke tests for Flink (#9046) 2023-04-26 18:41:11 +08:00
Jiaming Yuan
26209a42a5
Define git attributes for renormalization. (#8921) 2023-03-16 02:43:11 +08:00
dependabot[bot]
4a99c9bdb8
Bump commons-lang3 from 3.9 to 3.12.0 in /jvm-packages (#8548)
Bumps commons-lang3 from 3.9 to 3.12.0.

---
updated-dependencies:
- dependency-name: org.apache.commons:commons-lang3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-12-06 20:13:46 +08:00
Philip Hyunsu Cho
2546d139d6
[jvm-packages] Add missing commons-lang3 dependency to xgboost4j-gpu (#8508)
* [jvm-packages] Add missing commons-lang3 dependency to xgboost4j-gpu

* Update commons-lang3
2022-12-01 16:27:11 -08:00
Jiaming Yuan
f73520bfff
Bump development version to 2.0. (#8390) 2022-10-28 15:21:19 +08:00
Jiaming Yuan
f835368bcf
Mark next release as 1.7 instead of 2.0 (#8281) 2022-09-28 14:33:37 +08:00
Bobby Wang
9fa7ed1743
[Breaking][jvm-packages] remove timeoutRequestWorkers parameter (#7839) 2022-05-13 16:26:25 +08:00
Jiaming Yuan
522636cb52
Bump version. (#7769) 2022-03-31 06:33:22 +08:00
Jiaming Yuan
f7caac2563
Bump version to 1.6.0 in master. (#7259) 2021-10-07 16:09:26 +08:00
Jiaming Yuan
fbd58bf190
[jvm-packages] Create demo and test for xgboost4j early stopping. (#7252) 2021-09-25 03:29:27 +08:00
Jiaming Yuan
146549260a
Bump version to 1.5.0 snapshot in master. (#6875) 2021-04-22 01:53:44 +08:00
Philip Hyunsu Cho
0d483cb7c1
Bump version to 1.4.0 snapshot in master (#6486) 2020-12-10 07:38:08 -08:00
Philip Hyunsu Cho
b3193052b3
Bump version to 1.3.0 snapshot in master (#6052) 2020-08-23 17:13:46 -07:00
Bobby Wang
8943eb4314
[BLOCKING] [jvm-packages] add gpu_hist and enable gpu scheduling (#5171)
* [jvm-packages] add gpu_hist tree method

* change updater hist to grow_quantile_histmaker

* add gpu scheduling

* pass correct parameters to xgboost library

* remove debug info

* add use.cuda for pom

* add CI for gpu_hist for jvm

* add gpu unit tests

* use gpu node to build jvm

* use nvidia-docker

* Add CLI interface to create_jni.py using argparse

Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-07-26 21:53:24 -07:00
Philip Hyunsu Cho
073b625bde
Bump version to 1.2.0 snapshot in master (#5733) 2020-05-31 00:11:34 -07:00
Philip Hyunsu Cho
7ac7e8778f
Port patches from 1.0.0 branch (#5336)
* Remove f-string, since it's not supported by Python 3.5 (#5330)

* Remove f-string, since it's not supported by Python 3.5

* Add Python 3.5 to CI, to ensure compatibility

* Remove duplicated matplotlib

* Show deprecation notice for Python 3.5

* Fix lint

* Fix lint

* Fix a unit test that mistook MINOR ver for PATCH ver

* Enforce only major version in JSON model schema

* Bump version to 1.1.0-SNAPSHOT
2020-02-21 13:13:21 -08:00
Jiaming Yuan
010b8f1428 Revert "[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4876)" (#4965)
This reverts commit 86ed01c4bbecef66e1bc4d02fb13116bd6130fae.
2019-10-18 14:02:35 -07:00
Chen Qin
86ed01c4bb [jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4876)
* Expose sets of rabit configurations to spark layer
2019-10-18 15:07:31 -04:00
Nan Zhu
1595e3f57b
upgrade version num (#4670)
* upgrade version num

* missign changes

* fix version script

* change versions

* rm files

* Update CMakeLists.txt
2019-07-17 15:25:35 -07:00
koertkuipers
3c506b076e [jvm-packages] upgrade to Scala 2.12 (#4574)
* bump scala to 2.12 which requires java 8 and also newer flink and akka

* put scala version in artifactId

* fix appveyor

* fix for scaladoc issue that looks like https://github.com/scala/bug/issues/10509

* fix ci_build

* update versions in generate_pom.py

* fix generate_pom.py

* apache does not have a download for spark 2.4.3 distro using scala 2.12 yet, so for now i use a tgz i put on s3

* Upload spark-2.4.3-bin-scala2.12-hadoop2.7.tgz to our own S3

* Update Dockerfile.jvm_cross

* Update Dockerfile.jvm_cross
2019-07-16 08:43:34 -07:00
Philip Hyunsu Cho
515f5f5c47
[RFC] Version 0.90 release candidate (#4475)
* Release 0.90

* Add script to automatically generate acknowledgment

* Update NEWS.md
2019-05-20 01:02:44 -07:00
Philip Hyunsu Cho
ea850ecd20
[CI] Refactor Jenkins CI pipeline + migrate all Linux tests to Jenkins (#4401)
* All Linux tests are now in Jenkins CI
* Tests are now de-coupled from builds. We can now build XGBoost with one version of CUDA/JDK and test it with another version of CUDA/JDK
* Builds (compilation) are significantly faster because 1) They use C5 instances with faster CPU cores; and 2) build environment setup is cached using Docker containers
2019-04-26 18:39:12 -07:00
Nan Zhu
5f34078fba
[jvm-packages] bump version for master (#4209)
* update version

* bump version
2019-03-04 23:12:24 -08:00
Nan Zhu
ae3bb9c2d5
Distributed Fast Histogram Algorithm (#4011)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* init

* allow hist algo

* more changes

* temp

* update

* remove hist sync

* udpate rabit

* change hist size

* change the histogram

* update kfactor

* sync per node stats

* temp

* update

* final

* code clean

* update rabit

* more cleanup

* fix errors

* fix failed tests

* enforce c++11

* fix lint issue

* broadcast subsampled feature correctly

* revert some changes

* fix lint issue

* enable monotone and interaction constraints

* don't specify default for monotone and interactions

* update docs
2019-02-05 05:12:53 -08:00
Nan Zhu
c055a32609
[jvm-packages]support multiple validation datasets in Spark (#3910)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* wrap iterators

* enable copartition training and validationset

* add parameters

* converge code path and have init unit test

* enable multi evals for ranking

* unit test and doc

* update example

* fix early stopping

* address the offline comments

* udpate doc

* test eval metrics

* fix compilation issue

* fix example
2018-12-17 21:03:57 -08:00
Nan Zhu
dc2bfbfde1
[jvm-packages] update version to 0.82-SNAPSHOT (#3920)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* update version

* 0.82
2018-11-18 16:47:48 -08:00
ajing
0ddb8a7661 Update README.md (#3872)
SparkWithDataFrame was not there anymore. So replace with SparkMLlibPipeline.scala
2018-11-12 11:03:13 -08:00
Philip Hyunsu Cho
78ec77fa97
Release 0.81 version (#3864)
* Release 0.81 version

* Update NEWS.md
2018-11-04 05:49:11 -08:00
Nan Zhu
79d854c695
[jvm-packages] fix errors in example (#3719)
* add back train method but mark as deprecated

* fix scalastyle error

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* instrumentation

* use log console

* better measurement

* fix erros in example

* update histmaker
2018-09-22 16:39:38 -07:00
Philip Hyunsu Cho
6288f6d563 Update JVM packages version to 0.81-SNAPSHOT (#3584) 2018-08-13 10:17:52 -07:00
Philip Hyunsu Cho
96826a3515
Release version 0.80 (#3541)
* Up versions

* Write release note for 0.80
2018-08-13 01:38:37 -07:00
Nan Zhu
31d1baba3d [jvm-packages] Tutorial of XGBoost4J-Spark (#3534)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* add new

* update doc

* finish Gang Scheduling

* more

* intro

* Add sections: Prediction, Model persistence and ML pipeline.

* Add XGBoost4j-Spark MLlib pipeline example

* partial finished version

* finish the doc

* adjust code

* fix the doc

* use rst

* Convert XGBoost4J-Spark tutorial to reST

* Bring XGBoost4J up to date

* add note about using hdfs

* remove duplicate file

* fix descriptions

* update doc

* Wrap HDFS/S3 export support as a note

* update

* wrap indexing_mode example in code block
2018-08-03 21:17:50 -07:00
Nan Zhu
e2f09db77a
[jvm-packages] minor fix for parameter name in example (#3507) 2018-07-25 19:57:40 -07:00
Yanbo Liang
2c4359e914 [jvm-packages] XGBoost Spark integration refactor (#3387)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* [jvm-packages] XGBoost Spark integration refactor. (#3313)

* XGBoost Spark integration refactor.

* Make corresponding update for xgboost4j-example

* Address comments.

* [jvm-packages] Refactor XGBoost-Spark params to make it compatible with both XGBoost and Spark MLLib (#3326)

* Refactor XGBoost-Spark params to make it compatible with both XGBoost and Spark MLLib

* Fix extra space.

* [jvm-packages] XGBoost Spark supports ranking with group data. (#3369)

* XGBoost Spark supports ranking with group data.

* Use Iterator.duplicate to prevent OOM.

* Update CheckpointManagerSuite.scala

* Resolve conflicts
2018-06-18 15:39:18 -07:00
Nan Zhu
f66731181f
Update 0.8 version num (#3358)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* update 0.80
2018-06-02 07:06:01 -07:00
Nan Zhu
e1f57b4417
[jvm-packages] scripts to cross-build and deploy artifacts to github (#3276)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* cross building files

* update

* build with docker

* remove

* temp

* update build script

* update pom

* update

* update version

* upload build

* fix path

* update README.md

* fix compiler version to 4.8.5
2018-04-28 07:41:30 -07:00
Yanbo Liang
4850f67b85 Fix broken link for xgboost-spark example. (#3275) 2018-04-26 06:45:01 -07:00
Nan Zhu
25b2919c44
[jvm-packages] change version of jvm to keep consistent with other pkgs (#3253)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* change version of jvm to keep consistent with other pkgs
2018-04-19 20:48:50 -07:00
Nan Zhu
14c6392381
[jvm-packages] add dev script to update version and update versions (#2998)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* add dev script to update version and update versions
2018-01-01 21:28:53 -08:00
Sergei Lebedev
69c3b78a29 [jvm-packages] Implemented early stopping (#2710)
* Allowed subsampling test from the training data frame/RDD

The implementation requires storing 1 - trainTestRatio points in memory
to make the sampling work.

An alternative approach would be to construct the full DMatrix and then
slice it deterministically into train/test. The peak memory consumption
of such scenario, however, is twice the dataset size.

* Removed duplication from 'XGBoost.train'

Scala callers can (and should) use names to supply a subset of
parameters. Method overloading is not required.

* Reuse XGBoost seed parameter to stabilize train/test splitting

* Added early stopping support to non-distributed XGBoost

Closes #1544

* Added early-stopping to distributed XGBoost

* Moved construction of 'watches' into a separate method

This commit also fixes the handling of 'baseMargin' which previously
was not added to the validation matrix.

* Addressed review comments
2017-09-29 12:06:22 -07:00