Files

Harry Braviner b374e0a7ab [jvm-packages] Allow supression of Rabit output in Booster::train in xgboost4j (#4262 )

* Make train in xgboost4j respect print params

Previously no setting in params argument of Booster::train would prevent
the Rabit.trackerPrint call. This can fill up a lot of screen space in
the case that many folds are being trained.
* Setting "silent" in this map to "true", "True", a non-zero integer, or
  a string that can be parsed to such an int will prevent printing.
* Setting "verbose_eval" to "False" or "false" will prevent printing.
* Setting "verbose_eval" to an int (or a String parseable to an int) n
  will result in printing every n steps, or no printing if n is zero.

This is to match the python behaviour described here:
https://www.kaggle.com/c/rossmann-store-sales/discussion/17499

* Fixed 'slient' typo in xgboost4j test

* private access on two methods

2019-03-21 18:25:12 +08:00

dev

Separate Depthwidth and Lossguide growing policy in fast histogram (#4102 )

2019-02-13 12:56:19 -08:00

xgboost4j

[jvm-packages] Allow supression of Rabit output in Booster::train in xgboost4j (#4262 )

2019-03-21 18:25:12 +08:00

xgboost4j-example

[jvm-packages] bump version for master (#4209 )

2019-03-04 23:12:24 -08:00

xgboost4j-flink

[jvm-packages] bump version for master (#4209 )

2019-03-04 23:12:24 -08:00

xgboost4j-spark

[jvm-packages] logging version number (#4271 )

2019-03-21 18:24:29 +08:00

.gitignore

[DIST] Enable multiple thread and tracker, make rabit and xgboost more thread-safe by using thread local variables.

2016-03-03 20:36:14 -08:00

build_doc.sh

[BLOCKING] Adding JVM doc build to Jenkins CI (#3567 )

2018-08-09 13:27:01 -07:00

checkstyle-suppressions.xml

[jvm-packages] Fixed checkstyle excludes on Windows (#2370 )

2017-06-02 10:14:13 -07:00

checkstyle.xml

apply google-java-style indentation and impose import orders....

2016-03-03 12:59:18 -05:00

create_jni.py

Correct JVM CMake GPU flag. (#4071 )

2019-01-21 20:36:38 +08:00

pom.xml

[jvm-packages] logging version number (#4271 )

2019-03-21 18:24:29 +08:00

README.md

[jvm-packages] a better explanation about the inconsistent issue (#3524 )

2018-07-28 17:34:39 -07:00

scalastyle-config.xml

sketch of xgboost-spark

2016-03-05 08:44:55 -05:00

README.md

XGBoost4J: Distributed XGBoost for Scala/Java

Documentation | Resources | Release Notes

XGBoost4J is the JVM package of XGBoost. It brings all the optimizations and power of XGBoost to the JVM ecosystem.

Train XGBoost models in Scala and Java with easy customizations.
Run distributed XGBoost natively on JVM frameworks such as Apache Flink and Apache Spark.

You can find more about XGBoost on the Documentation and Resources pages.

Add Maven Dependency

The XGBoost4J, XGBoost4J-Spark, etc. artifacts in the Maven repository are compiled with g++-4.8.5.

Access release version

maven

<dependency>
    <groupId>ml.dmlc</groupId>
    <artifactId>xgboost4j</artifactId>
    <version>latest_version_num</version>
</dependency>

sbt

 "ml.dmlc" % "xgboost4j" % "latest_version_num"

For the latest release version number, please check here.

If you want to use xgboost4j-spark, replace xgboost4j with xgboost4j-spark in the artifact ID.

Access SNAPSHOT version

You need to add the GitHub repository below to your build:

maven:

<repository>
  <id>GitHub Repo</id>
  <name>GitHub Repo</name>
  <url>https://raw.githubusercontent.com/CodingCat/xgboost/maven-repo/</url>
</repository>

sbt:

resolvers += "GitHub Repo" at "https://raw.githubusercontent.com/CodingCat/xgboost/maven-repo/"

Then add the dependency as follows:

maven

<dependency>
    <groupId>ml.dmlc</groupId>
    <artifactId>xgboost4j</artifactId>
    <version>latest_version_num</version>
</dependency>

sbt

 "ml.dmlc" % "xgboost4j" % "latest_version_num"

For the latest release version number, please check here.

If you want to use xgboost4j-spark, replace xgboost4j with xgboost4j-spark in the artifact ID.

Examples

Full code examples for Scala, Java, Apache Spark, and Apache Flink can be found in the examples package.

NOTE on LIBSVM Format:

There is an inconsistency between XGBoost4J-Spark and the other language bindings of XGBoost.

When you use Spark to load a training set or test set in LibSVM format with the following code snippet:

spark.read.format("libsvm").load("trainingset_libsvm")

Spark assumes that the feature indices in the dataset are 1-based. However, when you predict with other bindings of XGBoost (e.g. the Python API), XGBoost assumes that the indices are 0-based. This creates a pitfall for users who train a model with Spark but predict on a dataset in the same format with other bindings of XGBoost: the feature indices end up off by one between training and prediction.
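
To make the off-by-one concrete, here is a minimal sketch of how a 1-based LibSVM line could be rewritten with 0-based indices before it is handed to a binding that expects 0-based input. The class `LibSvmIndexShift` and the method `shiftIndices` are hypothetical names used for illustration only; they are not part of XGBoost4J. The sketch assumes every token after the label has the plain `index:value` form.

```java
import java.util.StringJoiner;

public class LibSvmIndexShift {

    // Hypothetical helper (not an XGBoost4J API): rewrites one LibSVM line,
    // adding `delta` to every feature index. Use delta = -1 to turn a
    // 1-based line into a 0-based one. Assumes plain "index:value" tokens.
    static String shiftIndices(String line, int delta) {
        String[] tokens = line.trim().split("\\s+");
        StringJoiner out = new StringJoiner(" ");
        out.add(tokens[0]); // the label is copied unchanged
        for (int i = 1; i < tokens.length; i++) {
            int colon = tokens[i].indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Not an index:value token: " + tokens[i]);
            }
            int shifted = Integer.parseInt(tokens[i].substring(0, colon)) + delta;
            if (shifted < 0) {
                throw new IllegalArgumentException("Negative feature index after shift: " + tokens[i]);
            }
            // keep the ":value" part exactly as written in the input
            out.add(shifted + tokens[i].substring(colon));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A 1-based line as Spark would expect it, rewritten to 0-based.
        System.out.println(shiftIndices("1 1:0.5 3:2.0", -1)); // prints "1 0:0.5 2:2.0"
    }
}
```

Whichever convention you pick, the point is to apply the same one to the files used for training and for prediction, so that each feature lands in the same column on both sides.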