[jvm-packages] [doc] Update install doc for JVM packages (#6051)

Philip Hyunsu Cho 2020-08-23 14:14:53 -07:00 committed by GitHub
parent cfced58c1c
commit 4729458a36
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 53 additions and 31 deletions

@ -65,6 +65,8 @@ This will check out the latest stable version from the Maven Central.
 For the latest release version number, please check `here <https://github.com/dmlc/xgboost/releases>`_.
+To enable the GPU algorithm (``tree_method='gpu_hist'``), use artifacts ``xgboost4j-gpu_2.12`` and ``xgboost4j-spark-gpu_2.12`` instead (note the ``gpu`` suffix).
 .. note:: Using Maven repository hosted by the XGBoost project
 There may be some delay until a new release becomes available to Maven Central. If you would like to access the latest release immediately, add the Maven repository hosted by the XGBoost project:
@ -83,6 +85,11 @@ For the latest release version number, please check `here <https://github.com/dm
 resolvers += "XGBoost4J Release Repo" at "https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/release/"
+.. note:: Windows not supported in the JVM package
+Currently, XGBoost4J-Spark does not support Windows platform, as the distributed training algorithm is inoperational for Windows. Please use Linux or MacOS.
 Access SNAPSHOT version
 -----------------------
@ -141,9 +148,8 @@ The SNAPSHOT JARs are hosted by the XGBoost project. Every commit in the ``maste
 You can browse the file listing of the Maven repository at https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/list.html.
-.. note:: Windows not supported by published JARs
-The published JARs from the Maven Central and GitHub currently only supports Linux and MacOS. Windows users should consider building XGBoost4J / XGBoost4J-Spark from the source. Alternatively, checkout pre-built JARs from `criteo-forks/xgboost-jars <https://github.com/criteo-forks/xgboost-jars>`_.
+To enable the GPU algorithm (``tree_method='gpu_hist'``), use artifacts ``xgboost4j-gpu_2.12`` and ``xgboost4j-spark-gpu_2.12`` instead (note the ``gpu`` suffix).
 Installation from source
 ========================

@ -18,11 +18,11 @@ You can find more about XGBoost on [Documentation](https://xgboost.readthedocs.o
 ## Add Maven Dependency
-XGBoost4J, XGBoost4J-Spark, etc. in maven repository is compiled with g++-4.8.5
+XGBoost4J, XGBoost4J-Spark, etc. in maven repository is compiled with g++-4.8.5.
 ### Access release version
-<b>maven</b>
+<b>Maven</b>
 ```
 <dependency>
@ -30,66 +30,82 @@ XGBoost4J, XGBoost4J-Spark, etc. in maven repository is compiled with g++-4.8.5
 <artifactId>xgboost4j_2.12</artifactId>
 <version>latest_version_num</version>
 </dependency>
-```
-<b>sbt</b>
+<dependency>
+<groupId>ml.dmlc</groupId>
+<artifactId>xgboost4j-spark_2.12</artifactId>
+<version>latest_version_num</version>
+</dependency>
+```
+<b>sbt</b>
 ```sbt
-"ml.dmlc" %% "xgboost4j" % "latest_version_num"
-```
+libraryDependencies ++= Seq(
+"ml.dmlc" %% "xgboost4j" % "latest_version_num",
+"ml.dmlc" %% "xgboost4j-spark" % "latest_version_num"
+)
+```
 For the latest release version number, please check [here](https://github.com/dmlc/xgboost/releases).
-if you want to use `xgboost4j-spark`, you just need to replace xgboost4j with `xgboost4j-spark`
+To enable the GPU algorithm (`tree_method='gpu_hist'`), use artifacts `xgboost4j-gpu_2.12` and `xgboost4j-spark-gpu_2.12` instead.
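For illustration, the GPU variant of the sbt snippet might look like the sketch below. It assumes the GPU artifacts are published under the same `ml.dmlc` group ID as the CPU ones (inferred from the artifact names above, not stated by the README) and that the project's `scalaVersion` is a 2.12.x release, so that `%%` resolves to the `_2.12` artifacts.

```sbt
// build.sbt (sketch, not part of this commit).
// Assumption: group ID "ml.dmlc" also applies to the GPU artifacts.
// With scalaVersion 2.12.x, `%%` appends "_2.12", giving
// xgboost4j-gpu_2.12 and xgboost4j-spark-gpu_2.12.
// "latest_version_num" is the README's placeholder, not a real version.
libraryDependencies ++= Seq(
  "ml.dmlc" %% "xgboost4j-gpu" % "latest_version_num",
  "ml.dmlc" %% "xgboost4j-spark-gpu" % "latest_version_num"
)
```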
 ### Access SNAPSHOT version
-You need to add github as repo:
-<b>maven</b>:
+First add the following Maven repository hosted by the XGBoost project:
+<b>Maven</b>:
 ```xml
 <repository>
-<id>GitHub Repo</id>
-<name>GitHub Repo</name>
-<url>https://raw.githubusercontent.com/CodingCat/xgboost/maven-repo/</url>
+<id>XGBoost4J Snapshot Repo</id>
+<name>XGBoost4J Snapshot Repo</name>
+<url>https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/snapshot/</url>
 </repository>
 ```
 <b>sbt</b>:
 ```sbt
-resolvers += "GitHub Repo" at "https://raw.githubusercontent.com/CodingCat/xgboost/maven-repo/"
+resolvers += "XGBoost4J Snapshot Repo" at "https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/snapshot/"
 ```
-the add dependency as following:
-<b>maven</b>
+Then add XGBoost4J as a dependency:
+<b>Maven</b>
 ```
 <dependency>
 <groupId>ml.dmlc</groupId>
 <artifactId>xgboost4j_2.12</artifactId>
-<version>latest_version_num</version>
+<version>latest_version_num-SNAPSHOT</version>
 </dependency>
-```
-<b>sbt</b>
+<dependency>
+<groupId>ml.dmlc</groupId>
+<artifactId>xgboost4j-spark_2.12</artifactId>
+<version>latest_version_num-SNAPSHOT</version>
+</dependency>
+```
+<b>sbt</b>
 ```sbt
-"ml.dmlc" %% "xgboost4j" % "latest_version_num"
-```
+libraryDependencies ++= Seq(
+"ml.dmlc" %% "xgboost4j" % "latest_version_num-SNAPSHOT",
+"ml.dmlc" %% "xgboost4j-spark" % "latest_version_num-SNAPSHOT"
+)
+```
-For the latest release version number, please check [here](https://github.com/CodingCat/xgboost/tree/maven-repo/ml/dmlc/xgboost4j_2.12).
-if you want to use `xgboost4j-spark`, you just need to replace xgboost4j with `xgboost4j-spark`
+For the latest release version number, please check [the repository listing](https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/list.html).
+To enable the GPU algorithm (`tree_method='gpu_hist'`), use artifacts `xgboost4j-gpu_2.12` and `xgboost4j-spark-gpu_2.12` instead.
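Putting the snapshot resolver and the GPU artifact names together, a complete sbt fragment could look roughly like this. It is a sketch under two assumptions that the README implies but does not spell out: that GPU SNAPSHOT builds are published to the same snapshot repository, and that they use the `ml.dmlc` group ID.

```sbt
// build.sbt (sketch, not part of this commit).
// Resolver URL copied from the SNAPSHOT instructions above.
resolvers += "XGBoost4J Snapshot Repo" at "https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/snapshot/"

// Single `%` with the Scala suffix written out, so the artifact IDs match
// the names in the README exactly. Under scalaVersion 2.12.x this is
// equivalent to using `%%` with "xgboost4j-gpu" / "xgboost4j-spark-gpu".
// "latest_version_num-SNAPSHOT" is the README's placeholder version.
libraryDependencies ++= Seq(
  "ml.dmlc" % "xgboost4j-gpu_2.12" % "latest_version_num-SNAPSHOT",
  "ml.dmlc" % "xgboost4j-spark-gpu_2.12" % "latest_version_num-SNAPSHOT"
)
```

Writing the suffix out by hand pins the artifacts to Scala 2.12 regardless of the project's `scalaVersion`, which may or may not be what you want.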
 ## Examples
 Full code examples for Scala, Java, Apache Spark, and Apache Flink can
 be found in the [examples package](https://github.com/dmlc/xgboost/tree/master/jvm-packages/xgboost4j-example).
 **NOTE on LIBSVM Format**:
 There is an inconsistent issue between XGBoost4J-Spark and other language bindings of XGBoost.
 When users use Spark to load trainingset/testset in LibSVM format with the following code snippet:
@ -108,7 +124,7 @@ You can build/package xgboost4j locally with the following steps:
 2. Clone this repo: `git clone --recursive https://github.com/dmlc/xgboost.git`
 3. Run the following command:
 - With Tests: `./xgboost/jvm-packages/dev/build-linux.sh`
 - Skip Tests: `./xgboost/jvm-packages/dev/build-linux.sh --skip-tests`
 **Windows:**
 1. Ensure [Docker for Windows](https://docs.docker.com/docker-for-windows/install/) is installed.