31 Commits

Author SHA1 Message Date
Jiaming Yuan
f4eb6b984e
[backport] jvm-packages 1.6.1 (#7849)
* [jvm-packages] move the dmatrix building into rabit context (#7823)

This fixes the QuantileDeviceDMatrix in distributed environment.

* [doc] update the jvm tutorial to 1.6.1 [skip ci] (#7834)

* [Breaking][jvm-packages] Use barrier execution mode (#7836)

With the introduction of the barrier execution mode. we don't need to kill SparkContext when some xgboost tasks failed. Instead, Spark will handle the errors for us. So in this PR, `killSparkContextOnWorkerFailure` parameter is deleted.

* [doc] remove the doc about killing SparkContext [skip ci] (#7840)

* [jvm-package] remove the coalesce in barrier mode (#7846)

* [jvm-packages] Fix model compatibility (#7845)

* Ignore all Java exceptions when looking for Linux musl support (#7844)

Co-authored-by: Bobby Wang <wbo4958@gmail.com>
Co-authored-by: Michael Allman <msa@allman.ms>
2022-04-29 17:20:58 +08:00
Jiaming Yuan
ece4dc457b
[backport] Backport jvm changes to 1.6. (#7803)
* [doc] improve xgboost4j-spark-gpu doc [skip ci] (#7793)


Co-authored-by: Sameer Raheja <sameerz@users.noreply.github.com>

* [jvm-packages] fix evaluation when featuresCols is used (#7798)

Co-authored-by: Bobby Wang <wbo4958@gmail.com>
Co-authored-by: Sameer Raheja <sameerz@users.noreply.github.com>
2022-04-13 17:35:29 +08:00
Jiaming Yuan
48aff0eabd
[doc][jvm-packages] Update information about Python tracker. [skip ci] (#7396) 2021-11-05 05:55:13 +08:00
Jiaming Yuan
376b448015
[doc] Fix broken links. (#7341)
* Fix most of the link checks from sphinx.
* Remove duplicate explicit target name.
2021-10-20 14:45:30 +08:00
Jiaming Yuan
fbd58bf190
[jvm-packages] Create demo and test for xgboost4j early stopping. (#7252) 2021-09-25 03:29:27 +08:00
Andrew Ziem
3e7e426b36
Fix spelling in documents (#6948)
* Update roxygen2 doc.

Co-authored-by: fis <jm.yuan@outlook.com>
2021-05-11 20:44:36 +08:00
Jiaming Yuan
74b41637de
Revert "[jvm-packages] Add XGBOOST_RABIT_TRACKER_IP_FOR_TEST to set rabit tracker IP. (#6869)" (#6886)
This reverts commit 2828da3c4c951baa45d1bb6f85c7b3a6657cd607.
2021-04-21 11:20:10 -07:00
Bobby Wang
2828da3c4c
[jvm-packages] Add XGBOOST_RABIT_TRACKER_IP_FOR_TEST to set rabit tracker IP. (#6869)
* Add `XGBOOST_RABIT_TRACKER_IP_FOR_TEST` to set rabit tracker IP

* change spark and rabit tracker IP to 127.0.0.1on GitHub Action.

Co-authored-by: fis <jm.yuan@outlook.com>
2021-04-22 02:00:22 +08:00
Naveed Ahmed Saleem Janvekar
608bda7052
[jvm-packages] add example to handle missing value other than 0 (#5677)
add example to handle missing value other than 0 under Dealing with missing values section
2020-10-28 17:24:35 -07:00
Bobby Wang
00b0ad1293
[Doc] add doc for kill_spark_context_on_worker_failure parameter (#6097)
* [Doc] add doc for kill_spark_context_on_worker_failure parameter

* resolve comments
2020-09-09 21:28:44 -07:00
Philip Hyunsu Cho
1b1969f20d
[jvm-packages] [CI] Create a Maven repository to host SNAPSHOT JARs (#5533) 2020-04-14 19:33:32 -07:00
cpfarrell
9049c7c653 Add new lines for Spark XGBoost missing values section (#5180) 2020-01-07 12:14:16 +08:00
cpfarrell
bc9d88259f [jvm-packages] Allow for bypassing spark missing value check (#4805)
* Allow for bypassing spark missing value check

* Update documentation for dealing with missing values in spark xgboost
2019-12-18 10:48:20 -08:00
Jiaming Yuan
9fc681001a
Copy CMake parameter from dmlc-core. (#4948) 2019-10-17 23:46:32 -04:00
Oleksandr Pryimak
516955564b Update xgboost-spark doc (#4804) 2019-08-27 10:51:00 -07:00
Jiaming Yuan
9494950ee7 Address some sphinx warnings and errors, add doc for building doc. (#4589) 2019-06-20 15:07:36 -07:00
Nan Zhu
adcd8ea7c6 Update xgboost4j_spark_tutorial.rst (#4476) 2019-05-17 04:17:57 +00:00
Shaochen Shi
18e4fc3690 [jvm-packages] Automatically set maximize_evaluation_metrics if not explicitly given in XGBoost4J-Spark (#4446)
* Automatically set maximize_evaluation_metrics if not explicitly given.

* When custom_eval is set, require maximize_evaluation_metrics.

* Update documents on early stop in XGBoost4J-Spark.

* Fix code error.
2019-05-09 12:49:44 -07:00
Philip Hyunsu Cho
ade3f30237
Fix list formatting in missing value tutorial in XGBoost4J-Spark 2019-05-06 14:24:02 -07:00
Philip Hyunsu Cho
b511638ca1
Fix list formatting in missing value tutorial in XGBoost4J-Spark 2019-05-06 14:21:49 -07:00
Daniel Hen
eabcc0e210 [jvm-packages] Tutorial on handling missing values (#4425)
Add tutorial on missing values and how to handle those within XGBoost.
2019-05-06 13:57:18 -07:00
Nan Zhu
65db8d0626
[jvm-packages] support spark 2.4 and compatibility test with previous xgboost version (#4377)
* bump spark version

* keep float.nan

* handle brokenly changed name/value

* add test

* add model files

* add model files

* update doc
2019-04-17 11:33:13 -07:00
Adam November
0c1d5f1120 Fix snapshot artifact name in docs. (#4196) 2019-03-03 13:27:50 -08:00
Yanbo Liang
9fefa2128d [jvm-packages] Fix early stop with xgboost4j-spark (#4176)
* Fix early stop with xgboost4j-spark

* Update XGBoost.java

* Update XGBoost.java

* Update XGBoost.java

To use -Float.MAX_VALUE as the lower bound, in case there is positive metric.

* Only update best score if the current score is better (no update when equal)

* Update xgboost-spark tutorial to fix early stopping docs.
2019-03-01 13:02:57 -08:00
Nan Zhu
c055a32609
[jvm-packages]support multiple validation datasets in Spark (#3910)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* wrap iterators

* enable copartition training and validationset

* add parameters

* converge code path and have init unit test

* enable multi evals for ranking

* unit test and doc

* update example

* fix early stopping

* address the offline comments

* udpate doc

* test eval metrics

* fix compilation issue

* fix example
2018-12-17 21:03:57 -08:00
Philip Hyunsu Cho
583c88bce7 [jvm-packages] Require vanilla Apache Spark (#3854) 2018-11-01 19:15:40 -07:00
Nan Zhu
5fbe230636
[jvm-packages] documenting tracker (#3831)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* documenting tracker

* Make it a separate note
2018-10-25 18:53:46 -07:00
Nan Zhu
4ae225a08d
[Blocking][jvm-packages] fix the early stopping feature (#3808)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* fix scalastyle error

* temp

* add method for classifier and regressor

* update tutorial

* address the comments

* update
2018-10-23 14:53:13 -07:00
Philip Hyunsu Cho
4ed8a88240
Update Python API doc (#3619)
* Add XGBRanker to Python API doc

* Show inherited members of XGBRegressor in API doc, since XGBRegressor uses default methods from XGBModel

* Add table of contents to Python API doc

* Skip JVM doc download if not available

* Show inherited members for XGBRegressor and XGBRanker

* Expose XGBRanker to Python XGBoost module directory

* Add docstring to XGBRegressor.predict() and XGBRanker.predict()

* Fix rendering errors in Python docstrings

* Fix lint
2018-08-22 18:59:30 -07:00
Philip Hyunsu Cho
aa4ee6a0e4
[BLOCKING] Adding JVM doc build to Jenkins CI (#3567)
* Adding Java/Scala doc build to Jenkins CI

* Deploy built doc to S3 bucket

* Build doc only for branches

* Build doc first, to get doc faster for branch updates

* Have ReadTheDocs download doc tarball from S3

* Update JVM doc links

* Put doc build commands in a script

* Specify Spark 2.3+ requirement for XGBoost4J-Spark

* Build GPU wheel without NCCL, to reduce binary size
2018-08-09 13:27:01 -07:00
Nan Zhu
31d1baba3d [jvm-packages] Tutorial of XGBoost4J-Spark (#3534)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* add new

* update doc

* finish Gang Scheduling

* more

* intro

* Add sections: Prediction, Model persistence and ML pipeline.

* Add XGBoost4j-Spark MLlib pipeline example

* partial finished version

* finish the doc

* adjust code

* fix the doc

* use rst

* Convert XGBoost4J-Spark tutorial to reST

* Bring XGBoost4J up to date

* add note about using hdfs

* remove duplicate file

* fix descriptions

* update doc

* Wrap HDFS/S3 export support as a note

* update

* wrap indexing_mode example in code block
2018-08-03 21:17:50 -07:00