12 Commits

Author SHA1 Message Date
Jiaming Yuan
010b8f1428 Revert "[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4876)" (#4965)
This reverts commit 86ed01c4bbecef66e1bc4d02fb13116bd6130fae.
2019-10-18 14:02:35 -07:00
Chen Qin
86ed01c4bb [jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4876)
* Expose sets of rabit configurations to spark layer
2019-10-18 15:07:31 -04:00
Xu Xiao
277e25797b [jvm-packages] refine numAliveCores method of SparkParallelismTracker (#4858)
* refine numAliveCores

* refine XGBoostToMLlibParams

* fix waitForCondition

* resolve conflicts

* Update SparkParallelismTracker.scala
2019-09-19 15:18:29 -07:00
Nan Zhu
abffbe014e [jvm-packages] delete all constraints from spark layer about obj and eval metrics and handle error in jvm layer (#4560)
* temp

* prediction part

* remove supported*

* add for test

* fix param name

* add rabit

* update rabit

* return value of rabit init

* eliminate compilation warnings

* update rabit

* shutdown

* update rabit again

* check sparkcontext shutdown

* fix logic

* sleep

* fix tests

* test with relaxed threshold

* create new thread each time

* stop for job quitting

* update rabit

* update rabit

* update rabit

* update git modules
2019-06-27 08:47:37 -07:00
Nan Zhu
05243642bb [jvm-packages] better fix for shutdown applications (#4108)
* intentionally failed task

* throw exception

* more

* stop sparkcontext directly

* stop from another thread

* new scope

* use a new thread

* daemon threads

* don't join the killer thread

* remove injected errors

* add comments
2019-02-07 09:02:17 -08:00
Nan Zhu
e290ec9a80 [jvm-packages] fix safe execution (#4046)
2019-01-05 19:45:37 -08:00
Nan Zhu
3261002099 [jvm-packages] throw ControlThrowable instead of InterruptedException (#3632)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* interrupted exception is not rethrown
2018-08-25 20:30:21 -07:00
Mathew
06ef4db4cc Fix Spark 2.2 Support (Amending #3062) (#3325)
This pull request amends the broken #3062 to allow Spark 2.2 to work.

Please note this won't work in Spark <=2.1 as sc.removeSparkListener was implemented in Spark 2.2. (So perhaps a more general method is better, although that is what was attempted in #3062)

This PR fixes: #3208, #3151 and the discussion in #1927.

I do find it strange that #3062 does not work in Spark 2.2; it's probably due to some sort of public/private issue in the org.apache.spark.scheduler.LiveListenerBus class inheritance (in Spark itself). The error is: `java.lang.NoSuchMethodError: org.apache.spark.scheduler.LiveListenerBus.removeListener(Ljava/lang/Object;)V`
2018-08-12 18:35:20 -07:00
Nan Zhu
6cf97b4eae [jvm-packages] consider spark.task.cpus when controlling parallelism (#3530)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* consider spark.task.cpus when controlling parallelism

* fix bug

* fix conf setup

* calculate requestedCores within ParallelismController

* enforce spark.task.cpus = 1

* unify unit test case framework

* enable spark ui
2018-07-31 06:19:45 -07:00
tomasatdatabricks
5ef684641b Fixed SparkParallelTracker to work with Spark2.3 (#3062)
2018-01-25 04:31:38 +01:00
Nan Zhu
005a4a5e47 [jvm-packages] fix numAliveCores in SparkParallelismTracker when WebUI is disabled (#2990)
* add back train method but mark as deprecated

* add back train method but mark as deprecated

* fix scalastyle error

* fix scalastyle error

* update resource files

* Update SparkParallelismTracker.scala

* remove xgboost-tracker.properties
2017-12-29 19:22:58 -08:00
Yun Ni
b678e1711d [jvm-packages] Add SparkParallelismTracker to prevent job from hanging (#2697)
* Add SparkParallelismTracker to prevent job from hanging

* Code review comments

* Code Review Comments

* Fix unit tests

* Changes and unit test to catch the corner case.

* Update documentations

* Small improvements

* cancelAllJobs is problematic with scalatest. Remove it

* Code Review Comments

* Check number of executor cores beforehand, and throw exception if any core is lost.

* Address CR Comments

* Add missing class

* Fix flaky unit test

* Address CR comments

* Remove redundant param for TaskFailedListener
2017-10-16 20:18:47 -07:00