Rong Ou
30204b50fe
fix spark tests on machines with many cores ( #4634 )
2019-07-07 16:02:56 -07:00
Philip Hyunsu Cho
d333918f5e
[jvm-packages] Expose setMissing method in XGBoostClassificationModel / XGBoostRegressionModel ( #4643 )
2019-07-07 16:02:44 -07:00
Nan Zhu
abffbe014e
[jvm-packages] delete all constraints from spark layer about obj and eval metrics and handle error in jvm layer ( #4560 )
...
* temp
* prediction part
* remove supported*
* add for test
* fix param name
* add rabit
* update rabit
* return value of rabit init
* eliminate compilation warnings
* update rabit
* shutdown
* update rabit again
* check sparkcontext shutdown
* fix logic
* sleep
* fix tests
* test with relaxed threshold
* create new thread each time
* stop for job quitting
* update rabit
* update rabit
* update rabit
* update git modules
2019-06-27 08:47:37 -07:00
Jiaming Yuan
2f1319f273
Add rmsle metric and reg:squaredlogerror objective ( #4541 )
2019-06-11 05:48:27 +08:00
Jiaming Yuan
0ce300e73a
[jvm-packages] Add back reg:linear for scala. ( #4490 )
...
* Add back reg:linear for scala.
* Fix linter.
2019-05-23 15:02:08 -07:00
Philip Hyunsu Cho
515f5f5c47
[RFC] Version 0.90 release candidate ( #4475 )
...
* Release 0.90
* Add script to automatically generate acknowledgment
* Update NEWS.md
2019-05-20 01:02:44 -07:00
Shaochen Shi
18e4fc3690
[jvm-packages] Automatically set maximize_evaluation_metrics if not explicitly given in XGBoost4J-Spark ( #4446 )
...
* Automatically set maximize_evaluation_metrics if not explicitly given.
* When custom_eval is set, require maximize_evaluation_metrics.
* Update documents on early stop in XGBoost4J-Spark.
* Fix code error.
2019-05-09 12:49:44 -07:00
Xu Xiao
797ba8e72d
[jvm-packages] fix compatibility problem of spark version ( #4411 )
...
* fix compatibility problem of spark version on MissingValueHandlingSuite.scala
* call setHandleInvalid by runtime reflection
2019-04-30 09:13:05 -07:00
Nan Zhu
253fdd8a42
[jvm-packages] fix the split of input ( #4417 )
2019-04-29 18:52:40 -07:00
Nan Zhu
37dc82c3ff
[jvm-packages] allow partial evaluation of dataframe before prediction ( #4407 )
...
* allow partial evaluation of dataframe before prediction
* resume spark test
* comments
* Run unit tests after building JVM packages
2019-04-26 21:02:40 -07:00
Nan Zhu
995698b0cb
[BREAKING][jvm-packages] fix the non-zero missing value handling ( #4349 )
...
* fix the nan and non-zero missing value handling
* fix nan handling part
* add missing value
* Update MissingValueHandlingSuite.scala
* Update MissingValueHandlingSuite.scala
* stylistic fix
2019-04-26 11:10:33 -07:00
Xu Xiao
2d875ec019
[BLOCKING][jvm-packages] fix non-deterministic order within a partition (in the case of an upstream shuffle) on prediction ( #4388 )
...
* [jvm-packages][hot-fix] fix column mismatch caused by zip actions at XGBoostModel.transformInternal
* apply minibatch in prediction
* an iterator-compatible minibatch prediction
* regressor impl
* continuous working on mini-batch prediction of xgboost4j-spark
* Update Booster.java
2019-04-26 11:09:20 -07:00
Nan Zhu
65db8d0626
[jvm-packages] support spark 2.4 and compatibility test with previous xgboost version ( #4377 )
...
* bump spark version
* keep float.nan
* handle brokenly changed name/value
* add test
* add model files
* add model files
* update doc
2019-04-17 11:33:13 -07:00
Nan Zhu
ad4de0d718
[jvm-packages] handle NaN as missing value explicitly ( #4309 )
...
* handle nan
* handle nan explicitly
* make code better and handle sparse vector in spark
* Update XGBoostGeneralSuite.scala
2019-03-30 19:34:26 +08:00
Nan Zhu
45c89a6792
[jvm-packages] logging version number ( #4271 )
...
* print version number
* add property file
2019-03-21 18:24:29 +08:00
Nan Zhu
359ed9c5bc
[jvm-packages] add configuration flag to control whether to cache transformed training set ( #4268 )
...
* control whether to cache data
* uncache
2019-03-18 10:13:28 +08:00
Jiaming Yuan
29a1356669
Deprecate `reg:linear` in favor of `reg:squarederror`. ( #4267 )
...
* Deprecate `reg:linear` in favor of `reg:squarederror`.
* Replace the use of `reg:linear`.
* Replace the use of `silent`.
2019-03-17 17:55:04 +08:00
Shaochen Shi
224786f67f
[xgboost4j-spark] Allow setting the parameter "maxLeaves". ( #4226 )
...
* Allow setting the parameter "maxLeaves".
* Add "setMaxLeaves" to XGBoostRegressor.
2019-03-07 18:36:47 -08:00
Nan Zhu
5f34078fba
[jvm-packages] bump version for master ( #4209 )
...
* update version
* bump version
2019-03-04 23:12:24 -08:00
Rong Ou
d506a8bc63
[jvm-packages] add verbosity param ( #4138 )
2019-02-13 20:57:17 -08:00
Nan Zhu
c18a3660fa
Separate Depthwise and Lossguide growing policy in fast histogram ( #4102 )
...
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* init
* more changes
* temp
* update
* update rabit
* change the histogram
* update kfactor
* sync per node stats
* temp
* update
* final
* code clean
* update rabit
* more cleanup
* fix errors
* fix failed tests
* enforce c++11
* broadcast subsampled feature correctly
* init col
* temp
* col sampling
* fix histmatrix init
* fix col sampling
* remove cout
* fix out of bound access
* fix core dump
* remove core dump file
* disable test temporarily
* update
* add fid
* print perf data
* update
* revert some changes
* temp
* temp
* pass all tests
* bring back some tests
* recover some changes
* fix lint issue
* enable monotone and interaction constraints
* don't specify default for monotone and interactions
* recover column init part
* more recovery
* fix core dumps
* code clean
* revert some changes
* fix test compilation issue
* fix lint issue
* resolve compilation issue
* fix issues of lint caused by rebase
* fix stylistic changes and change variable names
* use regtree internal function
* modularize depth width
* address the comments
* fix failed tests
* wrap perf timers with class
* fix lint
* fix num_leaves count
* fix indention
* Update src/tree/updater_quantile_hist.cc
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.h
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.cc
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.cc
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.cc
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* Update src/tree/updater_quantile_hist.h
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
* merge
* fix compilation
2019-02-13 12:56:19 -08:00
Rong Ou
9b917cda4f
[jvm-packages] fix simple logic error :) ( #4128 )
...
@CodingCat
2019-02-11 21:47:30 -08:00
Nan Zhu
3320a52192
[jvm-packages] force use per-group weights in spark layer ( #4118 )
2019-02-10 05:38:03 +08:00
Rong Ou
2a9b085bc8
[jvm-packages] minor fix of params ( #4114 )
2019-02-08 00:21:59 -08:00
Nan Zhu
05243642bb
[jvm-packages] better fix for shutdown applications ( #4108 )
...
* intentionally failed task
* throw exception
* more
* stop sparkcontext directly
* stop from another thread
* new scope
* use a new thread
* daemon threads
* don't join the killer thread
* remove injected errors
* add comments
2019-02-07 09:02:17 -08:00
Nan Zhu
325b16bccd
[jvm-packages] fix return type of setEvalSets ( #4105 )
2019-02-06 11:00:29 -08:00
Nan Zhu
ae3bb9c2d5
Distributed Fast Histogram Algorithm ( #4011 )
...
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* init
* allow hist algo
* more changes
* temp
* update
* remove hist sync
* update rabit
* change hist size
* change the histogram
* update kfactor
* sync per node stats
* temp
* update
* final
* code clean
* update rabit
* more cleanup
* fix errors
* fix failed tests
* enforce c++11
* fix lint issue
* broadcast subsampled feature correctly
* revert some changes
* fix lint issue
* enable monotone and interaction constraints
* don't specify default for monotone and interactions
* update docs
2019-02-05 05:12:53 -08:00
Nan Zhu
0d0ce32908
[jvm-packages] adding logs for parameters ( #4091 )
2019-01-30 21:50:55 -08:00
KyleLi1985
dade7c3aff
[jvm-packages] Performance consideration and Alignment input parameter of repartition function ( #4049 )
2019-01-07 08:38:05 -08:00
Nan Zhu
773ddbcfcb
[BLOCKING] fix the issue with infrequent feature ( #4045 )
...
* fix the issue with infrequent feature
* handle exception
* use only 2 workers
* address the comments
2019-01-06 16:01:03 -08:00
Nan Zhu
e290ec9a80
[jvm-packages] fix safe execution ( #4046 )
2019-01-05 19:45:37 -08:00
Nan Zhu
f368d0de2b
[jvm-packages] fix the scalability issue of prediction ( #4033 )
2018-12-29 20:46:30 -08:00
Nan Zhu
c055a32609
[jvm-packages] support multiple validation datasets in Spark ( #3910 )
...
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* wrap iterators
* enable copartition of training and validation sets
* add parameters
* converge code path and have init unit test
* enable multi evals for ranking
* unit test and doc
* update example
* fix early stopping
* address the offline comments
* update doc
* test eval metrics
* fix compilation issue
* fix example
2018-12-17 21:03:57 -08:00
Huafeng Wang
42cac4a30b
[jvm-packages] Fix vector size of 'rawPredictionCol' in XGBoostClassificationModel ( #3932 )
...
* Fix vector size of 'rawPredictionCol' in XGBoostClassificationModel
* Fix UT
2018-11-23 21:09:43 -08:00
Nan Zhu
dc2bfbfde1
[jvm-packages] update version to 0.82-SNAPSHOT ( #3920 )
...
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* update version
* 0.82
2018-11-18 16:47:48 -08:00
Nan Zhu
aa48b7e903
[jvm-packages][refactor] refactor XGBoost.scala (spark) ( #3904 )
...
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* wrap iterators
* remove unused code
* refactor
* fix typo
2018-11-15 20:38:28 -08:00
Philip Hyunsu Cho
78ec77fa97
Release 0.81 version ( #3864 )
...
* Release 0.81 version
* Update NEWS.md
2018-11-04 05:49:11 -08:00
Nan Zhu
4ae225a08d
[Blocking][jvm-packages] fix the early stopping feature ( #3808 )
...
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* temp
* add method for classifier and regressor
* update tutorial
* address the comments
* update
2018-10-23 14:53:13 -07:00
weitian
9504f411c1
[jvm-packages] For training data with group, empty RDD partition threw exception ( #3749 ) ( #3750 )
2018-10-09 09:03:22 -07:00
Nan Zhu
785094db53
[jvm-packages] fix issue when spark job execution thread cannot return before we execute first() ( #3758 )
...
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* fix scalastyle error
* sparkJobThread
* update
* fix issue when spark job execution thread cannot return before we execute first()
2018-10-05 22:20:50 -07:00
weitian
efc4f85505
[jvm-packages] Fix #3489 : Spark repartitionForData can potentially shuffle all data and lose ordering required for ranking objectives ( #3654 )
2018-10-03 08:43:55 -07:00
Sergei Lebedev
87aca8c244
[jvm-packages] Fixed the distributed updater check ( #3739 )
...
The updater used in distributed training is grow_histmaker and not
grow_colmaker as the error message stated prior to this commit.
2018-10-01 11:22:01 -07:00
Michael Mui
20a9e716bd
[jvm-packages] Fix "obj_type" error to enable custom objectives and evaluations ( #3646 )
...
credits to @mmui
2018-09-14 12:06:33 -07:00
Jerry Lin
9acd549dc7
[jvm-packages] Add rank:ndcg and rank:map to Spark supported objectives ( #3697 )
2018-09-13 09:51:24 -07:00
Nan Zhu
d1e75d615e
[jvm-packages] Remove copy paste error in test suite ( #3692 )
...
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* remove copy paste error
2018-09-11 13:08:36 -07:00
Joseph Bradley
14a8b96476
[jvm-packages] xgboost-spark warning when Spark encryption is turned on ( #3667 )
...
* added test, commented out right now
* reinstated test
* added fix for checking encryption settings
* fix by using RDD conf
* fix compilation
* renamed conf
* use SparkSession if available
* fix message
* nop
* code review fixes
2018-09-10 14:21:01 -07:00
Matthew Tovbin
beab6e08dd
Remove println in jsonDecode ( #3665 )
...
Following issue #3578
2018-09-07 15:47:26 -07:00
Nan Zhu
3261002099
[jvm-packages] throw ControlThrowable instead of InterruptedException ( #3632 )
...
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* interrupted exception is not rethrown
2018-08-25 20:30:21 -07:00
Nan Zhu
4912c1f9c6
[jvm-packages] fix checkpoint save/load ( #3614 )
...
* add back train method but mark as deprecated
* add back train method but mark as deprecated
* fix scalastyle error
* fix scalastyle error
* fix update checkpoint func
2018-08-21 12:34:24 -07:00
Matthew Tovbin
b53a5a262c
[jvm-packages] getTreeLimit return type should be Int
2018-08-17 09:36:00 -07:00