xgboost

Author	SHA1	Message	Date
Nan Zhu	ae3bb9c2d5	Distributed Fast Histogram Algorithm (#4011 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * init * allow hist algo * more changes * temp * update * remove hist sync * udpate rabit * change hist size * change the histogram * update kfactor * sync per node stats * temp * update * final * code clean * update rabit * more cleanup * fix errors * fix failed tests * enforce c++11 * fix lint issue * broadcast subsampled feature correctly * revert some changes * fix lint issue * enable monotone and interaction constraints * don't specify default for monotone and interactions * update docs	2019-02-05 05:12:53 -08:00
Nan Zhu	0d0ce32908	[jvm-packages] adding logs for parameters (#4091 )	2019-01-30 21:50:55 -08:00
Jiaming Yuan	301cef4638	Correct JVM CMake GPU flag. (#4071 )	2019-01-21 20:36:38 +08:00
KyleLi1985	dade7c3aff	[jvm-packages] Performance consideration and Alignment input parameter of repartition function (#4049 )	2019-01-07 08:38:05 -08:00
Nan Zhu	773ddbcfcb	[BLOCKING] fix the issue with infrequent feature (#4045 ) * fix the issue with infrequent feature * handle exception * use only 2 workers * address the comments	2019-01-06 16:01:03 -08:00
Nan Zhu	e290ec9a80	[jvm-packages] fix safe execution (#4046 )	2019-01-05 19:45:37 -08:00
Shayak Banerjee	431c850c03	[jvm-packages] Updates to Java Booster to support other feature importance measures (#3801 ) * Updates to Booster to support other feature importances * Add returns for Java methods * Pass Scala style checks * Pass Java style checks * Fix indents * Use class instead of enum * Return map string double * A no longer broken build, thanks to mvn package local build * Add a unit test to increase code coverage back * Address code review on main code * Add more unit tests for different feature importance scores * Address more CR	2019-01-02 01:13:14 -08:00
Nan Zhu	f368d0de2b	[jvm-packages] fix the scalability issue of prediction (#4033 )	2018-12-29 20:46:30 -08:00
Nan Zhu	c055a32609	[jvm-packages]support multiple validation datasets in Spark (#3910 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * wrap iterators * enable copartition training and validationset * add parameters * converge code path and have init unit test * enable multi evals for ranking * unit test and doc * update example * fix early stopping * address the offline comments * udpate doc * test eval metrics * fix compilation issue * fix example	2018-12-17 21:03:57 -08:00
Nan Zhu	9c4ff50e83	[jvm-packages]Fix early stopping condition (#3928 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * update version * 0.82 * fix early stopping condition * remove unused * update comments * udpate comments * update test	2018-11-24 00:18:07 -08:00
Huafeng Wang	42cac4a30b	[jvm-packages] Fix vector size of 'rawPredictionCol' in XGBoostClassificationModel (#3932 ) * Fix vector size of 'rawPredictionCol' in XGBoostClassificationModel * Fix UT	2018-11-23 21:09:43 -08:00
Philip Hyunsu Cho	86aac98e54	[jvm-packages] Fix #3898 : use correct group ID for maven-site-plugin (#3937 )	2018-11-23 09:46:27 -08:00
Nan Zhu	dc2bfbfde1	[jvm-packages] update version to 0.82-SNAPSHOT (#3920 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * update version * 0.82	2018-11-18 16:47:48 -08:00
Nan Zhu	aa48b7e903	[jvm-packages][refactor] refactor XGBoost.scala (spark) (#3904 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * wrap iterators * remove unused code * refactor * fix typo	2018-11-15 20:38:28 -08:00
ajing	0ddb8a7661	Update README.md (#3872 ) SparkWithDataFrame was not there anymore. So replace with SparkMLlibPipeline.scala	2018-11-12 11:03:13 -08:00
Philip Hyunsu Cho	78ec77fa97	Release 0.81 version (#3864 ) * Release 0.81 version * Update NEWS.md	2018-11-04 05:49:11 -08:00
Philip Hyunsu Cho	2febc105a4	[jvm-packages] Fix JVM doc build (#3853 ) To get around of the bug https://issues.apache.org/jira/browse/SUREFIRE-1588, set useSystemClassLoader=false.	2018-11-01 15:16:08 -07:00
Matthew Tovbin	d81fedb955	[jvm-packages] RabitTracker for Scala: allow specifying host ip from the xgboost-tracker.properties file (#3833 )	2018-10-26 22:01:36 -07:00
Nan Zhu	4ae225a08d	[Blocking][jvm-packages] fix the early stopping feature (#3808 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * temp * add method for classifier and regressor * update tutorial * address the comments * update	2018-10-23 14:53:13 -07:00
Philip Hyunsu Cho	e26b5d63b2	[jvm-packages] Upgrade Scala to 2.11.12 to address CVE-2017-15288 (#3816 ) A privilege escalation vulnerability (CVE-2017-15288) has been identified in the Scala compilation daemon. See https://nvd.nist.gov/vuln/detail/CVE-2017-15288 Fix: Upgrade Scala to 2.11.12.	2018-10-22 10:15:30 -07:00
weitian	9504f411c1	[jvm-packages] For training data with group, empty RDD partition threw exception (#3749 ) (#3750 )	2018-10-09 09:03:22 -07:00
Nan Zhu	785094db53	[jvm-packages] fix issue when spark job execution thread cannot return before we execute first() (#3758 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * sparjJobThread * update * fix issue when spark job execution thread cannot return before we execute first()	2018-10-05 22:20:50 -07:00
zengxy	9e73087324	[jvm-packages] support specified feature names when getModelDump and getFeatureScore (#3733 ) * [jvm-packages] support specified feature names for jvm when get ModelDump and get FeatureScore (#3725) * typo and style fix	2018-10-04 09:05:42 -07:00
weitian	efc4f85505	[jvm-packages] Fix #3489 : Spark repartitionForData can potentially shuffle all data and lose ordering required for ranking objectives (#3654 )	2018-10-03 08:43:55 -07:00
Sergei Lebedev	87aca8c244	[jvm-packages] Fixed the distributed updater check (#3739 ) The updater used in distributed training is grow_histmaker and not grow_colmaker as the error message stated prior to this commit.	2018-10-01 11:22:01 -07:00
Nan Zhu	79d854c695	[jvm-packages] fix errors in example (#3719 ) * add back train method but mark as deprecated * fix scalastyle error * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * instrumentation * use log console * better measurement * fix erros in example * update histmaker	2018-09-22 16:39:38 -07:00
Nan Zhu	aa53e9fc8d	[jvm-packages] bump spark version (#3709 )	2018-09-19 11:18:01 -07:00
Michael Mui	20a9e716bd	[jvm-packages] Fix "obj_type" error to enable custom objectives and evaluations (#3646 ) credits to @mmui	2018-09-14 12:06:33 -07:00
Jerry Lin	9acd549dc7	[jvm-packages] Add rank:ndcg and rank:map to Spark supported objectives (#3697 )	2018-09-13 09:51:24 -07:00
Chen Qin	42b108136f	[jvm-packages] bump flink version number (#3686 ) * bump flink version number * bump flink version number * add missing hadoop dependency	2018-09-13 09:33:09 -07:00
Nan Zhu	d1e75d615e	[jvm-packages] Remove copy paste error in test suite (#3692 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * remove copy paste error	2018-09-11 13:08:36 -07:00
Joseph Bradley	14a8b96476	[jvm-packages] xgboost-spark warning when Spark encryption is turned on (#3667 ) * added test, commented out right now * reinstated test * added fix for checking encryption settings * fix by using RDD conf * fix compilation * renamed conf * use SparkSession if available * fix message * nop * code review fixes	2018-09-10 14:21:01 -07:00
Matthew Tovbin	beab6e08dd	Remove println in jsonDecode (#3665 ) Following issue #3578	2018-09-07 15:47:26 -07:00
Nan Zhu	3261002099	[jvm-packages] throw ControlThrowable instead of InterruptedException (#3632 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * interrupted exception is not rethrown	2018-08-25 20:30:21 -07:00
Nan Zhu	4912c1f9c6	[jvm-packages] fix checkpoint save/load (#3614 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix update checkpoint func	2018-08-21 12:34:24 -07:00
Matthew Tovbin	b53a5a262c	[jvm-packages] getTreeLimit return type should be Int	2018-08-17 09:36:00 -07:00
Nan Zhu	73bd590a1d	[jvm-packages] add the missing scm urls (#3589 ) for some reason this part was missing in master branch????	2018-08-14 15:05:23 -07:00
Matthew Tovbin	2b7a1c5780	[jvm-packages] Avoid loosing precision when computing probabilities by converting to Double early (#3576 )	2018-08-13 14:05:07 -07:00
Matthew Tovbin	ce0f0568a6	Make sure 'thresholds' are considered when executing predict method (#3577 )	2018-08-13 14:04:47 -07:00
Philip Hyunsu Cho	6288f6d563	Update JVM packages version to 0.81-SNAPSHOT (#3584 )	2018-08-13 10:17:52 -07:00
Philip Hyunsu Cho	96826a3515	Release version 0.80 (#3541 ) * Up versions * Write release note for 0.80	2018-08-13 01:38:37 -07:00
Mathew	06ef4db4cc	Fix Spark 2.2 Support (Amending #3062 ) (#3325 ) This pull request amends the broken #3062 allow Spark 2.2 to work. Please note this won't work in Spark <=2.1 as sc.removeSparkListener was implemented in Spark 2.2. (So perhaps a more general method is better, although that is what was attempted in #3062) This PR fixes: #3208, #3151 and the discussion in #1927. I do find it strange that #3062 dose not work in Spark 2.2, it's probably due to some sort of public/private issue in the org.apache.spark.scheduler.LiveListenerBus class inheritance (In Spark itself). The error is: `java.lang.NoSuchMethodError: org.apache.spark.scheduler.LiveListenerBus.removeListener(Ljava/lang/Object;)V`	2018-08-12 18:35:20 -07:00
Matthew Tovbin	7300002516	[jvm-packages] Use treeLimit param in getTreeLimit (#3575 )	2018-08-10 09:38:58 -07:00
Philip Hyunsu Cho	aa4ee6a0e4	[BLOCKING] Adding JVM doc build to Jenkins CI (#3567 ) * Adding Java/Scala doc build to Jenkins CI * Deploy built doc to S3 bucket * Build doc only for branches * Build doc first, to get doc faster for branch updates * Have ReadTheDocs download doc tarball from S3 * Update JVM doc links * Put doc build commands in a script * Specify Spark 2.3+ requirement for XGBoost4J-Spark * Build GPU wheel without NCCL, to reduce binary size	2018-08-09 13:27:01 -07:00
Matthew Tovbin	bad76048d1	Eliminate use of System.out + proper error logging (#3572 )	2018-08-09 10:06:17 -07:00
Nan Zhu	1c08b3b2ea	[jvm-packages] enable predictLeaf/predictContrib/treeLimit in 0.8 (#3532 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * partial finish * no test * add test cases * add test cases * address comments * add test for regressor * fix typo	2018-08-07 14:01:18 -07:00
Philip Hyunsu Cho	4a429a7c4f	Add reg:tweedie to supported objectives in XGBoost4J-Spark (#3552 )	2018-08-05 07:42:59 -07:00
Nan Zhu	31d1baba3d	[jvm-packages] Tutorial of XGBoost4J-Spark (#3534 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * add new * update doc * finish Gang Scheduling * more * intro * Add sections: Prediction, Model persistence and ML pipeline. * Add XGBoost4j-Spark MLlib pipeline example * partial finished version * finish the doc * adjust code * fix the doc * use rst * Convert XGBoost4J-Spark tutorial to reST * Bring XGBoost4J up to date * add note about using hdfs * remove duplicate file * fix descriptions * update doc * Wrap HDFS/S3 export support as a note * update * wrap indexing_mode example in code block	2018-08-03 21:17:50 -07:00
Nan Zhu	6cf97b4eae	[jvm-packages] consider spark.task.cpus when controlling parallelism (#3530 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * consider spark.task.cpus when controlling parallelism * fix bug * fix conf setup * calculate requestedCores within ParallelismController * enforce spark.task.cpus = 1 * unify unit test case framework * enable spark ui	2018-07-31 06:19:45 -07:00
Nan Zhu	b546321c83	[jvm-packages] the current version of xgboost does not consider missing value in prediction (#3529 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * consider missing value in prediction * handle single prediction instance * fix type conversion	2018-07-30 14:16:24 -07:00

348 Commits