xgboost

Author	SHA1	Message	Date
Nan Zhu	016ab89484	[jvm-packages] Parameter tuning tool for XGBoost (#1664)	2016-10-23 16:58:18 -04:00
Adam Pocock	445029bb82	[jvm-packages] XGBoost4j Windows fixes (#1639) * Changes for Mingw64 compilation to ensure long is a consistent size. Mainly impacts the Java API which would not compile, but there may be silent errors on Windows with large datasets before this patch (as long is 32-bits when compiled with mingw64 even in 64-bit mode). * Adding ifdefs to ensure it still compiles on MacOS * Makefile and create_jni.bat changes for Windows. * Switching XGDMatrixCreateFromCSREx JNI call to use size_t cast * Fixing lint error, adding profile switching to jvm-packages build to make create-jni.bat get called, adding myself to Contributors.Md	2016-10-18 08:35:25 -04:00
Nan Zhu	f5c776f64f	[jvm-packages] add apache maven repo url and bump up default spark version to 2.0.1 (#1650) * add apache maven repo url and bump up default spark version to 2.0.1	2016-10-13 08:55:03 -04:00
Nan Zhu	813a53882a	[jvm-packages] deprecate Flaky test (#1662) * deprecate flaky test	2016-10-13 07:21:24 -04:00
Nan Zhu	1673bcbe7e	[jvm-packages] separate classification and regression model and integrate with ML package (#1608)	2016-09-30 11:49:03 -04:00
Nan Zhu	37bc122c90	[jvm-packages] Robust dmatrix creation (#1613) * add back train method but mark as deprecated * robust matrix creation in jvm	2016-09-26 13:35:04 -04:00
reg.zhuce	3ee145b8dc	[jvm-packages] IndexOutOfBoundsException (#1589) ml.dmlc.xgboost4j.scala.spark.XGBoost.scala:51 values is empty when we meet it at first time, so values(0) throw an IndexOutOfBoundsException. It should be dVector.values(i) instead of values(i).	2016-09-20 09:13:47 -04:00
Xin Yin	7245145712	[jvm-packages] Fixed the sanity check for parameter 'nthread' against 'spark.task.cpus'. (#1582)	2016-09-16 11:31:35 -04:00
Nan Zhu	4ad648e856	[jvm-packages] predictLeaf with Dataframe (#1576) * add back train method but mark as deprecated * predictLeaf with Dataset * fix * fix	2016-09-15 06:15:47 -04:00
Nan Zhu	bb388cbb31	default eval func (#1574)	2016-09-14 13:26:16 -04:00
Nan Zhu	fb02797e2a	[jvm-packages] Integration with Spark Dataframe/Dataset (#1559) * bump up to scala 2.11 * framework of data frame integration * test consistency between RDD and DataFrame * order preservation * test order preservation * example code and fix makefile * improve type checking * improve APIs * user docs * work around travis CI's limitation on log length * adjust test structure * integrate with Spark -1 .x * spark 2.x integration * remove spark 1.x implementation but provide instructions on how to downgrade	2016-09-11 15:02:58 -04:00
Nan Zhu	6dabdd33e3	[jvm-packages] bump to next version (#1535) * bump to next version * fix * fix	2016-09-01 12:18:21 -04:00
Nan Zhu	7fb3fbf577	impose shuffle when creating training RDD (#1531)	2016-08-31 07:34:10 -04:00
Nan Zhu	3f198b9fef	[jvm-packages] allow training with missing values in xgboost-spark (#1525) * allow training with missing values in xgboost-spark * fix compilation error * fix bug	2016-08-29 21:45:49 -04:00
Nan Zhu	74db1e8867	[jvm-packages] remove APIs with DMatrix from xgboost-spark (#1519) * test consistency of prediction functions between DMatrix and RDD * remove APIs with DMatrix from xgboost-spark * fix compilation error in xgboost4j-example * fix test cases	2016-08-28 21:25:49 -04:00
Nan Zhu	6d65aae091	[jvm-packages] test consistency of prediction functions with DMatrix and RDD (#1518) * test consistency of prediction functions between DMatrix and RDD * fix the failed test cases	2016-08-28 20:27:03 -04:00
Nan Zhu	d7f79255ec	improve test of save/load model (#1515)	2016-08-27 17:16:22 -04:00
Nan Zhu	dc1125eb56	evaluation with RDD data (#1492)	2016-08-20 18:31:10 -04:00
Nan Zhu	582ee63e34	enable train multiple models by distinguishing stage IDs (#1493)	2016-08-20 16:37:07 -04:00
Nan Zhu	70432cac5b	make IEvaluation serializable (#1487)	2016-08-19 13:12:39 -04:00
Fangzhou	a8adf16228	fix bug: doing rabit call after finalize in spark prediction phase (#1420)	2016-07-28 23:11:20 -05:00
Earthson Lu	d29edc677c	fix #1377 spark-mllib scope: default => provided (#1381)	2016-07-20 23:10:49 -04:00
convexquad	313764b3be	Expose predictLeaf functionality in Scala XGBoostModel (#1351)	2016-07-12 06:55:24 -04:00
Rahul	f14c160f4f	[jvm-packages][xgboost4j-spark][Minor] Move sparkContext dependency from the XGBoostModel (#1335) * Move sparkContext dependency from the XGBoostModel * Update Spark example to declare SparkContext as implict	2016-07-08 06:43:33 -04:00
Muhammad Haseeb Tariq	7533191af7	Typos in README (#1326) * Inconsistency in libsvm formats * note on libsvm formats * typos in README * Update README.md * Update README.md * Update README.md	2016-07-03 15:14:35 -04:00
Muhammad Haseeb Tariq	14f9697025	Inconsistency in libsvm formats (#1325) * Inconsistency in libsvm formats * note on libsvm formats	2016-07-03 10:49:41 -07:00
Nan Zhu	bd5b07873e	[jvm-packages] create dmatrix with specified missing value (#1272) * create dmatrix with specified missing value * update dmlc-core * support for predict method in spark package repartitioning work around * add more elements to work around training set empty partition issue	2016-06-21 17:35:17 -04:00
Nan Zhu	c9a73fe2a9	explicitly throw exception when detecting empty partition in training dataset (#1281)	2016-06-15 16:03:37 -04:00
Nan Zhu	c6631ad2ed	specify spark version (#1224)	2016-05-24 18:19:32 -04:00
Nan Zhu	c85b9012c6	[jvm-packages] xgboost4j-spark external memory (#1219) * implement external memory support for XGBoost4J * remove extra space * enable external memory for prediction * update doc	2016-05-22 14:01:28 -04:00
CodingCat	d8535313eb	allow empty partitions	2016-03-23 12:30:06 -04:00
CodingCat	55ab1c6a22	adjust numWorkers for test	2016-03-18 10:34:36 -04:00
CodingCat	a31a978471	run native lib building command from maven	2016-03-16 16:47:08 -04:00
tqchen	90f7220736	[FLINK] remove nWorker from API	2016-03-14 16:18:35 -07:00
CodingCat	3a951d0ab8	getter of XGBoostModel	2016-03-14 07:26:51 -04:00
Nan Zhu	e3fa7753f5	Merge branch 'master' into master	2016-03-13 22:46:38 -04:00
CodingCat	6f92f1c117	update spark version to 1.6.1	2016-03-13 22:46:06 -04:00
CodingCat	f2ef958ebb	support kryo serialization	2016-03-13 11:55:14 -04:00
CodingCat	9011acf52b	jvm doc index	2016-03-13 09:20:51 -04:00
CodingCat	16b9e92328	force the user to set number of workers	2016-03-12 13:33:57 -05:00
CodingCat	5f441a29a8	set nthread to spark.task.cpus by default	2016-03-11 20:07:09 -05:00
CodingCat	a3b2e76230	update README for jvm-packages	2016-03-11 15:28:55 -05:00
CodingCat	400b1faecc	adjust the API signature as well as the docs	2016-03-11 15:22:44 -05:00
CodingCat	ab68a0ccc7	fix examples	2016-03-11 13:57:03 -05:00
CodingCat	aca0096b33	more updates for Flink more fix	2016-03-11 10:15:49 -05:00
CodingCat	43d7a85bc9	change the API name since we support not only HDFS and local file system	2016-03-11 10:05:32 -05:00
Shaform	6558ef3273	support different types of filesystems	2016-03-11 22:06:40 +08:00
CodingCat	51b0e7010c	fix create_jni sh	2016-03-11 08:46:44 -05:00
CodingCat	d47df5c1d8	allow the user to specify the worker number and avoid unnecessary shuffle	2016-03-10 06:58:30 -05:00
CodingCat	e0a3f1c000	nthread no larger than spark.task.cpus	2016-03-10 05:51:07 -05:00

100 Commits