xgboost

Author	SHA1	Message	Date
Bobby Wang	6275cdc486	[jvm-packages] add format option when saving a model (#7940 )	2022-05-30 15:49:59 +08:00
Bobby Wang	fbc3d861bb	[jvm-packages] remove default parameters (#7938 )	2022-05-28 10:31:19 +08:00
Daniel Clausen	755d9d4609	[JVM-Packages] Auto-detection of MUSL is replaced by system properties (#7921 ) This PR removes auto-detection of MUSL-based Linux systems in favor of system properties the user can set to configure a specific path for a native library.	2022-05-26 10:53:15 +08:00
Bobby Wang	5ef33adf68	[jvm-packages] set the correct objective if user doesn't explicitly set it (#7781 )	2022-05-18 14:05:18 +08:00
Bobby Wang	b41cf92dc2	[jvm-packages] move dmatrix building into rabit context for cpu pipeline (#7908 )	2022-05-17 14:52:25 +08:00
Bobby Wang	11e46e4bc0	[Breaking][jvm-packages] make classification model be xgboost-compatible (#7896 )	2022-05-14 15:43:05 +08:00
Bobby Wang	9fa7ed1743	[Breaking][jvm-packages] remove timeoutRequestWorkers parameter (#7839 )	2022-05-13 16:26:25 +08:00
Michael Allman	f7db16add1	Ignore all Java exceptions when looking for Linux musl support (#7844 )	2022-04-28 15:44:30 +08:00
Bobby Wang	a94e1b172e	[jvm-packages] Fix model compatibility (#7845 )	2022-04-28 02:05:38 +08:00
Bobby Wang	686caad40c	[jvm-package] remove the coalesce in barrier mode (#7846 )	2022-04-27 23:34:22 +08:00
Bobby Wang	dc2e699656	[Breaking][jvm-packages] Use barrier execution mode (#7836 ) With the introduction of the barrier execution mode, we don't need to kill SparkContext when some xgboost tasks fail. Instead, Spark will handle the errors for us. So in this PR, the `killSparkContextOnWorkerFailure` parameter is deleted.	2022-04-25 17:09:52 +08:00
Bobby Wang	c45665a55a	[jvm-packages] move the dmatrix building into rabit context (#7823 ) This fixes the QuantileDeviceDMatrix in distributed environment.	2022-04-23 00:06:50 +08:00
Bobby Wang	2d83b2ad8f	[jvm-packages] add hostIp and python exec for rabit tracker (#7808 )	2022-04-15 16:28:43 +08:00
dependabot[bot]	1bb1913811	Bump hadoop-common from 2.10.1 to 3.2.3 in /jvm-packages/xgboost4j-flink (#7801 ) Bumps hadoop-common from 2.10.1 to 3.2.3. --- updated-dependencies: - dependency-name: org.apache.hadoop:hadoop-common dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-04-13 22:24:44 +08:00
Bobby Wang	3f536b5308	[jvm-packages] fix evaluation when featuresCols is used (#7798 )	2022-04-13 12:52:50 +08:00
Bobby Wang	118192f116	[jvm-packages] xgboost4j-spark should work when featuresCols is specified (#7789 )	2022-04-08 13:21:04 +08:00
Bobby Wang	729d227b89	[jvm-packages] remove the dep of com.fasterxml.jackson (#7791 )	2022-04-08 13:04:34 +08:00
Bobby Wang	2454407f3a	[jvm-packages] unify setFeaturesCol API for XGBoostRegressor (#7784 )	2022-04-05 13:35:33 +08:00
Jiaming Yuan	522636cb52	Bump version. (#7769 )	2022-03-31 06:33:22 +08:00
Oleksandr Pryimak	f5b20286e2	[jvm-packages] Launch dev jvm image under my user (#4676 ) Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2022-03-23 10:39:51 -07:00
Aging	f20ffa8db3	Update JVM dev build Dockerfile and shell script (#6792 ) Co-authored-by: Zhuo Yuzhen <yuzhuo@paypal.com>	2022-03-22 16:39:10 -07:00
Daniel Clausen	4dafb5fac8	[JVM-Packages] Add support for detecting musl-based Linux (#7624 ) Co-authored-by: Marc Philipp <marc@gradle.com>	2022-03-14 00:37:27 +08:00
Bobby Wang	89aa8ddf52	[jvm-packages] fix the prediction issue for multi:softmax (#7694 )	2022-02-24 01:09:45 +08:00
Bobby Wang	e3e6de5ed9	[jvm-packages] unify the set features API (#7692 ) xgboost4j-spark provides 2 sets of API for setting features, one for CPU, another for GPU, which may cause confusion. This PR removes the GPU API and adds an override CPU function setFeaturesCol to accept Array[String] parameters.	2022-02-23 03:37:25 +08:00
Bobby Wang	131858e7cb	[jvm-packages] Do not repartition when nWorker = 1 (#7676 )	2022-02-19 21:45:54 +08:00
dependabot[bot]	87c01f49d8	Bump hadoop-common from 2.7.3 to 2.10.1 in /jvm-packages/xgboost4j-flink (#7641 ) Bumps hadoop-common from 2.7.3 to 2.10.1. --- updated-dependencies: - dependency-name: org.apache.hadoop:hadoop-common dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-02-09 17:07:35 -08:00
Jiaming Yuan	ac7a36367c	[jvm-packages] Implement new `save_raw` in jvm-packages. (#7570 ) * New `toByteArray` that accepts a parameter for format.	2022-01-19 16:00:14 +08:00
Jiaming Yuan	001503186c	Rewrite approx (#7214 ) This PR rewrites the approx tree method to use codebase from hist for better performance and code sharing. The rewrite has many benefits: - Support for both `max_leaves` and `max_depth`. - Support for `grow_policy`. - Support for mono constraint. - Support for feature weights. - Support for easier bin configuration (`max_bin`). - Support for categorical data. - Faster performance for most of the datasets. (many times faster) - Support for prediction cache. - Significantly better performance for external memory. - Unites the code base between approx and hist.	2022-01-10 21:15:05 +08:00
Jiaming Yuan	ed95e77752	[jvm-packages] Update JNI header. (#7550 )	2022-01-10 14:59:40 +08:00
Bobby Wang	e8c1eb99e4	[jvm-package] Clean up the legacy gpu support tests (#7523 )	2021-12-21 09:15:51 +08:00
Bobby Wang	24e25802a7	[jvm-packages] Add Rapids plugin support (#7491 ) * Add GPU pre-processing pipeline.	2021-12-17 13:11:12 +08:00
Bobby Wang	24be04e848	[jvm-packages] Add DeviceQuantileDMatrix to Scala binding (#7459 )	2021-11-24 20:23:18 +08:00
Bobby Wang	7cfb310eb4	Rework transform (#7440 ) extract the common part of transform code from XGBoostClassifier and XGBoostRegressor	2021-11-18 15:48:57 +08:00
Jiaming Yuan	55ee272ea8	Extend array interface to handle ndarray. (#7434 ) * Extend array interface to handle ndarray. The `ArrayInterface` class is extended to support multi-dim array inputs. Previously this class handled only 2-dim (vector is also matrix). This PR specifies the expected dimension at compile-time and the array interface can perform various checks automatically for input data. Also, adapters like CSR are more rigorous about their input. Lastly, row vector and column vector are handled without intervention from the caller.	2021-11-16 09:52:15 +08:00
Bobby Wang	cb685607b2	[jvm-packages] Rework the train pipeline (#7401 ) 1. Add PreXGBoost to build RDD[Watches] from Dataset 2. Feed RDD[Watches] built from PreXGBoost to XGBoost to train	2021-11-10 17:51:38 +08:00
Bobby Wang	b81ebbef62	[jvm-packages] Fix json4s binary compatibility issue (#7376 ) Spark 3.2 depends on json4s 3.7.0-M11, which changed the signatures of some implicit functions. As a result, xgboost4j built against Spark 3.0/3.1 fails when saving the model.	2021-10-30 03:20:57 +08:00
nicovdijk	a6bcd54b47	[jvm-packages] Fix for space in sys.executable path in create_jni.py (#7358 )	2021-10-25 13:45:11 +08:00
nicovdijk	31a307cf6b	[XGBoost4J-Spark] Serialization for custom objective and eval (#7274 ) * added type hints to custom_obj and custom_eval for Spark persistence Co-authored-by: Bobby Wang <wbo4958@gmail.com>	2021-10-21 16:22:23 +08:00
nicovdijk	74bab6e504	Control logging for early stopping using shouldPrint() (#7326 )	2021-10-21 12:12:06 +08:00
Bobby Wang	4fd149b3a2	[jvm-packages] update checkstyle (#7335 ) * [jvm-packages] update scalastyle 1. bump scalastyle-maven-plugin and maven-checkstyle-plugin to latest 2. remove unused imports * fix code style check	2021-10-18 18:42:01 +08:00
Jiaming Yuan	f7caac2563	Bump version to 1.6.0 in master. (#7259 )	2021-10-07 16:09:26 +08:00
Jiaming Yuan	fbd58bf190	[jvm-packages] Create demo and test for xgboost4j early stopping. (#7252 )	2021-09-25 03:29:27 +08:00
Bobby Wang	0ee11dac77	[jvm-packages][xgboost4j-gpu] Support GPU dataframe and `DeviceQuantileDMatrix` (#7195 ) The following classes are added to support dataframe in java binding: - `Column` is an abstract type for a single column in tabular data. - `ColumnBatch` is an abstract type for dataframe. - `CuDFColumn` is an implementation of `Column` that consumes a cuDF column. - `CudfColumnBatch` is an implementation of `ColumnBatch` that consumes a cuDF dataframe. - `DeviceQuantileDMatrix` is the interface for quantized data. The Java implementation mimics the Python interface and uses the `__cuda_array_interface__` protocol for memory indexing. One difference is that in the JVM package the data batch is staged on the host, because Java iterators cannot be reset. Co-authored-by: jiamingy <jm.yuan@outlook.com>	2021-09-24 14:25:00 +08:00
Jiaming Yuan	9f63d6fead	[jvm-packages] Deprecate constructors with implicit missing value. (#7225 )	2021-09-17 04:35:04 +08:00
Martin Petříček	46c46829ce	Fix model loading from stream (#7067 ) Fix bug introduced in 17913713b554d820a8ce94226d854b4a5f1d8bbc (allow loading from byte array). When loading a model from a stream, only the last buffer read from the input stream is used to construct the model. This may work for models smaller than 1 MiB (if you are lucky enough to read the whole model at once), but will always fail if the model is larger.	2021-08-15 21:04:33 +08:00
Jiaming Yuan	7017dd5a26	[JVM-Packages] Use Python tracker in XGBoost for JVM package. (#7132 )	2021-07-27 16:20:42 +08:00
naveenkb	9f7f8b976d	[XGBoost4J-Spark] bestIteration and bestScore for early stopping (#7095 )	2021-07-19 18:46:49 +08:00
Jiaming Yuan	663136aa08	Implement feature score for linear model. (#7048 ) * Add feature score support for linear model. * Port R interface to the new implementation. * Add linear model support in Python. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2021-06-25 14:34:02 +08:00
ShvetsKS	57c732655e	Merge lossguide and depthwise strategies for CPU hist (#7007 ) * fix java/scala test: max depth is also valid parameter for lossguide Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>	2021-06-03 01:49:43 +08:00
Adam Pocock	2320aa0da2	Making the Java library loader emit helpful error messages on missing dependencies. (#6926 )	2021-05-19 14:53:56 +08:00

567 Commits