xgboost

Author	SHA1	Message	Date
Bobby Wang	24e25802a7	[jvm-packages] Add Rapids plugin support (#7491 ) * Add GPU pre-processing pipeline.	2021-12-17 13:11:12 +08:00
Bobby Wang	24be04e848	[jvm-packages] Add DeviceQuantileDMatrix to Scala binding (#7459 )	2021-11-24 20:23:18 +08:00
Bobby Wang	7cfb310eb4	Rework transform (#7440 ) extract the common part of transform code from XGBoostClassifier and XGBoostRegressor	2021-11-18 15:48:57 +08:00
Jiaming Yuan	55ee272ea8	Extend array interface to handle ndarray. (#7434 ) * Extend array interface to handle ndarray. The `ArrayInterface` class is extended to support multi-dim array inputs. Previously this class handles only 2-dim (vector is also matrix). This PR specifies the expected dimension at compile-time and the array interface can perform various checks automatically for input data. Also, adapters like CSR are more rigorous about their input. Lastly, row vector and column vector are handled without intervention from the caller.	2021-11-16 09:52:15 +08:00
Bobby Wang	cb685607b2	[jvm-packages] Rework the train pipeline (#7401 ) 1. Add PreXGBoost to build RDD[Watches] from Dataset 2. Feed RDD[Watches] built from PreXGBoost to XGBoost to train	2021-11-10 17:51:38 +08:00
Bobby Wang	b81ebbef62	[jvm-packages] Fix json4s binary compatibility issue (#7376 ) Spark 3.2 depends on json4s 3.7.0-M11, which changed the signatures of some implicit functions. As a result, xgboost4j built against Spark 3.0/3.1 fails when saving the model.	2021-10-30 03:20:57 +08:00
nicovdijk	a6bcd54b47	[jvm-packages] Fix for space in sys.executable path in create_jni.py (#7358 )	2021-10-25 13:45:11 +08:00
nicovdijk	31a307cf6b	[XGBoost4J-Spark] Serialization for custom objective and eval (#7274 ) * added type hints to custom_obj and custom_eval for Spark persistence Co-authored-by: Bobby Wang <wbo4958@gmail.com>	2021-10-21 16:22:23 +08:00
nicovdijk	74bab6e504	Control logging for early stopping using shouldPrint() (#7326 )	2021-10-21 12:12:06 +08:00
Bobby Wang	4fd149b3a2	[jvm-packages] update checkstyle (#7335 ) * [jvm-packages] update scalastyle 1. bump scalastyle-maven-plugin and maven-checkstyle-plugin to latest 2. remove unused imports * fix code style check	2021-10-18 18:42:01 +08:00
Jiaming Yuan	f7caac2563	Bump version to 1.6.0 in master. (#7259 )	2021-10-07 16:09:26 +08:00
Jiaming Yuan	fbd58bf190	[jvm-packages] Create demo and test for xgboost4j early stopping. (#7252 )	2021-09-25 03:29:27 +08:00
Bobby Wang	0ee11dac77	[jvm-packages][xgboost4j-gpu] Support GPU dataframe and `DeviceQuantileDMatrix` (#7195 ) Following classes are added to support dataframe in java binding: - `Column` is an abstract type for a single column in tabular data. - `ColumnBatch` is an abstract type for dataframe. - `CuDFColumn` is an implementation of `Column` that consumes cuDF column - `CudfColumnBatch` is an implementation of `ColumnBatch` that consumes cuDF dataframe. - `DeviceQuantileDMatrix` is the interface for quantized data. The Java implementation mimics the Python interface and uses `__cuda_array_interface__` protocol for memory indexing. One difference is on JVM package, the data batch is staged on the host as java iterators cannot be reset. Co-authored-by: jiamingy <jm.yuan@outlook.com>	2021-09-24 14:25:00 +08:00
Jiaming Yuan	9f63d6fead	[jvm-packages] Deprecate constructors with implicit missing value. (#7225 )	2021-09-17 04:35:04 +08:00
Martin Petříček	46c46829ce	Fix model loading from stream (#7067 ) Fix bug introduced in `17913713b5` (allow loading from byte array) When loading model from stream, only last buffer read from the input stream is used to construct the model. This may work for models smaller than 1 MiB (if you are lucky enough to read the whole model at once), but will always fail if the model is larger.	2021-08-15 21:04:33 +08:00
Jiaming Yuan	7017dd5a26	[JVM-Packages] Use Python tracker in XGBoost for JVM package. (#7132 )	2021-07-27 16:20:42 +08:00
naveenkb	9f7f8b976d	[XGBoost4J-Spark] bestIteration and bestScore for early stopping (#7095 )	2021-07-19 18:46:49 +08:00
Jiaming Yuan	663136aa08	Implement feature score for linear model. (#7048 ) * Add feature score support for linear model. * Port R interface to the new implementation. * Add linear model support in Python. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2021-06-25 14:34:02 +08:00
ShvetsKS	57c732655e	Merge lossguide and depthwise strategies for CPU hist (#7007 ) * fix java/scala test: max depth is also valid parameter for lossguide Co-authored-by: Kirill Shvets <kirill.shvets@intel.com>	2021-06-03 01:49:43 +08:00
Adam Pocock	2320aa0da2	Making the Java library loader emit helpful error messages on missing dependencies. (#6926 )	2021-05-19 14:53:56 +08:00
Andrew Ziem	3e7e426b36	Fix spelling in documents (#6948 ) * Update roxygen2 doc. Co-authored-by: fis <jm.yuan@outlook.com>	2021-05-11 20:44:36 +08:00
Philip Hyunsu Cho	ec6ce08cd0	[jvm-packages] Make it easier to release GPU/CPU code artifacts to Maven Central (#6940 )	2021-05-04 14:00:03 -07:00
Jiaming Yuan	74b41637de	Revert "[jvm-packages] Add `XGBOOST_RABIT_TRACKER_IP_FOR_TEST` to set rabit tracker IP. (#6869 )" (#6886 ) This reverts commit `2828da3c4c`.	2021-04-21 11:20:10 -07:00
Bobby Wang	2828da3c4c	[jvm-packages] Add `XGBOOST_RABIT_TRACKER_IP_FOR_TEST` to set rabit tracker IP. (#6869 ) * Add `XGBOOST_RABIT_TRACKER_IP_FOR_TEST` to set rabit tracker IP * change spark and rabit tracker IP to 127.0.0.1 on GitHub Action. Co-authored-by: fis <jm.yuan@outlook.com>	2021-04-22 02:00:22 +08:00
Jiaming Yuan	146549260a	Bump version to 1.5.0 snapshot in master. (#6875 )	2021-04-22 01:53:44 +08:00
Bobby Wang	2c684ffd32	[jvm-packages] fix "key not found: train" issue (#6842 ) * [jvm-packages] fix "key not found: train" issue * fix bug	2021-04-18 23:28:39 -07:00
Viktor Szathmáry	b65e3c4444	[jvm] reduce scala-compiler, scalatest dependency scopes (#6730 ) * [jvm] reduce scala-compiler, scalatest dependency scopes * [jvm] workaround for GpuTestSuite scalatest dependency * scalatest scope tweak	2021-04-07 15:22:08 -07:00
Bobby Wang	49c22c23b4	[jvm-packages] fix early stopping doesn't work even without custom_eval setting (#6738 ) * [jvm-packages] fix early stopping doesn't work even without custom_eval setting * remove debug info * resolve comment	2021-03-06 20:19:40 -08:00
Honza Sterba	17913713b5	[jvm] Add ability to load booster direct from byte array (#6655 ) * Add ability to load booster direct from byte array * fix compiler error * move InputStream to byte-buffer conversion - move it from Booster to XGBoost facade class	2021-02-23 11:28:27 -08:00
Adam Pocock	fec66d033a	[jvm-packages] JVM library loader extensions (#6630 ) * [java] extending the library loader to use both OS and CPU architecture. * Simplifying create_jni.py's architecture detection. * Tidying up the architecture detection in create_jni.py	2021-01-25 15:51:39 +08:00
Bobby Wang	9d2832a3a3	fix potential TaskFailedListener's callback won't be called (#6612 ) there is possibility that onJobStart of TaskFailedListener won't be called, if the job is submitted before the other thread adds addSparkListener. detail can be found at https://github.com/dmlc/xgboost/pull/6019#issuecomment-760937628	2021-01-21 14:20:32 +08:00
Philip Hyunsu Cho	0d483cb7c1	Bump version to 1.4.0 snapshot in master (#6486 )	2020-12-10 07:38:08 -08:00
zhang_jf	cc581b3b6b	Misleading exception information: no such param of "allow_non_zero_missing" (#6418 )	2020-11-20 19:33:34 +08:00
Nan Zhu	4d1d5d4010	[jvm-packages] fix potential unit test suites aborted issue (#6373 ) * fix race condition * code cleaning rm pom.xml-e * clean again * fix compilation issue * recover * avoid using getOrCreate * interrupt zombie threads * safe guard * fix deadlock * Update SparkParallelismTracker.scala	2020-11-17 10:59:26 -08:00
Jiaming Yuan	dfac5f89e9	Group CLI demo into subdirectory. (#6258 ) CLI is not most developed interface. Putting them into correct directory can help new users to avoid it as most of the use cases are from a language binding.	2020-10-28 14:40:44 -07:00
Jiaming Yuan	b180223d18	Cleanup RABIT. (#6290 ) * Remove recovery and MPI speed tests. * Remove readme. * Remove Python binding. * Add checks in C API.	2020-10-27 08:48:22 +08:00
Jiaming Yuan	d61b628bf5	Remove RABIT CMake targets. (#6275 ) * Now it's built as part of libxgboost. * Set correct C API error in RABIT initialization and finalization. * Remove redundant message. * Guard the tracker print C API.	2020-10-27 01:30:20 +08:00
Jiaming Yuan	b5c2a47b20	Drop single point model recovery (#6262 ) * Pass rabit params in JVM package. * Implement timeout using poll timeout parameter. * Remove OOB data check.	2020-10-21 15:27:03 +08:00
dependabot[bot]	06e453ddf4	Bump junit from 4.11 to 4.13.1 in /jvm-packages/xgboost4j (#6230 ) Bumps [junit](https://github.com/junit-team/junit4) from 4.11 to 4.13.1. - [Release notes](https://github.com/junit-team/junit4/releases) - [Changelog](https://github.com/junit-team/junit4/blob/main/doc/ReleaseNotes4.11.md) - [Commits](https://github.com/junit-team/junit4/compare/r4.11...r4.13.1) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2020-10-13 19:46:19 -07:00
dependabot[bot]	b51a717deb	Bump junit from 4.11 to 4.13.1 in /jvm-packages/xgboost4j-gpu (#6233 ) Bumps [junit](https://github.com/junit-team/junit4) from 4.11 to 4.13.1. - [Release notes](https://github.com/junit-team/junit4/releases) - [Changelog](https://github.com/junit-team/junit4/blob/main/doc/ReleaseNotes4.11.md) - [Commits](https://github.com/junit-team/junit4/compare/r4.11...r4.13.1) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2020-10-13 19:44:56 -07:00
Philip Hyunsu Cho	c991eb612d	[jvm-packages] Fix up build for xgboost4j-gpu, xgboost4j-spark-gpu (#6216 ) * [CI] Clean up build for JVM packages * Use correct path for saving native lib * Fix groupId of maven-surefire-plugin * Fix stashing of xgboost4j_jar_gpu * [CI] Don't run xgboost4j-tester with GPU, since it doesn't use gpu_hist	2020-10-09 14:08:15 -07:00
Christian Lorentzen	cf4f019ed6	[Breaking] Change default evaluation metric for classification to logloss / mlogloss (#6183 ) * Change DefaultEvalMetric of classification from error to logloss * Change default binary metric in plugin/example/custom_obj.cc * Set old error metric in python tests * Set old error metric in R tests * Fix missed eval metrics and typos in R tests * Fix setting eval_metric twice in R tests * Add warning for empty eval_metric for classification * Fix Dask tests Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-10-02 12:06:47 -07:00
Nan Zhu	c932fb50a1	[jvm-packages]add xgboost4j-gpu/xgboost4j-spark-gpu module to facilitate release (#6136 ) * add xgboost4j-gpu/xgboost4j-spark-gpu module to facilitate release * Update pom.xml	2020-09-20 09:20:38 -07:00
Philip Hyunsu Cho	33577ef5d3	Add MAPE metric (#6119 )	2020-09-14 18:45:27 -07:00
Hristo Iliev	da61d9460b	[jvm-packages] Add getNumFeature method (#6075 ) * Add getNumFeature to the Java API * Add getNumFeature to the Scala API * Add unit tests for getNumFeature Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-09-07 20:57:46 -07:00
Bobby Wang	0e2d5669f6	[jvm-packages] cancel job instead of killing SparkContext (#6019 ) * cancel job instead of killing SparkContext This PR changes the default behavior that kills SparkContext. Instead, This PR cancels jobs when coming across task failed. That means the SparkContext is still alive even some exceptions happen. * add a parameter to control if killing SparkContext * cancel the jobs the failed task belongs to * remove the jobId from the map when one job failed. * resolve comments	2020-09-02 14:20:59 -07:00
Anthony D'Amato	ada964f16e	Clean the way deterministic partitioning is computed (#6033 ) We propose to only use the rowHashCode to compute the partitionKey, adding the FeatureValue hashCode does not bring more value and would make the computation slower. Even though a collision would appear at 0.2% with MurmurHash3 this is bearable for partitioning, this won't have any impact on the data balancing.	2020-08-30 14:38:23 -07:00
FelixYBW	3a990433f9	set maxBins to 256. Align with c code in src/tree/param.h (#6066 )	2020-08-28 15:06:11 +03:00
Philip Hyunsu Cho	9c14e430af	[CI] Improve JVM test in GitHub Actions (#5930 ) * [CI] Improve JVM test in GitHub Actions * Use env var for Wagon options [skip ci] * Move the retry flag to pom.xml [skip ci] * Export env var RABIT_MOCK to run Spark tests [skip ci] * Correct location of env var * Re-try up to 5 times [skip ci] * Don't run distributed training test on Windows * Fix typo * Update main.yml	2020-08-25 10:14:46 -07:00
Philip Hyunsu Cho	b3193052b3	Bump version to 1.3.0 snapshot in master (#6052 )	2020-08-23 17:13:46 -07:00

387 Commits