xgboost

Author	SHA1	Message	Date
dependabot[bot]	b51a717deb	Bump junit from 4.11 to 4.13.1 in /jvm-packages/xgboost4j-gpu (#6233 ) Bumps [junit](https://github.com/junit-team/junit4) from 4.11 to 4.13.1. - [Release notes](https://github.com/junit-team/junit4/releases) - [Changelog](https://github.com/junit-team/junit4/blob/main/doc/ReleaseNotes4.11.md) - [Commits](https://github.com/junit-team/junit4/compare/r4.11...r4.13.1) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2020-10-13 19:44:56 -07:00
Philip Hyunsu Cho	c991eb612d	[jvm-packages] Fix up build for xgboost4j-gpu, xgboost4j-spark-gpu (#6216 ) * [CI] Clean up build for JVM packages * Use correct path for saving native lib * Fix groupId of maven-surefire-plugin * Fix stashing of xgboost4j_jar_gpu * [CI] Don't run xgboost4j-tester with GPU, since it doesn't use gpu_hist	2020-10-09 14:08:15 -07:00
Christian Lorentzen	cf4f019ed6	[Breaking] Change default evaluation metric for classification to logloss / mlogloss (#6183 ) * Change DefaultEvalMetric of classification from error to logloss * Change default binary metric in plugin/example/custom_obj.cc * Set old error metric in python tests * Set old error metric in R tests * Fix missed eval metrics and typos in R tests * Fix setting eval_metric twice in R tests * Add warning for empty eval_metric for classification * Fix Dask tests Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-10-02 12:06:47 -07:00
Nan Zhu	c932fb50a1	[jvm-packages]add xgboost4j-gpu/xgboost4j-spark-gpu module to facilitate release (#6136 ) * add xgboost4j-gpu/xgboost4j-spark-gpu module to facilitate release * Update pom.xml	2020-09-20 09:20:38 -07:00
Philip Hyunsu Cho	33577ef5d3	Add MAPE metric (#6119 )	2020-09-14 18:45:27 -07:00
Hristo Iliev	da61d9460b	[jvm-packages] Add getNumFeature method (#6075 ) * Add getNumFeature to the Java API * Add getNumFeature to the Scala API * Add unit tests for getNumFeature Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-09-07 20:57:46 -07:00
Bobby Wang	0e2d5669f6	[jvm-packages] cancel job instead of killing SparkContext (#6019 ) * cancel job instead of killing SparkContext This PR changes the default behavior that kills SparkContext. Instead, This PR cancels jobs when coming across task failed. That means the SparkContext is still alive even some exceptions happen. * add a parameter to control if killing SparkContext * cancel the jobs the failed task belongs to * remove the jobId from the map when one job failed. * resolve comments	2020-09-02 14:20:59 -07:00
Anthony D'Amato	ada964f16e	Clean the way deterministic partitioning is computed (#6033 ) We propose to only use the rowHashCode to compute the partitionKey, adding the FeatureValue hashCode does not bring more value and would make the computation slower. Even though a collision would appear at 0.2% with MurmurHash3 this is bearable for partitioning, this won't have any impact on the data balancing.	2020-08-30 14:38:23 -07:00
FelixYBW	3a990433f9	set maxBins to 256. Align with c code in src/tree/param.h (#6066 )	2020-08-28 15:06:11 +03:00
Philip Hyunsu Cho	9c14e430af	[CI] Improve JVM test in GitHub Actions (#5930 ) * [CI] Improve JVM test in GitHub Actions * Use env var for Wagon options [skip ci] * Move the retry flag to pom.xml [skip ci] * Export env var RABIT_MOCK to run Spark tests [skip ci] * Correct location of env var * Re-try up to 5 times [skip ci] * Don't run distributed training test on Windows * Fix typo * Update main.yml	2020-08-25 10:14:46 -07:00
Philip Hyunsu Cho	b3193052b3	Bump version to 1.3.0 snapshot in master (#6052 )	2020-08-23 17:13:46 -07:00
Philip Hyunsu Cho	4729458a36	[jvm-packages] [doc] Update install doc for JVM packages (#6051 )	2020-08-23 14:14:53 -07:00
Anthony D'Amato	f58e41bad8	Fix deterministic partitioning with dataset containing Double.NaN (#5996 ) The functions featureValueOfSparseVector or featureValueOfDenseVector could return a Float.NaN if the input vector contained any missing values. This would make the partition key computation fail and most of the vectors would end up in the same partition. We fix this by avoiding returning a NaN and simply using the row HashCode in this case. We added a test to ensure that the repartition is indeed now uniform on input dataset containing values by checking that the partitions size variance is below a certain threshold. Signed-off-by: Anthony D'Amato <anthony.damato@hotmail.fr>	2020-08-18 18:55:37 -07:00
Jiaming Yuan	f93f1c03fc	Rabit update. (#5978 ) * Remove parameter on JVM Packages.	2020-08-11 09:17:32 +08:00
Shaochen Shi	71197d1dfa	[jvm-packages] Fix wrong method name `setAllowZeroForMissingValue`. (#5740 ) * Allow non-zero for missing value when training. * Fix wrong method names. * Add a unit test * Move the getter/setter unit test to MissingValueHandlingSuite Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-08-01 17:16:42 -07:00
Philip Hyunsu Cho	3fcfaad577	Add CMake flag to log C API invocations, to aid debugging (#5925 ) * Add CMake flag to log C API invocations, to aid debugging * Remove unnecessary parentheses	2020-07-30 19:24:28 -07:00
Jiaming Yuan	75b8c22b0b	Fix prediction heuristic (#5955 ) * Relax check for prediction. * Relax test in spark test. * Add tests in C++.	2020-07-29 19:24:07 +08:00
Bobby Wang	8943eb4314	[BLOCKING] [jvm-packages] add gpu_hist and enable gpu scheduling (#5171 ) * [jvm-packages] add gpu_hist tree method * change updater hist to grow_quantile_histmaker * add gpu scheduling * pass correct parameters to xgboost library * remove debug info * add use.cuda for pom * add CI for gpu_hist for jvm * add gpu unit tests * use gpu node to build jvm * use nvidia-docker * Add CLI interface to create_jni.py using argparse Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-07-26 21:53:24 -07:00
Philip Hyunsu Cho	487ab0ce73	[BLOCKING] Handle empty rows in data iterators correctly (#5929 ) * [jvm-packages] Handle empty rows in data iterators correctly * Fix clang-tidy error * last empty row * Add comments [skip ci] Co-authored-by: Nan Zhu <nanzhu@uber.com>	2020-07-25 13:46:19 -07:00
Philip Hyunsu Cho	627cf41a60	Add option to enable all compiler warnings in GCC/Clang (#5897 ) * Add option to enable all compiler warnings in GCC/Clang * Fix -Wall for CUDA sources * Make -Wall private req for xgboost-r	2020-07-21 23:34:03 -07:00
Jiaming Yuan	6c0c87216f	Fix Windows 2016 build. (#5902 )	2020-07-18 05:50:17 +08:00
Bobby Wang	9f85e92602	[jvm-packages] update spark dependency to 3.0.0 (#5836 )	2020-07-12 20:58:30 -07:00
Zhang Zhang	1813804e36	Add new parameter singlePrecisionHistogram to xgboost4j-spark (#5811 ) Expose the existing 'singlePrecisionHistogram' param to the Spark layer.	2020-07-08 16:29:35 -07:00
Philip Hyunsu Cho	0d411b0397	[CI] Simplify CMake build with modern CMake techniques (#5871 ) * [CI] Simplify CMake build * Make sure that plugins can be built * [CI] Install lz4 on Mac	2020-07-08 04:23:24 -07:00
anttisaukko	1bcbe1fc14	Bump com.esotericsoftware to 4.0.2 (#5690 ) Co-authored-by: Antti Saukko <antti.saukko@verizonmedia.com>	2020-06-13 21:06:14 -07:00
Philip Hyunsu Cho	073b625bde	Bump version to 1.2.0 snapshot in master (#5733 )	2020-05-31 00:11:34 -07:00
Andy Adinets	646def51e0	C++14 for xgboost (#5664 )	2020-05-21 12:26:40 +12:00
Jiaming Yuan	dd9aeb60ae	[JVM Packages] Catch dmlc error by ref. (#5678 )	2020-05-19 13:00:12 +08:00
Liang-Chi Hsieh	397d8f0ee7	[jvm-packages] XGBoost Spark should deal with NaN when parsing evaluation output (#5546 )	2020-04-19 23:10:30 -07:00
Philip Hyunsu Cho	1b1969f20d	[jvm-packages] [CI] Create a Maven repository to host SNAPSHOT JARs (#5533 )	2020-04-14 19:33:32 -07:00
Liang-Chi Hsieh	449ab79e0c	[CI] Use devtoolset-6 because devtoolset-4 is EOL and no longer available (#5506 ) * Use devtoolset-6. * [CI] Use devtoolset-6 because devtoolset-4 is EOL and no longer available * CUDA 9.0 doesn't work with devtoolset-6; use devtoolset-4 for GPU build only Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-04-11 19:49:06 -07:00
Bobby Wang	ad826e913f	[jvm-packages]add feature size for LabelPoint and DataBatch (#5303 ) * fix type error * Validate number of features. * resolve comments * add feature size for LabelPoint and DataBatch * pass the feature size to native * move feature size validating tests into a separate suite * resolve comments Co-authored-by: fis <jm.yuan@outlook.com>	2020-04-07 16:49:52 -07:00
Jiaming Yuan	f2b8cd2922	Add number of columns to native data iterator. (#5202 ) * Change native data iter into an adapter.	2020-02-25 23:42:01 +08:00
Philip Hyunsu Cho	7ac7e8778f	Port patches from 1.0.0 branch (#5336 ) * Remove f-string, since it's not supported by Python 3.5 (#5330) * Remove f-string, since it's not supported by Python 3.5 * Add Python 3.5 to CI, to ensure compatibility * Remove duplicated matplotlib * Show deprecation notice for Python 3.5 * Fix lint * Fix lint * Fix a unit test that mistook MINOR ver for PATCH ver * Enforce only major version in JSON model schema * Bump version to 1.1.0-SNAPSHOT	2020-02-21 13:13:21 -08:00
Jiaming Yuan	9f77c18b0d	Add JVM_CHECK_CALL. (#5199 ) * Added a check call macro in jvm package, prevents executing other functions from jvm when error occurred in XGBoost. For example, when prediction fails jvm should not try to allocate memory based on the output prediction size.	2020-02-18 11:10:55 +08:00
Nan Zhu	d7b45fbcaf	[jvm-packages] do not use multiple jobs to make checkpoints (#5082 ) * temp * temp * tep * address the comments * fix stylistic issues * fix * external checkpoint	2020-02-01 19:36:39 -08:00
Kodi Arfer	f100b8d878	[Breaking] Don't drop trees during DART prediction by default (#5115 ) * Simplify DropTrees calling logic * Add `training` parameter for prediction method. * [Breaking]: Add `training` to C API. * Change for R and Python custom objective. * Correct comment. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu> Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-01-13 21:48:30 +08:00
Jiaming Yuan	7b65698187	Enforce correct data shape. (#5191 ) * Fix syncing DMatrix columns. * notes for tree method. * Enable feature validation for all interfaces except for jvm. * Better tests for boosting from predictions. * Disable validation on JVM.	2020-01-13 15:48:17 +08:00
Philip Hyunsu Cho	74f545bde3	[CI] Repair download URL for Maven 3.6.1 (#5139 )	2019-12-20 10:07:40 +08:00
Philip Hyunsu Cho	37fdfa03f8	[jvm-packages] Comply with scala style convention + fix broken unit test (#5134 ) * Fix scala style check * fix messed unit test	2019-12-18 17:26:58 -08:00
cpfarrell	bc9d88259f	[jvm-packages] Allow for bypassing spark missing value check (#4805 ) * Allow for bypassing spark missing value check * Update documentation for dealing with missing values in spark xgboost	2019-12-18 10:48:20 -08:00
Chen Qin	b29b8c2f34	[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4966 ) * [phase 1] expose sets of rabit configurations to spark layer * add back mutable import * disable ring_mincount till https://github.com/dmlc/rabit/pull/106d * Revert "disable ring_mincount till https://github.com/dmlc/rabit/pull/106d" This reverts commit 65e95a98e24f5eb53c6ba9ef9b2379524258984d. * apply latest rabit * fix build error * apply https://github.com/dmlc/xgboost/pull/4880 * downgrade cmake in rabit * point to rabit with DMLC_ROOT fix * relative path of rabit install prefix * split rabit parameters to another trait * misc * misc * Delete .classpath * Delete .classpath * Delete .classpath * Update XGBoostClassifier.scala * Update XGBoostRegressor.scala * Update GeneralParams.scala * Update GeneralParams.scala * Update GeneralParams.scala * Update GeneralParams.scala * Delete .classpath * Update RabitParams.scala * Update .gitignore * Update .gitignore * apply rabitParams to training * use string as rabit parameter value type * cleanup * add rabitEnv check * point to dmlc/rabit * per feedback * update private scope * misc * update rabit * add rabit_timtout, fix failing test. * split tests * allow build jvm with rabit mock * pass mock failures to rabit with test * add mock error and graceful handle rabit assertion error test * split mvn test * remove sign for test * update rabit * build jvm_packages with rabit mock * point back to dmlc/rabit * per feedback, update scala header * cleanup pom * per feedback * try fix lint * fix lint * per feedback, remove bootstrap_cache * per feedback 2 * try replace dev profile with passing mvn property * fix build error * remove mvn property and replace with env setting to build test jar * per feedback * revert copyright headlines, point to dmlc/rabit * revert python lint * remove multiple failure test case as retry is not enabled in spark * Update core.py * Update core.py * per feedback, style fix	2019-11-01 14:21:19 -07:00
Jiaming Yuan	010b8f1428	Revert "[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4876 )" (#4965 ) This reverts commit `86ed01c4bb`.	2019-10-18 14:02:35 -07:00
Chen Qin	86ed01c4bb	[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4876 ) * Expose sets of rabit configurations to spark layer	2019-10-18 15:07:31 -04:00
Jiaming Yuan	31030a8d3a	Set correct file permission. (#4964 )	2019-10-18 12:54:29 -04:00
Liangcai Li	82ee2317e8	Add case for LongParam. (#4885 ) To support specifying long parameter as String, the same as other basic type, such as Int, Double ...	2019-09-25 05:41:53 -07:00
Nan Zhu	fc8c9b0521	[jvm-packages] enable deterministic repartitioning when checkpoint is enabled (#4807 ) * do repartitioning in DataUtil * keep previous behavior of partitioning without checkpoint * deterministic repartitioning * change	2019-09-19 15:21:05 -07:00
Xu Xiao	277e25797b	[jvm-packages] refine numAliveCores method of SparkParallelismTracker (#4858 ) * refine numAliveCores * refine XGBoostToMLlibParams * fix waitForCondition * resolve conflicts * Update SparkParallelismTracker.scala	2019-09-19 15:18:29 -07:00
Honza Sterba	22209b7b95	[jvm-packages] Add BigDenseMatrix (#4383 ) * Add BigDenseMatrix * ability to create DMatrix with bigger than Integer.MAX_VALUE size arrays * uses sun.misc.Unsafe * make DMatrix test work from a jar as well	2019-09-18 20:46:14 -07:00
Jiaming Yuan	d669ea1eaa	Deprecate set group (#4864 ) * Convert jvm package and R package. * Restore for compatibility.	2019-09-17 21:26:54 -04:00


348 Commits