xgboost

Author	SHA1	Message	Date
Hristo Iliev	da61d9460b	[jvm-packages] Add getNumFeature method (#6075 ) * Add getNumFeature to the Java API * Add getNumFeature to the Scala API * Add unit tests for getNumFeature Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2020-09-07 20:57:46 -07:00
Philip Hyunsu Cho	b3193052b3	Bump version to 1.3.0 snapshot in master (#6052 )	2020-08-23 17:13:46 -07:00
Philip Hyunsu Cho	3fcfaad577	Add CMake flag to log C API invocations, to aid debugging (#5925 ) * Add CMake flag to log C API invocations, to aid debugging * Remove unnecessary parentheses	2020-07-30 19:24:28 -07:00
Bobby Wang	8943eb4314	[BLOCKING] [jvm-packages] add gpu_hist and enable gpu scheduling (#5171 ) * [jvm-packages] add gpu_hist tree method * change updater hist to grow_quantile_histmaker * add gpu scheduling * pass correct parameters to xgboost library * remove debug info * add use.cuda for pom * add CI for gpu_hist for jvm * add gpu unit tests * use gpu node to build jvm * use nvidia-docker * Add CLI interface to create_jni.py using argparse Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2020-07-26 21:53:24 -07:00
Philip Hyunsu Cho	073b625bde	Bump version to 1.2.0 snapshot in master (#5733 )	2020-05-31 00:11:34 -07:00
Jiaming Yuan	dd9aeb60ae	[JVM Packages] Catch dmlc error by ref. (#5678 )	2020-05-19 13:00:12 +08:00
Liang-Chi Hsieh	397d8f0ee7	[jvm-packages] XGBoost Spark should deal with NaN when parsing evaluation output (#5546 )	2020-04-19 23:10:30 -07:00
Bobby Wang	ad826e913f	[jvm-packages]add feature size for LabelPoint and DataBatch (#5303 ) * fix type error * Validate number of features. * resolve comments * add feature size for LabelPoint and DataBatch * pass the feature size to native * move feature size validating tests into a separate suite * resolve comments Co-authored-by: fis <jm.yuan@outlook.com>	2020-04-07 16:49:52 -07:00
Jiaming Yuan	f2b8cd2922	Add number of columns to native data iterator. (#5202 ) * Change native data iter into an adapter.	2020-02-25 23:42:01 +08:00
Philip Hyunsu Cho	7ac7e8778f	Port patches from 1.0.0 branch (#5336 ) * Remove f-string, since it's not supported by Python 3.5 (#5330) * Remove f-string, since it's not supported by Python 3.5 * Add Python 3.5 to CI, to ensure compatibility * Remove duplicated matplotlib * Show deprecation notice for Python 3.5 * Fix lint * Fix lint * Fix a unit test that mistook MINOR ver for PATCH ver * Enforce only major version in JSON model schema * Bump version to 1.1.0-SNAPSHOT	2020-02-21 13:13:21 -08:00
Jiaming Yuan	9f77c18b0d	Add JVM_CHECK_CALL. (#5199 ) * Added a check call macro in jvm package, prevents executing other functions from jvm when error occurred in XGBoost. For example, when prediction fails jvm should not try to allocate memory based on the output prediction size.	2020-02-18 11:10:55 +08:00
Nan Zhu	d7b45fbcaf	[jvm-packages] do not use multiple jobs to make checkpoints (#5082 ) * temp * temp * tep * address the comments * fix stylistic issues * fix * external checkpoint	2020-02-01 19:36:39 -08:00
Kodi Arfer	f100b8d878	[Breaking] Don't drop trees during DART prediction by default (#5115 ) * Simplify DropTrees calling logic * Add `training` parameter for prediction method. * [Breaking]: Add `training` to C API. * Change for R and Python custom objective. * Correct comment. Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu> Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2020-01-13 21:48:30 +08:00
Jiaming Yuan	7b65698187	Enforce correct data shape. (#5191 ) * Fix syncing DMatrix columns. * notes for tree method. * Enable feature validation for all interfaces except for jvm. * Better tests for boosting from predictions. * Disable validation on JVM.	2020-01-13 15:48:17 +08:00
Chen Qin	b29b8c2f34	[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4966 ) * [phase 1] expose sets of rabit configurations to spark layer * add back mutable import * disable ring_mincount till https://github.com/dmlc/rabit/pull/106d * Revert "disable ring_mincount till https://github.com/dmlc/rabit/pull/106d" This reverts commit 65e95a98e24f5eb53c6ba9ef9b2379524258984d. * apply latest rabit * fix build error * apply https://github.com/dmlc/xgboost/pull/4880 * downgrade cmake in rabit * point to rabit with DMLC_ROOT fix * relative path of rabit install prefix * split rabit parameters to another trait * misc * misc * Delete .classpath * Delete .classpath * Delete .classpath * Update XGBoostClassifier.scala * Update XGBoostRegressor.scala * Update GeneralParams.scala * Update GeneralParams.scala * Update GeneralParams.scala * Update GeneralParams.scala * Delete .classpath * Update RabitParams.scala * Update .gitignore * Update .gitignore * apply rabitParams to training * use string as rabit parameter value type * cleanup * add rabitEnv check * point to dmlc/rabit * per feedback * update private scope * misc * update rabit * add rabit_timtout, fix failing test. * split tests * allow build jvm with rabit mock * pass mock failures to rabit with test * add mock error and graceful handle rabit assertion error test * split mvn test * remove sign for test * update rabit * build jvm_packages with rabit mock * point back to dmlc/rabit * per feedback, update scala header * cleanup pom * per feedback * try fix lint * fix lint * per feedback, remove bootstrap_cache * per feedback 2 * try replace dev profile with passing mvn property * fix build error * remove mvn property and replace with env setting to build test jar * per feedback * revert copyright headlines, point to dmlc/rabit * revert python lint * remove multiple failure test case as retry is not enabled in spark * Update core.py * Update core.py * per feedback, style fix	2019-11-01 14:21:19 -07:00
Jiaming Yuan	010b8f1428	Revert "[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4876 )" (#4965 ) This reverts commit 86ed01c4bbecef66e1bc4d02fb13116bd6130fae.	2019-10-18 14:02:35 -07:00
Chen Qin	86ed01c4bb	[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4876 ) * Expose sets of rabit configurations to spark layer	2019-10-18 15:07:31 -04:00
Jiaming Yuan	31030a8d3a	Set correct file permission. (#4964 )	2019-10-18 12:54:29 -04:00
Honza Sterba	22209b7b95	[jvm-packages] Add BigDenseMatrix (#4383 ) * Add BigDenseMatrix * ability to create DMatrix with bigger than Integer.MAX_VALUE size arrays * uses sun.misc.Unsafe * make DMatrix test work from a jar as well	2019-09-18 20:46:14 -07:00
Jiaming Yuan	d669ea1eaa	Deprecate set group (#4864 ) * Convert jvm package and R package. * Restore for compatibility.	2019-09-17 21:26:54 -04:00
Stephanie Yang	0fc7dcfe6c	Add public group getter for java and scala (#4838 ) * Add public group getter for java and scala * Remove unnecessary param from javadoc * Fix typo * Fix another typo * Add semicolon * Fix javadoc return statement * Fix missing return statement * Add a unit test	2019-09-09 10:07:48 -07:00
Oleksandr Pryimak	b68de018b8	[jvm-packages] jvm test should clean up after themselfs (#4706 )	2019-08-04 14:09:11 -07:00
Nan Zhu	1595e3f57b	upgrade version num (#4670 ) * upgrade version num * missign changes * fix version script * change versions * rm files * Update CMakeLists.txt	2019-07-17 15:25:35 -07:00
koertkuipers	3c506b076e	[jvm-packages] upgrade to Scala 2.12 (#4574 ) * bump scala to 2.12 which requires java 8 and also newer flink and akka * put scala version in artifactId * fix appveyor * fix for scaladoc issue that looks like https://github.com/scala/bug/issues/10509 * fix ci_build * update versions in generate_pom.py * fix generate_pom.py * apache does not have a download for spark 2.4.3 distro using scala 2.12 yet, so for now i use a tgz i put on s3 * Upload spark-2.4.3-bin-scala2.12-hadoop2.7.tgz to our own S3 * Update Dockerfile.jvm_cross * Update Dockerfile.jvm_cross	2019-07-16 08:43:34 -07:00
Oleksandr Pryimak	2973416f2e	[jvm-packages] Fix maven warnings (#4664 ) * exec plugin was missing a version * reportPlugins has been deprecated: see https://maven.apache.org/plugins/maven-site-plugin/maven-3.html#Classic_configuration_Maven_2__3	2019-07-15 20:25:43 -07:00
Nan Zhu	abffbe014e	[jvm-packages] delete all constraints from spark layer about obj and eval metrics and handle error in jvm layer (#4560 ) * temp * prediction part * remove supported* * add for test * fix param name * add rabit * update rabit * return value of rabit init * eliminate compilation warnings * update rabit * shutdown * update rabit again * check sparkcontext shutdown * fix logic * sleep * fix tests * test with relaxed threshold * create new thread each time * stop for job quitting * udpate rabit * update rabit * update rabit * update git modules	2019-06-27 08:47:37 -07:00
Nan Zhu	fe2de6f415	[jvm-packages]fix silly bug in feature scoring (#4604 )	2019-06-25 20:49:01 -07:00
Bryan Woods	278562db13	Add support for cross-validation using query ID (#4474 ) * adding support for matrix slicing with query ID for cross-validation * hail mary test of unrar installation for windows tests * trying to modify tests to run in Github CI * Remove dependency on wget and unrar * Save error log from R test * Relax assertion in test_training * Use int instead of bool in C function interface * Revise R interface * Add XGDMatrixSliceDMatrixEx and keep old XGDMatrixSliceDMatrix for API compatibility	2019-05-23 10:45:02 -07:00
Philip Hyunsu Cho	515f5f5c47	[RFC] Version 0.90 release candidate (#4475 ) * Release 0.90 * Add script to automatically generate acknowledgment * Update NEWS.md	2019-05-20 01:02:44 -07:00
Nan Zhu	37dc82c3ff	[jvm-packages] allow partial evaluation of dataframe before prediction (#4407 ) * allow partial evaluation of dataframe before prediction * resume spark test * comments * Run unit tests after building JVM packages	2019-04-26 21:02:40 -07:00
Philip Hyunsu Cho	ea850ecd20	[CI] Refactor Jenkins CI pipeline + migrate all Linux tests to Jenkins (#4401 ) * All Linux tests are now in Jenkins CI * Tests are now de-coupled from builds. We can now build XGBoost with one version of CUDA/JDK and test it with another version of CUDA/JDK * Builds (compilation) are significantly faster because 1) They use C5 instances with faster CPU cores; and 2) build environment setup is cached using Docker containers	2019-04-26 18:39:12 -07:00
Nan Zhu	995698b0cb	[BREAKING][jvm-packages] fix the non-zero missing value handling (#4349 ) * fix the nan and non-zero missing value handling * fix nan handling part * add missing value * Update MissingValueHandlingSuite.scala * Update MissingValueHandlingSuite.scala * stylistic fix	2019-04-26 11:10:33 -07:00
Jiaming Yuan	207f058711	Refactor CMake scripts. (#4323 ) * Refactor CMake scripts. * Remove CMake CUDA wrapper. * Bump CMake version for CUDA. * Use CMake to handle Doxygen. * Split up CMakeList. * Export install target. * Use modern CMake. * Remove build.sh * Workaround for gpu_hist test. * Use cmake 3.12. * Revert machine.conf. * Move CLI test to gpu. * Small cleanup. * Support using XGBoost as submodule. * Fix windows * Fix cpp tests on Windows * Remove duplicated find_package.	2019-04-15 10:08:12 -07:00
Adam Pocock	a448a8320c	[jvm-packages] Fixing the NativeLibLoader on Java 9+ (#4351 ) The old NativeLibLoader had a short-circuit load path which modified java.library.path and attempted to load the xgboost library from outside the jar first, falling back to loading the library from inside the jar. This path is a no-op every time when using XGBoost outside of it's source tree. Additionally it triggers an illegal reflective access warning in the module system in 9, 10, and 11. On Java 12 the ClassLoader fields are not accessible via reflection (separately from the illegal reflective acces warning), and so it fails in a way that isn't caught by the code which falls back to loading the library from inside the jar. This commit removes that code path and always loads the xgboost library from inside the jar file as it's a valid technique across multiple JVM implementations and works with all versions of Java.	2019-04-10 12:41:44 -07:00
Xu Xiao	60a9af567c	[jvm-packages] Add methods operating attributes of booster in jvm package, which follow API design in python package. (#4336 )	2019-04-08 11:00:35 -07:00
Rong Ou	7ea5b772fb	do not filter shared library files (#4303 )	2019-03-28 19:40:54 +08:00
Harry Braviner	b374e0a7ab	[jvm-packages] Allow supression of Rabit output in Booster::train in xgboost4j (#4262 ) * Make train in xgboost4j respect print params Previously no setting in params argument of Booster::train would prevent the Rabit.trackerPrint call. This can fill up a lot of screen space in the case that many folds are being trained. * Setting "silent" in this map to "true", "True", a non-zero integer, or a string that can be parsed to such an int will prevent printing. * Setting "verbose_eval" to "False" or "false" will prevent printing. * Setting "verbose_eval" to an int (or a String parseable to an int) n will result in printing every n steps, or no printing is n is zero. This is to match the python behaviour described here: https://www.kaggle.com/c/rossmann-store-sales/discussion/17499 * Fixed 'slient' typo in xgboost4j test * private access on two methods	2019-03-21 18:25:12 +08:00
Nan Zhu	45c89a6792	[jvm-packages] logging version number (#4271 ) * print version number * add property file	2019-03-21 18:24:29 +08:00
Christopher Suchanek	ac3d03089b	[jvm-packages] remove shutdown of handler shutdown (#4224 )	2019-03-06 19:32:43 -08:00
Nan Zhu	5f34078fba	[jvm-packages] bump version for master (#4209 ) * update version * bump version	2019-03-04 23:12:24 -08:00
Yanbo Liang	9fefa2128d	[jvm-packages] Fix early stop with xgboost4j-spark (#4176 ) * Fix early stop with xgboost4j-spark * Update XGBoost.java * Update XGBoost.java * Update XGBoost.java To use -Float.MAX_VALUE as the lower bound, in case there is positive metric. * Only update best score if the current score is better (no update when equal) * Update xgboost-spark tutorial to fix early stopping docs.	2019-03-01 13:02:57 -08:00
Nan Zhu	1b7405f688	[jvm-packages] fix comments in objectiveTrait (#4174 )	2019-02-22 00:32:13 -08:00
Nan Zhu	c18a3660fa	Separate Depthwidth and Lossguide growing policy in fast histogram (#4102 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * init * more changes * temp * update * udpate rabit * change the histogram * update kfactor * sync per node stats * temp * update * final * code clean * update rabit * more cleanup * fix errors * fix failed tests * enforce c++11 * broadcast subsampled feature correctly * init col * temp * col sampling * fix histmastrix init * fix col sampling * remove cout * fix out of bound access * fix core dump remove core dump file * disbale test temporarily * update * add fid * print perf data * update * revert some changes * temp * temp * pass all tests * bring back some tests * recover some changes * fix lint issue * enable monotone and interaction constraints * don't specify default for monotone and interactions * recover column init part * more recovery * fix core dumps * code clean * revert some changes * fix test compilation issue * fix lint issue * resolve compilation issue * fix issues of lint caused by rebase * fix stylistic changes and change variable names * use regtree internal function * modularize depth width * address the comments * fix failed tests * wrap perf timers with class * fix lint * fix num_leaves count * fix indention * Update src/tree/updater_quantile_hist.cc Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com> * Update src/tree/updater_quantile_hist.h Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com> * Update src/tree/updater_quantile_hist.cc Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com> * Update src/tree/updater_quantile_hist.cc Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com> * Update src/tree/updater_quantile_hist.cc Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com> * Update src/tree/updater_quantile_hist.h Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com> * merge * fix compilation	2019-02-13 12:56:19 -08:00
Nan Zhu	ae3bb9c2d5	Distributed Fast Histogram Algorithm (#4011 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * init * allow hist algo * more changes * temp * update * remove hist sync * udpate rabit * change hist size * change the histogram * update kfactor * sync per node stats * temp * update * final * code clean * update rabit * more cleanup * fix errors * fix failed tests * enforce c++11 * fix lint issue * broadcast subsampled feature correctly * revert some changes * fix lint issue * enable monotone and interaction constraints * don't specify default for monotone and interactions * update docs	2019-02-05 05:12:53 -08:00
Shayak Banerjee	431c850c03	[jvm-packages] Updates to Java Booster to support other feature importance measures (#3801 ) * Updates to Booster to support other feature importances * Add returns for Java methods * Pass Scala style checks * Pass Java style checks * Fix indents * Use class instead of enum * Return map string double * A no longer broken build, thanks to mvn package local build * Add a unit test to increase code coverage back * Address code review on main code * Add more unit tests for different feature importance scores * Address more CR	2019-01-02 01:13:14 -08:00
Nan Zhu	c055a32609	[jvm-packages]support multiple validation datasets in Spark (#3910 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * wrap iterators * enable copartition training and validationset * add parameters * converge code path and have init unit test * enable multi evals for ranking * unit test and doc * update example * fix early stopping * address the offline comments * udpate doc * test eval metrics * fix compilation issue * fix example	2018-12-17 21:03:57 -08:00
Nan Zhu	9c4ff50e83	[jvm-packages]Fix early stopping condition (#3928 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * update version * 0.82 * fix early stopping condition * remove unused * update comments * udpate comments * update test	2018-11-24 00:18:07 -08:00
Nan Zhu	dc2bfbfde1	[jvm-packages] update version to 0.82-SNAPSHOT (#3920 ) * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * add back train method but mark as deprecated * fix scalastyle error * fix scalastyle error * fix scalastyle error * fix scalastyle error * update version * 0.82	2018-11-18 16:47:48 -08:00
Philip Hyunsu Cho	78ec77fa97	Release 0.81 version (#3864 ) * Release 0.81 version * Update NEWS.md	2018-11-04 05:49:11 -08:00
Matthew Tovbin	d81fedb955	[jvm-packages] RabitTracker for Scala: allow specifying host ip from the xgboost-tracker.properties file (#3833 )	2018-10-26 22:01:36 -07:00

133 Commits