xgboost

Author	SHA1	Message	Date
Philip Hyunsu Cho	bfddc2c42c	Make CMakeLists.txt compatible with CMake 3.3 (#4420 ) * Make CMakeLists.txt compatible with CMake 3.3; require CMake 3.11 for MSVC * Use CMake 3.12 when sanitizer is enabled * Disable funroll-loops for MSVC * Use cmake version in container name * Add missing arg * Fix egrep use in ci_build.sh * Display CMake version * Do not set OpenMP_CXX_LIBRARIES for MSVC * Use cmake_minimum_required()	2019-05-02 11:49:32 +08:00
Philip Hyunsu Cho	17df5fd296	Simplify bound checking in feature interaction constraints (#4428 )	2019-05-01 16:59:53 -07:00
Xu Xiao	4c74336384	Use feature interaction constraints to narrow search space for split candidates (#4341 ) * Use feature interaction constraints to narrow search space for split candidates. * fix clang-tidy broken at updater_quantile_hist.cc:535:3 * make const * fix * try to fix exception thrown in java_test * fix suspected mistake which cause EvaluateSplit error * try fix * Fix bug: feature ID and node ID swapped in argument * Rename CheckValidation() to CheckFeatureConstraint() for clarity * Do not create temporary vector validFeatures, to enable parallelism	2019-04-30 20:59:58 -07:00
Philip Hyunsu Cho	ba98e0cdf2	Add additional Python tests to test training under constraints (#4426 )	2019-04-30 18:23:39 -07:00
Rong Ou	eaab364a63	More explicit sharding methods for device memory (#4396 ) * Rename the Reshard method to Shard * Add a new Reshard method for sharding a vector that's already sharded	2019-05-01 11:47:22 +12:00
Xu Xiao	797ba8e72d	[jvm-packages] fix compatibility problem of spark version (#4411 ) * fix compatibility problem of spark version on MissingValueHandlingSuite.scala * call setHandleInvalid by runtime reflection	2019-04-30 09:13:05 -07:00
Nan Zhu	253fdd8a42	[jvm-packages] fix the split of input (#4417 )	2019-04-29 18:52:40 -07:00
tqchen	91c513a0c1	fix doc	2019-04-29 17:50:46 -07:00
Rory Mitchell	5e582b0fa7	Combine thread launches into single launch per tree for gpu_hist (#4343 ) * Combine thread launches into single launch per tree for gpu_hist algorithm. * Address deprecation warning * Add manual column sampler constructor * Turn off omp dynamic to get a guaranteed number of threads * Enable openmp in cuda code	2019-04-29 09:58:34 +12:00
Ravi Kalia	146e83f3b3	Fix typo in model.rst (#4393 )	2019-04-27 14:22:07 -07:00
Bozhao	5dfb27fb2d	Update demo readme's use case section with BentoML (#4400 )	2019-04-27 14:21:17 -07:00
Jiaming Yuan	77c03538b0	Fix node reuse. (#4404 ) * Reinitialize `_sindex` when reallocating a deleted node.	2019-04-27 13:03:23 +08:00
Nan Zhu	37dc82c3ff	[jvm-packages] allow partial evaluation of dataframe before prediction (#4407 ) * allow partial evaluation of dataframe before prediction * resume spark test * comments * Run unit tests after building JVM packages	2019-04-26 21:02:40 -07:00
Philip Hyunsu Cho	ea850ecd20	[CI] Refactor Jenkins CI pipeline + migrate all Linux tests to Jenkins (#4401 ) * All Linux tests are now in Jenkins CI * Tests are now de-coupled from builds. We can now build XGBoost with one version of CUDA/JDK and test it with another version of CUDA/JDK * Builds (compilation) are significantly faster because 1) They use C5 instances with faster CPU cores; and 2) build environment setup is cached using Docker containers	2019-04-26 18:39:12 -07:00
Nan Zhu	995698b0cb	[BREAKING][jvm-packages] fix the non-zero missing value handling (#4349 ) * fix the nan and non-zero missing value handling * fix nan handling part * add missing value * Update MissingValueHandlingSuite.scala * Update MissingValueHandlingSuite.scala * stylistic fix	2019-04-26 11:10:33 -07:00
Xu Xiao	2d875ec019	[BLOCKING][jvm-packages] fix non-deterministic order within a partition (in the case of an upstream shuffle) on prediction (#4388 ) * [jvm-packages][hot-fix] fix column mismatch caused by zip actions at XGBooostModel.transformInternal * apply minibatch in prediction * an iterator-compatible minibatch prediction * regressor impl * continuous working on mini-batch prediction of xgboost4j-spark * Update Booster.java	2019-04-26 11:09:20 -07:00
Philip Hyunsu Cho	503cc42f48	[CI] Fix Windows tests (#4403 ) * Install binary igraph * Include Graphviz in PATH	2019-04-25 20:25:43 -07:00
Rong Ou	2c61f02add	fix broken python test (#4395 )	2019-04-23 16:01:23 -07:00
Philip Hyunsu Cho	bbe0dbd7ec	Migrate pylint check to Python 3 (#4381 ) * Migrate lint to Python 3 * Fix lint errors * Use Miniconda3 to use Python 3.7 * Use latest pylint and astroid	2019-04-21 01:01:54 -07:00
James Lamb	5e97de6a41	fixed typos in R package docs (#4345 ) * fixed typos in R package docs * updated verbosity parameter in xgb.train docs	2019-04-21 15:54:11 +08:00
Nan Zhu	65db8d0626	[jvm-packages] support spark 2.4 and compatibility test with previous xgboost version (#4377 ) * bump spark version * keep float.nan * handle brokenly changed name/value * add test * add model files * add model files * update doc	2019-04-17 11:33:13 -07:00
Egor Smirnov	711397d645	Optimizations of pre-processing for 'hist' tree method (#4310 ) * optimizations for pre-processing * code cleaning * code cleaning * code cleaning after review * Apply suggestions from code review Co-Authored-By: SmirnovEgorRu <egor.smirnov@intel.com>	2019-04-16 17:36:19 -07:00
Jiaming Yuan	207f058711	Refactor CMake scripts. (#4323 ) * Refactor CMake scripts. * Remove CMake CUDA wrapper. * Bump CMake version for CUDA. * Use CMake to handle Doxygen. * Split up CMakeList. * Export install target. * Use modern CMake. * Remove build.sh * Workaround for gpu_hist test. * Use cmake 3.12. * Revert machine.conf. * Move CLI test to gpu. * Small cleanup. * Support using XGBoost as submodule. * Fix windows * Fix cpp tests on Windows * Remove duplicated find_package.	2019-04-15 10:08:12 -07:00
Jiaming Yuan	84d992babc	GPU multiclass metrics (#4368 ) * Port multi classes metrics to CUDA.	2019-04-15 17:47:47 +08:00
James Lamb	be7bc07ca3	added files from local R build to gitignore (#4346 )	2019-04-13 03:02:02 +08:00
James Lamb	edae664afb	[r-package] cut CI-time dependency on craigcitro/r-travis (fixes #4348 ) (#4353 ) * [r-package] cut CI-time dependency on craigcitro/r-travis (fixes #4348) * Install R * Install R on OSX * Remove gfortran symlink * Specify CRAN repo * added more R dependencies needed for testing * removed heavy R dependencies in CI * fixed bug in env var, removed unnecessary apt installs of R * fix to R installs	2019-04-12 00:22:48 -07:00
Rong Ou	f4521bf6aa	refactor tests to get rid of duplication (#4358 ) * refactor tests to get rid of duplication * address review comments	2019-04-12 00:21:48 -07:00
Xu Xiao	3078b5944d	add OpenMP option in CMakeLists.txt (#4339 )	2019-04-10 17:35:06 -07:00
Adam Pocock	a448a8320c	[jvm-packages] Fixing the NativeLibLoader on Java 9+ (#4351 ) The old NativeLibLoader had a short-circuit load path which modified java.library.path and attempted to load the xgboost library from outside the jar first, falling back to loading the library from inside the jar. This path is a no-op every time when using XGBoost outside of its source tree. Additionally it triggers an illegal reflective access warning in the module system in 9, 10, and 11. On Java 12 the ClassLoader fields are not accessible via reflection (separately from the illegal reflective access warning), and so it fails in a way that isn't caught by the code which falls back to loading the library from inside the jar. This commit removes that code path and always loads the xgboost library from inside the jar file as it's a valid technique across multiple JVM implementations and works with all versions of Java.	2019-04-10 12:41:44 -07:00
Jean-Francois Zinque	956e73f183	Fix matrix attributes not sliced (#4311 )	2019-04-10 11:14:44 -07:00
Jiaming Yuan	5c2575535f	Fix Histogram allocation. (#4347 ) * Fix Histogram allocation. nidx_map is cleared after `Reset`, but histogram data size isn't changed, hence histogram recycling is used in later iterations. After a reset (building a new tree), newly allocated nodes start from 0, while recycling always chooses the node with the smallest index, which happens to be our newly allocated node 0.	2019-04-10 19:21:26 +08:00
Rong Ou	81c1cd40ca	add a test for cpu predictor using external memory (#4308 ) * add a test for cpu predictor using external memory * allow different page size for testing	2019-04-10 13:25:10 +12:00
James Lamb	b72eab3e07	Added travis logo (#4344 )	2019-04-08 21:20:15 -07:00
Mayank Suman	360f25ec27	Added language classifier for python (#4327 ) * Added language classifier for python * Removed python2 language classifier * Fix formatting	2019-04-08 11:13:26 -07:00
Yang Yang	c7bc739ed2	Fix document about colsample_by* parameter (#4340 ) Correct the calculation mistake in colsample_by* example.	2019-04-08 11:10:04 -07:00
Xu Xiao	60a9af567c	[jvm-packages] Add methods operating attributes of booster in jvm package, which follow API design in python package. (#4336 )	2019-04-08 11:00:35 -07:00
Andy Adinets	9080bba815	C API example (#4333 )	2019-04-08 11:22:03 +12:00
Jiaxiang Li	2e052e74b6	Update CONTRIBUTORS.md (#4335 )	2019-04-05 10:45:23 -07:00
Jiaxiang Li	1ca5698221	Make the train and test input with same colnames. (#4329 ) Fix the bug reported in https://github.com/dmlc/xgboost/issues/4328. I am a beginner with Git, so I tried my best to follow the guide, https://xgboost.readthedocs.io/en/latest/contribute.html#r-package. I found there is no `dev` branch, so I pulled this fix from my master branch to the original master branch.	2019-04-04 15:59:27 -07:00
Philip Hyunsu Cho	70be1e38c2	[CI] Optimize external Docker build cache (#4334 ) * When building pull requests, use Docker cache for master branch Docker build caches are per-branch, so new pull requests will initially have no build cache, causing the Docker containers to be built from scratch. New pull requests should use the cache associated with the master branch. This makes sense, since most pull requests do not modify the Dockerfile. * Add comments	2019-04-04 15:59:07 -07:00
Philip Hyunsu Cho	37c75aac41	[CI] Add external Docker build cache (#4331 )	2019-04-04 13:36:39 -07:00
Jiaming Yuan	82dca3c108	Don't store DMatrix handle until it's initialized. (#4317 ) * Use a temporary variable to store the handle. * Decode c++ error message. * Simple note about saved binary.	2019-04-01 18:29:28 +08:00
sriramch	2f7087eba1	Improve HostDeviceVector exception safety (#4301 ) * make the assignments of HostDeviceVector exception safe. * storing a dummy GPUDistribution instance in HDV for CPU based code. * change testxgboost binary location to build directory.	2019-03-31 22:48:58 +08:00
Hajime Morrita	680a1b36f3	Get rid of a few trivial compiler warnings. (#4312 )	2019-03-31 00:02:29 +08:00
Nan Zhu	ad4de0d718	[jvm-packages] handle NaN as missing value explicitly (#4309 ) * handle nan * handle nan explicitly * make code better and handle sparse vector in spark * Update XGBoostGeneralSuite.scala	2019-03-30 19:34:26 +08:00
Rong Ou	7ea5b772fb	do not filter shared library files (#4303 )	2019-03-28 19:40:54 +08:00
Philip Hyunsu Cho	7aed8f3d48	[CI] Upgrade to GCC 5.3.1, CMake 3.6.0 (#4306 ) * Upgrade to GCC 5.3.1, CMake 3.6.0 * <regex> is now okay	2019-03-28 00:21:21 -07:00
Rong Ou	8c8021dfa7	use all cores to build on linux (#4304 )	2019-03-27 19:51:08 -07:00
Rory Mitchell	3f312e30db	Retire DVec class in favour of c++20 style span for device memory. (#4293 )	2019-03-28 13:59:58 +13:00
Jiaming Yuan	c85181dd8a	Remove remaining `silent` and `debug_verbose`. (#4299 )	2019-03-28 03:30:46 +08:00


3710 Commits