xgboost

Author	SHA1	Message	Date
KaiJin Ji	1733c9e8f7	Improve operation efficiency for single predict (#5016 ) * Improve operation efficiency for single predict	2019-11-10 02:01:28 +08:00
mitama	374648c21a	Add better error message for invalid feature names (#5024 )	2019-11-10 01:58:14 +08:00
Jiaming Yuan	7663de956c	Run training with empty DMatrix. (#4990 ) This makes GPU Hist robust in distributed environment as some workers might not be associated with any data in either training or evaluation. * Disable rabit mock test for now: See #5012 . * Disable dask-cudf test at prediction for now: See #5003 * Launch dask job for all workers despite they might not have any data. * Check 0 rows in elementwise evaluation metrics. Using AUC and AUC-PR still throws an error. See #4663 for a robust fix. * Add tests for edge cases. * Add `LaunchKernel` wrapper handling zero sized grid. * Move some parts of allreducer into a cu file. * Don't validate feature names when the booster is empty. * Sync number of columns in DMatrix. As num_feature is required to be the same across all workers in data split mode. * Filtering in dask interface now by default syncs all booster that's not empty, instead of using rank 0. * Fix Jenkins' GPU tests. * Install dask-cuda from source in Jenkins' test. Now all tests are actually running. * Restore GPU Hist tree synchronization test. * Check UUID of running devices. The check is only performed on CUDA version >= 10.x, as 9.x doesn't have UUID field. * Fix CMake policy and project variables. Use xgboost_SOURCE_DIR uniformly, add policy for CMake >= 3.13. * Fix copying data to CPU * Fix race condition in cpu predictor. * Fix duplicated DMatrix construction. * Don't download extra nccl in CI script.	2019-11-06 16:13:13 +08:00
Christopher Cowden	807a244517	Fix repeated split and 0 cover nodes (#5010 )	2019-11-06 14:57:22 +08:00
Chen Qin	b29b8c2f34	[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4966 ) * [phase 1] expose sets of rabit configurations to spark layer * add back mutable import * disable ring_mincount till https://github.com/dmlc/rabit/pull/106d * Revert "disable ring_mincount till https://github.com/dmlc/rabit/pull/106d" This reverts commit 65e95a98e24f5eb53c6ba9ef9b2379524258984d. * apply latest rabit * fix build error * apply https://github.com/dmlc/xgboost/pull/4880 * downgrade cmake in rabit * point to rabit with DMLC_ROOT fix * relative path of rabit install prefix * split rabit parameters to another trait * misc * misc * Delete .classpath * Delete .classpath * Delete .classpath * Update XGBoostClassifier.scala * Update XGBoostRegressor.scala * Update GeneralParams.scala * Update GeneralParams.scala * Update GeneralParams.scala * Update GeneralParams.scala * Delete .classpath * Update RabitParams.scala * Update .gitignore * Update .gitignore * apply rabitParams to training * use string as rabit parameter value type * cleanup * add rabitEnv check * point to dmlc/rabit * per feedback * update private scope * misc * update rabit * add rabit_timtout, fix failing test. * split tests * allow build jvm with rabit mock * pass mock failures to rabit with test * add mock error and graceful handle rabit assertion error test * split mvn test * remove sign for test * update rabit * build jvm_packages with rabit mock * point back to dmlc/rabit * per feedback, update scala header * cleanup pom * per feedback * try fix lint * fix lint * per feedback, remove bootstrap_cache * per feedback 2 * try replace dev profile with passing mvn property * fix build error * remove mvn property and replace with env setting to build test jar * per feedback * revert copyright headlines, point to dmlc/rabit * revert python lint * remove multiple failure test case as retry is not enabled in spark * Update core.py * Update core.py * per feedback, style fix	2019-11-01 14:21:19 -07:00
Philip Hyunsu Cho	a37691428f	Document minimum version required for gtest [skip ci] (#5001 )	2019-10-31 15:47:50 -07:00
Jiaming Yuan	6fac40cfb4	Add asan.so.5 to cmake script. (#4999 )	2019-10-30 16:03:23 -04:00
Jiaming Yuan	755a606201	Fix dart usegpu. (#4984 )	2019-10-28 06:12:04 -04:00
Jiaming Yuan	6ec7e300bd	Fix external memory race in colmaker. (#4980 ) * Move `GetColDensity` out of omp parallel block.	2019-10-25 04:11:13 -04:00
Philip Hyunsu Cho	96cd7ec2bb	[CI] Upload master branch artifacts to S3 root [skip ci] (#4979 )	2019-10-23 22:39:04 -07:00
Philip Hyunsu Cho	da6e74f7bb	[CI] Upload nightly builds to S3 (#4976 ) * Do not store built artifacts in the Jenkins master * Add wheel renaming script * Upload wheels to S3 bucket * Use env.GIT_COMMIT * Capture git hash correctly * Add missing import in Jenkinsfile * Address reviewer's comments * Put artifacts for pull requests in separate directory * No wildcard expansion in Windows CMD	2019-10-23 21:16:05 -07:00
Jiaming Yuan	ac457c56a2	Use `UpdateAllowUnknown' for non-model related parameter. (#4961 ) * Use `UpdateAllowUnknown' for non-model related parameter. Model parameter can not pack an additional boolean value due to binary IO format. This commit deals only with non-model related parameter configuration. * Add tidy command line arg for use-dmlc-gtest.	2019-10-23 05:50:12 -04:00
Jiaming Yuan	f24be2efb4	Use configure_file() to configure version only (#4974 ) * Avoid writing build_config.h * Remove build_config.h all together. * Lint.	2019-10-22 23:47:00 -07:00
Rong Ou	5b1715d97c	Write ELLPACK pages to disk (#4879 ) * add ellpack source * add batch param * extract function to parse cache info * construct ellpack info separately * push batch to ellpack page * write ellpack page. * make sparse page source reusable	2019-10-22 23:44:32 -04:00
sriramch	310fe60b35	Pairwise ranking objective implementation on gpu (#4873 ) * - pairwise ranking objective implementation on gpu - there are couple of more algorithms (ndcg and map) for which support will be added as follow-up pr's - with no label groups defined, get gradient is 90x faster on gpu (120m instance mortgage dataset) - it can perform by an order of magnitude faster with ~ 10 groups (and adequate cores for the cpu implementation) * Add JSON config to rank obj.	2019-10-22 23:40:07 -04:00
Jiaming Yuan	5620322a48	[Breaking] Add global versioning. (#4936 ) * Use CMake config file for representing version. * Generate c and Python version file with CMake. The generated file is written into source tree. But unless XGBoost upgrades its version, there will be no actual modification. This retains compatibility with Makefiles for R. * Add XGBoost version the DMatrix binaries. * Simplify prefetch detection in CMakeLists.txt	2019-10-22 23:27:26 -04:00
Jiaming Yuan	7e477a2adb	Fix data loading (#4862 ) * Fix loading text data. * Fix config regex. * Try to explain the error better in exception. * Update doc.	2019-10-22 12:33:14 -04:00
Philip Hyunsu Cho	95295ce026	[CI] Use latest dask (#4973 ) * Remove version spec, to use latest dask always	2019-10-22 07:00:13 -04:00
Philip Hyunsu Cho	741fbf47c4	[CI] Update lint configuration to support latest pylint convention (#4971 ) * Update lint configuration * Use gcc 8 consistently in build instruction	2019-10-21 16:40:57 -07:00
Jiaming Yuan	4771bb0d41	Catch exception in transform function omp context. (#4960 )	2019-10-21 17:03:38 +08:00
Jiaming Yuan	010b8f1428	Revert "[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4876 )" (#4965 ) This reverts commit `86ed01c4bb`.	2019-10-18 14:02:35 -07:00
Chen Qin	86ed01c4bb	[jvm-packages] update rabit, surface new changes to spark, add parity and failure tests (#4876 ) * Expose sets of rabit configurations to spark layer	2019-10-18 15:07:31 -04:00
Jiaming Yuan	31030a8d3a	Set correct file permission. (#4964 )	2019-10-18 12:54:29 -04:00
Jiaming Yuan	ae536756ae	Add Model and Configurable interface. (#4945 ) * Apply Configurable to objective functions. * Apply Model to Learner and Regtree, gbm. * Add Load/SaveConfig to objs. * Refactor obj tests to use smart pointer. * Dummy methods for Save/Load Model.	2019-10-18 01:56:02 -04:00
Jiaming Yuan	9fc681001a	Copy CMake parameter from dmlc-core. (#4948 )	2019-10-17 23:46:32 -04:00
Jacob Kim	a78d4e7aa8	Follow PEP 257 -- Docstring Conventions (#4959 )	2019-10-17 23:45:25 -04:00
Rory Mitchell	60748b2071	Use heuristic to select histogram node, avoid rabit call (#4951 )	2019-10-18 11:33:54 +13:00
Jiaming Yuan	185e3f1916	Update GPU doc. (#4953 )	2019-10-16 05:54:09 -04:00
Jiaming Yuan	7e72a12871	Don't `set_params` at the end of `set_state`. (#4947 ) * Don't set_params at the end of set_state. * Also fix another issue found in dask prediction. * Add note about prediction. Don't support other prediction modes at the moment.	2019-10-15 10:08:26 -04:00
Jiaming Yuan	2ebdec8aa6	Fix dask prediction. (#4941 ) * Fix dask prediction. * Add better error messages for wrong partition.	2019-10-14 23:19:34 -04:00
Jiaming Yuan	b61d534472	Span: use `size_t' for index_type, add `front' and `back'. (#4935 ) * Use `size_t' for index_type. Add `front' and `back'. * Remove a batch of `static_cast'.	2019-10-14 09:13:33 -04:00
Peter Badida	a9053aff83	Fix incorrectly displayed Note in the doc (#4943 )	2019-10-14 03:45:23 -04:00
Jiaming Yuan	0e0849fa1e	Mention dask in readme. [skip ci] (#4942 )	2019-10-14 03:44:08 -04:00
Jiaming Yuan	3d46bd0fa5	Ignore columnar alignment requirement. (#4928 ) * Better error message for wrong type. * Fix stride size.	2019-10-13 06:41:43 -04:00
Yuan Tang	05d4751540	Update README.md (#4940 )	2019-10-13 02:37:19 -04:00
Yuan Tang	08ff510e48	Mention Kubernetes on README (#4939 )	2019-10-13 01:43:09 -04:00
Philip Hyunsu Cho	f7487e4c2a	[CI] Run cuDF tests in Jenkins CI server (#4927 )	2019-10-13 00:04:54 -04:00
Philip Hyunsu Cho	5b4f28cc46	[CI] Raise timeout threshold in Jenkins (#4938 )	2019-10-12 23:47:35 -04:00
Jiaming Yuan	4bbf062ed3	[Breaking] Update sklearn interface. (#4929 ) * Remove nthread, seed, silent. Add tree_method, gpu_id, num_parallel_tree. Fix #4909. * Check data shape. Fix #4896. * Check element of eval_set is tuple. Fix #4875 * Add doc for random_state with hogwild. Fixes #4919	2019-10-12 02:50:09 -04:00
Jiaming Yuan	c2cce4fac3	Update dmlc-core. (#4924 ) * Fixed some threading errors. * Allow updating parameters.	2019-10-09 23:16:45 -04:00
Jiaming Yuan	6c9b6f11da	Use `cudf.concat` explicitly. (#4918 ) * Use `cudf.concat` explicitly. * Add test.	2019-10-10 16:02:10 +13:00
Rory Mitchell	aefb1e5c2f	Resolve dask performance issues (#4914 ) * Set dask client.map as impure function * Remove nrows * Remove slow check in verbose mode	2019-10-10 16:01:30 +13:00
Oleksandr Pryimak	80977182c5	Use bundled gtest (#4900 ) * Suggest to use gtest bundled with dmlc * Use dmlc bundled gtest in all CI scripts * Make clang-tidy to use dmlc embedded gtest	2019-10-09 16:26:19 -07:00
Jiaming Yuan	095de3bf5f	Export c++ headers in CMake installation. (#4897 ) * Move get transpose into cc. * Clean up headers in host device vector, remove thrust dependency. * Move span and host device vector into public. * Install c++ headers. * Short notes for c and c++. Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2019-10-06 23:53:09 -04:00
Jiaming Yuan	4ab1df5fe6	Check deprecated `n_gpus`. (#4908 )	2019-10-02 02:05:14 -04:00
Jiaming Yuan	7e24a8d245	Improve doc and demo for dask. (#4907 ) * Add a readme with link to doc. * Add more comments in the demonstrations code. * Workaround https://github.com/dask/distributed/issues/3081 .	2019-09-30 23:59:37 -04:00
Jiaming Yuan	d30e63a0a5	Support feature names/types for cudf. (#4902 ) * Implement most of the pandas procedure for cudf except for type conversion. * Requires an array of interfaces in metainfo.	2019-09-29 15:07:51 -04:00
Vibhu Jawa	2fa8b359e0	Add support for cudf.Series (#4891 )	2019-09-25 23:52:28 -04:00
Liangcai Li	82ee2317e8	Add case for LongParam. (#4885 ) To support specifying long parameter as String, the same as other basic type, such as Int, Double ...	2019-09-25 05:41:53 -07:00
Jiaming Yuan	b8433c455a	Rewrite Dask interface. (#4819 )	2019-09-25 01:30:14 -04:00

4035 Commits