xgboost

Author	SHA1	Message	Date
sriramch	310fe60b35	Pairwise ranking objective implementation on gpu (#4873 ) * - pairwise ranking objective implementation on gpu - there are couple of more algorithms (ndcg and map) for which support will be added as follow-up pr's - with no label groups defined, get gradient is 90x faster on gpu (120m instance mortgage dataset) - it can perform by an order of magnitude faster with ~ 10 groups (and adequate cores for the cpu implementation) * Add JSON config to rank obj.	2019-10-22 23:40:07 -04:00
Jiaming Yuan	5620322a48	[Breaking] Add global versioning. (#4936 ) * Use CMake config file for representing version. * Generate c and Python version file with CMake. The generated file is written into source tree. But unless XGBoost upgrades its version, there will be no actual modification. This retains compatibility with Makefiles for R. * Add XGBoost version the DMatrix binaries. * Simplify prefetch detection in CMakeLists.txt	2019-10-22 23:27:26 -04:00
Jiaming Yuan	7e477a2adb	Fix data loading (#4862 ) * Fix loading text data. * Fix config regex. * Try to explain the error better in exception. * Update doc.	2019-10-22 12:33:14 -04:00
Jiaming Yuan	4771bb0d41	Catch exception in transform function omp context. (#4960 )	2019-10-21 17:03:38 +08:00
Jiaming Yuan	31030a8d3a	Set correct file permission. (#4964 )	2019-10-18 12:54:29 -04:00
Jiaming Yuan	ae536756ae	Add Model and Configurable interface. (#4945 ) * Apply Configurable to objective functions. * Apply Model to Learner and Regtree, gbm. * Add Load/SaveConfig to objs. * Refactor obj tests to use smart pointer. * Dummy methods for Save/Load Model.	2019-10-18 01:56:02 -04:00
Rory Mitchell	60748b2071	Use heuristic to select histogram node, avoid rabit call (#4951 )	2019-10-18 11:33:54 +13:00
Jiaming Yuan	2ebdec8aa6	Fix dask prediction. (#4941 ) * Fix dask prediction. * Add better error messages for wrong partition.	2019-10-14 23:19:34 -04:00
Jiaming Yuan	b61d534472	Span: use `size_t' for index_type, add `front' and `back'. (#4935 ) * Use `size_t' for index_type. Add `front' and `back'. * Remove a batch of `static_cast'.	2019-10-14 09:13:33 -04:00
Jiaming Yuan	3d46bd0fa5	Ignore columnar alignment requirement. (#4928 ) * Better error message for wrong type. * Fix stride size.	2019-10-13 06:41:43 -04:00
Jiaming Yuan	4bbf062ed3	[Breaking] Update sklearn interface. (#4929 ) * Remove nthread, seed, silent. Add tree_method, gpu_id, num_parallel_tree. Fix #4909. * Check data shape. Fix #4896. * Check element of eval_set is tuple. Fix #4875 * Add doc for random_state with hogwild. Fixes #4919	2019-10-12 02:50:09 -04:00
Rory Mitchell	aefb1e5c2f	Resolve dask performance issues (#4914 ) * Set dask client.map as impure function * Remove nrows * Remove slow check in verbose mode	2019-10-10 16:01:30 +13:00
Jiaming Yuan	095de3bf5f	Export c++ headers in CMake installation. (#4897 ) * Move get transpose into cc. * Clean up headers in host device vector, remove thrust dependency. * Move span and host device vector into public. * Install c++ headers. * Short notes for c and c++. Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2019-10-06 23:53:09 -04:00
Jiaming Yuan	4ab1df5fe6	Check deprecated `n_gpus`. (#4908 )	2019-10-02 02:05:14 -04:00
Jiaming Yuan	d30e63a0a5	Support feature names/types for cudf. (#4902 ) * Implement most of the pandas procedure for cudf except for type conversion. * Requires an array of interfaces in metainfo.	2019-09-29 15:07:51 -04:00
Rong Ou	562bb0ae31	remove device shards (#4867 )	2019-09-25 13:15:46 +08:00
Jiaming Yuan	0b89cd1dfa	Support gamma in GPU_Hist. (#4874 ) * Just prevent building the tree instead of using an explicit pruner.	2019-09-24 10:16:08 +08:00
Jiaming Yuan	a40b72d127	Workaround `isnan` across different environments. (#4883 )	2019-09-23 21:34:27 -04:00
Jiaming Yuan	57106a3459	Fix parsing empty json object. (#4868 ) * Fix parsing empty json object. * Better error message.	2019-09-18 03:31:46 -04:00
Jiaming Yuan	d669ea1eaa	Deprecate set group (#4864 ) * Convert jvm package and R package. * Restore for compatibility.	2019-09-17 21:26:54 -04:00
Jiaming Yuan	5374f52531	Complete cudf support. (#4850 ) * Handles missing value. * Accept all floating point and integer types. * Move to cudf 9.0 API. * Remove requirement on `null_count`. * Arbitrary column types support.	2019-09-16 23:52:00 -04:00
Rong Ou	125bcec62e	Move ellpack page construction into DMatrix (#4833 )	2019-09-16 23:50:55 -04:00
Chen Qin	512f037e55	[rabit_bootstrap_cache ] failed xgb worker recover from other workers (#4808 ) * Better recovery support. Restarting only the failed workers.	2019-09-16 23:31:52 -04:00
Xu Xiao	c89bcc4de5	[blocking] fix parallel eval_split of hist updater (#4851 ) * Don't call rabit functions inside parallel loop.	2019-09-13 09:35:03 -04:00
Jiaming Yuan	f90e7f9aa8	Some comments for row partitioner. (#4832 )	2019-09-06 03:01:42 -04:00
Jiaming Yuan	a5f232feb8	Fix calling GPU predictor (#4836 ) * Fix calling GPU predictor	2019-09-05 19:09:38 -04:00
Jiaming Yuan	52d44e07fe	monitor for distributed environment. (#4829 ) * Collect statistics from other ranks in monitor. * Workaround old GCC bug.	2019-09-05 13:18:09 +08:00
Jiaming Yuan	c0fbeff0ab	Restrict access to `cfg_` in gbm. (#4801 ) * Restrict access to `cfg_` in gbm. * Verify having correct updaters. * Remove `grow_global_histmaker` This updater is the same as `grow_histmaker`. The former is not in our document so we just remove it.	2019-09-02 00:43:19 -04:00
TinkleG	2aed0ae230	Fix auc error in distributed mode (#4798 ) Need more work for a complete fix. See #4663 .	2019-09-01 02:54:14 -04:00
Rong Ou	733ed24dd9	further cleanup of single process multi-GPU code (#4810 ) * use subspan in gpu predictor instead of copying * Revise `HostDeviceVector`	2019-08-30 05:27:23 -04:00
Rong Ou	38ab79f889	Make HostDeviceVector single gpu only (#4773 ) * Make HostDeviceVector single gpu only	2019-08-26 09:51:13 +12:00
Jiaming Yuan	fba298fecb	Prevent copying data to host. (#4795 )	2019-08-20 23:06:27 -04:00
Jiaming Yuan	9700776597	Cudf support. (#4745 ) * Initial support for cudf integration. * Add two C APIs for consuming data and metainfo. * Add CopyFrom for SimpleCSRSource as a generic function to consume the data. * Add FromDeviceColumnar for consuming device data. * Add new MetaInfo::SetInfo for consuming label, weight etc.	2019-08-19 16:51:40 +12:00
Jiaming Yuan	ab357dd41c	Remove plugin, cuda related code in automake & autoconf files (#4789 ) * Build plugin example with CMake. * Remove plugin, cuda related code in automake & autoconf files. * Fix typo in GPU doc.	2019-08-18 16:54:34 -04:00
Jiaming Yuan	c358d95c44	Remove initializing stringstream reference. (#4788 )	2019-08-18 09:59:47 -04:00
Jiaming Yuan	c81238b5c4	Clean up after removing `gpu_exact`. (#4777 ) * Removed unused functions. * Removed unused parameters. * Move ValueConstraints into constraints.cuh since it's now only used in GPU_Hist.	2019-08-17 01:05:57 -04:00
Xu Xiao	ef9af33a00	[HOTFIX] distributed training with hist method (#4716 ) * add parallel test for hist.EvalualiteSplit * update test_openmp.py * update test_openmp.py * update test_openmp.py * update test_openmp.py * update test_openmp.py * fix OMP schedule policy * fix clang-tidy * add logging: total_num_bins * fix * fix * test * replace guided OPENMP policy with static in updater_quantile_hist.cc	2019-08-13 11:27:29 -07:00
Jiaming Yuan	c0ffe65f5c	Mimic cuda assert output in span check. (#4762 )	2019-08-13 01:44:54 -04:00
Rong Ou	c5b229632d	[BREAKING] prevent multi-gpu usage (#4749 ) * prevent multi-gpu usage * fix distributed test * combine gpu predictor tests * set upper bound on n_gpus	2019-08-13 09:11:35 +12:00
sriramch	198f3a6c4a	Enable natural copies of the batch iterators without the need of the clone method (#4748 ) - the synthesized copy constructor should do the appropriate job	2019-08-09 11:47:35 -04:00
Rong Ou	19f9fd5de9	remove the qids_ field in MetaInfo (#4744 )	2019-08-08 10:01:59 +08:00
Rong Ou	602484e19f	Remove some unused functions as reported by cppcheck (#4743 )	2019-08-07 02:42:33 -04:00
Bobby	3e2c472944	Fix model parameter recovery (#4738 )	2019-08-07 02:32:10 -04:00
Rong Ou	851b5b3808	Remove gpu_exact tree method (#4742 )	2019-08-07 11:43:20 +12:00
Jiaming Yuan	2a4df8e29f	Add Json integer, remove specialization. (#4739 )	2019-08-06 03:10:49 -04:00
Jiaming Yuan	9c469b3844	Move bitfield into common. (#4737 ) * Prepare for columnar format support.	2019-08-06 02:49:32 -04:00
Rong Ou	6edddd7966	Refactor DMatrix to return batches of different page types (#4686 ) * Use explicit template parameter for specifying page type.	2019-08-03 15:10:34 -04:00
Jiaming Yuan	d2e1e4d5b4	A simple Json implementation for future use. (#4708 ) * A simple Json implementation for future use.	2019-07-29 21:17:27 -04:00
Jiaming Yuan	001aaaee5f	Removed deprecated gpu objectives. (#4690 )	2019-07-20 23:18:34 -04:00
Jiaming Yuan	f0064c07ab	Refactor configuration [Part II]. (#4577 ) * Refactor configuration [Part II]. * General changes: Remove `Init` methods to avoid ambiguity. Remove `Configure(std::map<>)` to avoid redundant copying and prepare for parameter validation. (`std::vector` is returned from `InitAllowUnknown`). ** Add name to tree updaters for easier debugging. * Learner changes: Make `LearnerImpl` the only source of configuration. All configurations are stored and carried out by `LearnerImpl::Configure()`. Remove booster in C API. Originally kept for "compatibility reason", but did not state why. So here we just remove it. Add a `metric_names_` field in `LearnerImpl`. Remove `LazyInit`. Configuration will always be lazy. ** Run `Configure` before every iteration. * Predictor changes: Allocate both cpu and gpu predictor. Remove cpu_predictor from gpu_predictor. `GBTree` is now used to dispatch the predictor. ** Remove some GPU Predictor tests. * IO No IO changes. The binary model format stability is tested by comparing hashing value of save models between two commits	2019-07-20 08:34:56 -04:00

1198 Commits