Jiaming Yuan
4771bb0d41
Catch exception in transform function omp context. ( #4960 )
2019-10-21 17:03:38 +08:00
Jiaming Yuan
31030a8d3a
Set correct file permission. ( #4964 )
2019-10-18 12:54:29 -04:00
Jiaming Yuan
ae536756ae
Add Model and Configurable interface. ( #4945 )
...
* Apply Configurable to objective functions.
* Apply Model to Learner and Regtree, gbm.
* Add Load/SaveConfig to objs.
* Refactor obj tests to use smart pointer.
* Dummy methods for Save/Load Model.
2019-10-18 01:56:02 -04:00
Rory Mitchell
60748b2071
Use heuristic to select histogram node, avoid rabit call ( #4951 )
2019-10-18 11:33:54 +13:00
Jiaming Yuan
2ebdec8aa6
Fix dask prediction. ( #4941 )
...
* Fix dask prediction.
* Add better error messages for wrong partition.
2019-10-14 23:19:34 -04:00
Jiaming Yuan
b61d534472
Span: use `size_t' for index_type, add `front' and `back'. ( #4935 )
...
* Use `size_t' for index_type. Add `front' and `back'.
* Remove a batch of `static_cast'.
2019-10-14 09:13:33 -04:00
Jiaming Yuan
3d46bd0fa5
Ignore columnar alignment requirement. ( #4928 )
...
* Better error message for wrong type.
* Fix stride size.
2019-10-13 06:41:43 -04:00
Jiaming Yuan
4bbf062ed3
[Breaking] Update sklearn interface. ( #4929 )
...
* Remove nthread, seed, silent. Add tree_method, gpu_id, num_parallel_tree. Fix #4909 .
* Check data shape. Fix #4896 .
* Check element of eval_set is tuple. Fix #4875
* Add doc for random_state with hogwild. Fixes #4919
2019-10-12 02:50:09 -04:00
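
A minimal sketch of how calling code might adapt to the keyword removals listed in #4929. The helper `migrate_sklearn_kwargs` is hypothetical and is not part of xgboost. It assumes `n_jobs`, `random_state` and `verbosity` are the intended replacements for `nthread`, `seed` and `silent`, and that `silent=True` corresponds to `verbosity=0`.

```python
# Hypothetical helper, not part of xgboost: translate keyword arguments
# removed from the sklearn wrapper into their assumed replacements.
_RENAMED = {"nthread": "n_jobs", "seed": "random_state"}


def migrate_sklearn_kwargs(kwargs):
    """Return a new dict with removed keyword names translated."""
    migrated = {}
    for name, value in kwargs.items():
        if name == "silent":
            continue  # handled after the loop
        target = _RENAMED.get(name, name)
        if target in migrated:
            # Both the removed alias and its replacement were supplied.
            raise ValueError(
                "%r was given together with a removed alias" % target
            )
        migrated[target] = value
    # Assumption: silent=True maps to verbosity=0; an explicit verbosity wins.
    if kwargs.get("silent") and "verbosity" not in migrated:
        migrated["verbosity"] = 0
    return migrated
```

The translated dictionary could then be passed on to an estimator constructor. Raising on conflicting names, rather than silently picking one, keeps the migration explicit.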
Rory Mitchell
aefb1e5c2f
Resolve dask performance issues ( #4914 )
...
* Set dask client.map as impure function
* Remove nrows
* Remove slow check in verbose mode
2019-10-10 16:01:30 +13:00
Jiaming Yuan
095de3bf5f
Export c++ headers in CMake installation. ( #4897 )
...
* Move get transpose into cc.
* Clean up headers in host device vector, remove thrust dependency.
* Move span and host device vector into public.
* Install c++ headers.
* Short notes for c and c++.
Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2019-10-06 23:53:09 -04:00
Jiaming Yuan
4ab1df5fe6
Check deprecated n_gpus. ( #4908 )
2019-10-02 02:05:14 -04:00
Jiaming Yuan
d30e63a0a5
Support feature names/types for cudf. ( #4902 )
...
* Implement most of the pandas procedure for cudf except for type conversion.
* Requires an array of interfaces in metainfo.
2019-09-29 15:07:51 -04:00
Rong Ou
562bb0ae31
remove device shards ( #4867 )
2019-09-25 13:15:46 +08:00
Jiaming Yuan
0b89cd1dfa
Support gamma in GPU_Hist. ( #4874 )
...
* Just prevent building the tree instead of using an explicit pruner.
2019-09-24 10:16:08 +08:00
Jiaming Yuan
a40b72d127
Workaround isnan across different environments. ( #4883 )
2019-09-23 21:34:27 -04:00
Jiaming Yuan
57106a3459
Fix parsing empty json object. ( #4868 )
...
* Fix parsing empty json object.
* Better error message.
2019-09-18 03:31:46 -04:00
Jiaming Yuan
d669ea1eaa
Deprecate set group ( #4864 )
...
* Convert jvm package and R package.
* Restore for compatibility.
2019-09-17 21:26:54 -04:00
Jiaming Yuan
5374f52531
Complete cudf support. ( #4850 )
...
* Handles missing value.
* Accept all floating point and integer types.
* Move to cudf 0.9 API.
* Remove requirement on `null_count`.
* Arbitrary column types support.
2019-09-16 23:52:00 -04:00
Rong Ou
125bcec62e
Move ellpack page construction into DMatrix ( #4833 )
2019-09-16 23:50:55 -04:00
Chen Qin
512f037e55
[rabit_bootstrap_cache] Failed xgb workers recover from other workers ( #4808 )
...
* Better recovery support. Restarting only the failed workers.
2019-09-16 23:31:52 -04:00
Xu Xiao
c89bcc4de5
[blocking] fix parallel eval_split of hist updater ( #4851 )
...
* Don't call rabit functions inside parallel loop.
2019-09-13 09:35:03 -04:00
Jiaming Yuan
f90e7f9aa8
Some comments for row partitioner. ( #4832 )
2019-09-06 03:01:42 -04:00
Jiaming Yuan
a5f232feb8
Fix calling GPU predictor ( #4836 )
...
* Fix calling GPU predictor
2019-09-05 19:09:38 -04:00
Jiaming Yuan
52d44e07fe
Monitor for distributed environment. ( #4829 )
...
* Collect statistics from other ranks in monitor.
* Workaround old GCC bug.
2019-09-05 13:18:09 +08:00
Jiaming Yuan
c0fbeff0ab
Restrict access to cfg_ in gbm. ( #4801 )
...
* Restrict access to `cfg_` in gbm.
* Verify having correct updaters.
* Remove `grow_global_histmaker`
This updater is the same as `grow_histmaker`. The former is not in our
documentation, so we just remove it.
2019-09-02 00:43:19 -04:00
TinkleG
2aed0ae230
Fix auc error in distributed mode ( #4798 )
...
Need more work for a complete fix. See #4663 .
2019-09-01 02:54:14 -04:00
Rong Ou
733ed24dd9
further cleanup of single process multi-GPU code ( #4810 )
...
* use subspan in gpu predictor instead of copying
* Revise `HostDeviceVector`
2019-08-30 05:27:23 -04:00
Rong Ou
38ab79f889
Make HostDeviceVector single gpu only ( #4773 )
...
* Make HostDeviceVector single gpu only
2019-08-26 09:51:13 +12:00
Jiaming Yuan
fba298fecb
Prevent copying data to host. ( #4795 )
2019-08-20 23:06:27 -04:00
Jiaming Yuan
9700776597
Cudf support. ( #4745 )
...
* Initial support for cudf integration.
* Add two C APIs for consuming data and metainfo.
* Add CopyFrom for SimpleCSRSource as a generic function to consume the data.
* Add FromDeviceColumnar for consuming device data.
* Add new MetaInfo::SetInfo for consuming label, weight etc.
2019-08-19 16:51:40 +12:00
Jiaming Yuan
ab357dd41c
Remove plugin, cuda related code in automake & autoconf files ( #4789 )
...
* Build plugin example with CMake.
* Remove plugin, cuda related code in automake & autoconf files.
* Fix typo in GPU doc.
2019-08-18 16:54:34 -04:00
Jiaming Yuan
c358d95c44
Remove initializing stringstream reference. ( #4788 )
2019-08-18 09:59:47 -04:00
Jiaming Yuan
c81238b5c4
Clean up after removing gpu_exact. ( #4777 )
...
* Removed unused functions.
* Removed unused parameters.
* Move ValueConstraints into constraints.cuh since it's now only used in GPU_Hist.
2019-08-17 01:05:57 -04:00
Xu Xiao
ef9af33a00
[HOTFIX] distributed training with hist method ( #4716 )
...
* add parallel test for hist.EvaluateSplit
* update test_openmp.py
* fix OMP schedule policy
* fix clang-tidy
* add logging: total_num_bins
* fix
* fix
* test
* replace guided OPENMP policy with static in updater_quantile_hist.cc
2019-08-13 11:27:29 -07:00
Jiaming Yuan
c0ffe65f5c
Mimic cuda assert output in span check. ( #4762 )
2019-08-13 01:44:54 -04:00
Rong Ou
c5b229632d
[BREAKING] prevent multi-gpu usage ( #4749 )
...
* prevent multi-gpu usage
* fix distributed test
* combine gpu predictor tests
* set upper bound on n_gpus
2019-08-13 09:11:35 +12:00
sriramch
198f3a6c4a
Enable natural copies of the batch iterators without the need of the clone method ( #4748 )
...
- the synthesized copy constructor should do the appropriate job
2019-08-09 11:47:35 -04:00
Rong Ou
19f9fd5de9
remove the qids_ field in MetaInfo ( #4744 )
2019-08-08 10:01:59 +08:00
Rong Ou
602484e19f
Remove some unused functions as reported by cppcheck ( #4743 )
2019-08-07 02:42:33 -04:00
Bobby
3e2c472944
Fix model parameter recovery ( #4738 )
2019-08-07 02:32:10 -04:00
Rong Ou
851b5b3808
Remove gpu_exact tree method ( #4742 )
2019-08-07 11:43:20 +12:00
Jiaming Yuan
2a4df8e29f
Add Json integer, remove specialization. ( #4739 )
2019-08-06 03:10:49 -04:00
Jiaming Yuan
9c469b3844
Move bitfield into common. ( #4737 )
...
* Prepare for columnar format support.
2019-08-06 02:49:32 -04:00
Rong Ou
6edddd7966
Refactor DMatrix to return batches of different page types ( #4686 )
...
* Use explicit template parameter for specifying page type.
2019-08-03 15:10:34 -04:00
Jiaming Yuan
d2e1e4d5b4
A simple Json implementation for future use. ( #4708 )
...
* A simple Json implementation for future use.
2019-07-29 21:17:27 -04:00
Jiaming Yuan
001aaaee5f
Removed deprecated gpu objectives. ( #4690 )
2019-07-20 23:18:34 -04:00
Jiaming Yuan
f0064c07ab
Refactor configuration [Part II]. ( #4577 )
...
* Refactor configuration [Part II].
* General changes:
** Remove `Init` methods to avoid ambiguity.
** Remove `Configure(std::map<>)` to avoid redundant copying and prepare for
parameter validation. (`std::vector` is returned from `InitAllowUnknown`).
** Add name to tree updaters for easier debugging.
* Learner changes:
** Make `LearnerImpl` the only source of configuration.
All configurations are stored and carried out by `LearnerImpl::Configure()`.
** Remove booster in C API.
Originally kept for "compatibility reasons", but the code did not state why,
so here we just remove it.
** Add a `metric_names_` field in `LearnerImpl`.
** Remove `LazyInit`. Configuration will always be lazy.
** Run `Configure` before every iteration.
* Predictor changes:
** Allocate both cpu and gpu predictor.
** Remove cpu_predictor from gpu_predictor.
`GBTree` is now used to dispatch the predictor.
** Remove some GPU Predictor tests.
* IO
No IO changes. The stability of the binary model format is tested by comparing
the hash values of saved models between two commits.
2019-07-20 08:34:56 -04:00
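
The hash comparison mentioned in the IO note of #4577 can be sketched as below. This is an illustration only, not the script used for that commit: the function names, the choice of SHA-256, and the idea of saving one model file per commit are all assumptions.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_binary_model(path_a, path_b):
    """True when two saved model files are byte-for-byte identical."""
    return file_digest(path_a) == file_digest(path_b)
```

A check of this kind would train the same model with the same data and seed on both commits, save each result to disk, and pass both paths to `same_binary_model`. Equal digests mean the saved bytes are identical, which is stricter than the two models merely producing the same predictions.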
sriramch
7a388cbf8b
Modify caching allocator/vector and fix issues relating to inability to train large datasets ( #4615 )
2019-07-09 18:33:27 +12:00
Xu Xiao
cd1526d3b1
fix auc error in distributed mode caused by unbalanced dataset ( #4645 )
2019-07-08 16:01:52 +08:00
Jiaming Yuan
d9a47794a5
Fix CPU hist init for sparse dataset. ( #4625 )
...
* Fix CPU hist init for sparse dataset.
* Implement sparse histogram cut.
* Allow empty features.
* Fix windows build, don't use sparse in distributed environment.
* Comments.
* Smaller threshold.
* Fix windows omp.
* Fix msvc lambda capture.
* Fix MSVC macro.
* Fix MSVC initialization list.
* Fix MSVC initialization list x2.
* Preserve categorical feature behavior.
* Rename matrix to sparse cuts.
* Reuse UseGroup.
* Check for categorical data when adding cut.
Co-Authored-By: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
* Sanity check.
* Fix comments.
* Fix comment.
2019-07-04 16:27:03 -07:00