Jiaming Yuan
4ee8340e79
Support column major array. ( #6765 )
2021-03-20 05:19:46 +08:00
Jiaming Yuan
f20074e826
Check for invalid data. ( #6742 )
2021-03-04 14:37:20 +08:00
Jiaming Yuan
1e949110da
Use generic dispatching routine for array interface. ( #6672 )
2021-02-05 09:23:38 +08:00
Jiaming Yuan
f2f7dd87b8
Use view for SparsePage exclusively. ( #6590 )
2021-01-11 18:04:55 +08:00
Jiaming Yuan
80065d571e
[dask] Add DaskXGBRanker ( #6576 )
* Initial support for distributed LTR using dask.
* Support `qid` in libxgboost.
* Refactor `predict` and `n_features_in_`, `best_[score/iteration/ntree_limit]`
to avoid duplicated code.
* Define `DaskXGBRanker`.
The dask ranker doesn't support group structure; instead, it uses query ID and
converts it to group ptr internally.
2021-01-08 18:35:09 +08:00
Jiaming Yuan
c120822a24
Fix flaky sparse page dmatrix test. ( #6417 )
2020-11-20 19:15:45 +08:00
Jiaming Yuan
43efadea2e
Deterministic data partitioning for external memory ( #6317 )
* Make external memory data partitioning deterministic.
* Change the meaning of `page_size` from bytes to number of rows.
* Design a data pool.
* Note for external memory.
* Enable unity build on Windows CI.
* Force garbage collect on test.
2020-11-11 06:11:06 +08:00
Jiaming Yuan
bed7ae4083
Loop over thrust::reduce. ( #6229 )
* Check input chunk size of dqdm.
* Add doc for current limitation.
2020-10-14 10:40:56 +13:00
Jiaming Yuan
14afdb4d92
Support categorical data in ellpack. ( #6140 )
2020-09-24 19:28:57 +08:00
Philip Hyunsu Cho
487ab0ce73
[BLOCKING] Handle empty rows in data iterators correctly ( #5929 )
* [jvm-packages] Handle empty rows in data iterators correctly
* Fix clang-tidy error
* last empty row
* Add comments [skip ci]
Co-authored-by: Nan Zhu <nanzhu@uber.com>
2020-07-25 13:46:19 -07:00
Jiaming Yuan
a3ec964346
Accept iterator in device dmatrix. ( #5783 )
* Remove Device DMatrix.
2020-07-07 21:44:48 +08:00
Jiaming Yuan
93c44a9a64
Move feature names and types of DMatrix from Python to C++. ( #5858 )
* Add thread local return entry for DMatrix.
* Save feature name and feature type in binary file.
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2020-07-07 09:40:13 +08:00
Jiaming Yuan
1a0801238e
Implement iterative DMatrix. ( #5837 )
2020-07-03 11:44:52 +08:00
Jiaming Yuan
90a9c68874
Implement a DMatrix Proxy. ( #5803 )
2020-06-29 15:03:10 +08:00
Jiaming Yuan
47c89775d6
Accept string for ArrayInterface constructor. ( #5799 )
2020-06-27 00:06:54 +08:00
Jiaming Yuan
c4d721200a
Implement extend method for meta info. ( #5800 )
* Implement extend for host device vector.
2020-06-20 03:32:03 +08:00
Jiaming Yuan
38ee514787
Implement fast number serialization routines. ( #5772 )
* Implement ryu algorithm.
* Implement integer printing.
* Full coverage roundtrip test.
2020-06-17 12:39:23 +08:00
fis
7c3a168ffd
Revert "Accept string for ArrayInterface constructor."
This reverts commit e8ecafb8dc628f45b75b4c2844a236d27e0a6d98.
2020-06-16 20:02:35 +08:00
fis
e8ecafb8dc
Accept string for ArrayInterface constructor.
2020-06-16 20:00:24 +08:00
Rory Mitchell
b47b5ac771
Use hypothesis ( #5759 )
* Use hypothesis
* Allow int64 array interface for groups
* Add packages to Windows CI
* Add to travis
* Make sure device index is set correctly
* Fix dask-cudf test
* appveyor
2020-06-16 12:45:59 +12:00
Jiaming Yuan
306e38ff31
Avoid including c_api.h in header files. ( #5782 )
2020-06-12 16:24:24 +08:00
Jiaming Yuan
cacff9232a
Remove column major specialization. ( #5755 )
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-06-05 16:19:14 +08:00
Jiaming Yuan
8438c7d0e4
Fix IsDense. ( #5702 )
2020-05-26 08:24:37 +08:00
Jiaming Yuan
eaf2a00b5c
Enhance nvtx support. ( #5636 )
2020-05-06 22:54:24 +08:00
Jiaming Yuan
e726dd9902
Set device in device dmatrix. ( #5596 )
2020-04-25 13:42:53 +08:00
Jiaming Yuan
29a4cfe400
Group aware GPU sketching. ( #5551 )
* Group aware GPU weighted sketching.
* Distribute group weights to each data point.
* Relax the test.
* Validate input meta info.
* Fix metainfo copy ctor.
2020-04-20 17:18:52 +08:00
Jiaming Yuan
e1f22baf8c
Fix slice and get info. ( #5552 )
2020-04-18 18:00:13 +08:00
Rory Mitchell
ca4e05660e
Purge device_helpers.cuh ( #5534 )
* Simplifications with caching_device_vector
* Purge device helpers
2020-04-15 21:51:56 +12:00
Jiaming Yuan
6671b42dd4
Use ellpack for prediction only when sparsepage doesn't exist. ( #5504 )
2020-04-10 12:15:46 +08:00
Jiaming Yuan
0012f2ef93
Upgrade clang-tidy on CI. ( #5469 )
* Correct all clang-tidy errors.
* Upgrade clang-tidy to 10 on CI.
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-04-05 04:42:29 +08:00
Jiaming Yuan
459b175dc6
Split up test helpers header. ( #5455 )
2020-04-03 10:36:53 +08:00
Jiaming Yuan
29c6ad943a
Prevent copying SimpleDMatrix. ( #5453 )
* Set default dtor for SimpleDMatrix to initialize default copy ctor, which is
deleted due to unique ptr.
* Remove commented code.
* Remove warning for calling host function (std::max).
* Remove warning for initialization order.
* Remove warning for unused variables.
2020-04-02 07:01:49 +08:00
Rory Mitchell
13b10a6370
Device dmatrix ( #5420 )
2020-03-28 14:42:21 +13:00
Jiaming Yuan
4942da64ae
Refactor tests with data generator. ( #5439 )
2020-03-27 06:44:44 +08:00
Rory Mitchell
b745b7acce
Fix memory usage of device sketching ( #5407 )
2020-03-14 13:43:24 +13:00
Rory Mitchell
3ad4333b0e
Partially rewrite EllpackPage ( #5352 )
2020-03-11 10:15:53 +13:00
Rory Mitchell
a38e7bd19c
Sketching from adapters ( #5365 )
* Sketching from adapters
* Add weights test
2020-03-07 21:07:58 +13:00
Jiaming Yuan
f2b8cd2922
Add number of columns to native data iterator. ( #5202 )
* Change native data iter into an adapter.
2020-02-25 23:42:01 +08:00
Rory Mitchell
b0ed3f0a66
Remove unnecessary DMatrix methods ( #5324 )
2020-02-25 12:40:39 +13:00
Jiaming Yuan
655cf17b60
Predict on Ellpack. ( #5327 )
* Unify GPU prediction node.
* Add `PageExists`.
* Dispatch prediction on input data for GPU Predictor.
2020-02-23 06:27:03 +08:00
Rory Mitchell
bc96ceb8b2
Refactor SparsePageSource, delete cache files after use ( #5321 )
* Refactor sparse page source
* Delete temporary cache files
* Log fatal if cache exists
* Log fatal if multiple threads used with prefetcher
2020-02-19 16:43:41 +13:00
Rory Mitchell
b2b2c4e231
Remove SimpleCSRSource ( #5315 )
2020-02-18 16:49:17 +13:00
Rong Ou
e4b74c4d22
Gradient based sampling for GPU Hist ( #5093 )
* Implement gradient based sampling for GPU Hist tree method.
* Add samplers and handle compacted page in GPU Hist.
2020-02-04 10:31:27 +08:00
Jiaming Yuan
fe8d72b50b
Cleanup warnings. ( #5247 )
From clang-tidy-9 and gcc-7: Invalid case style, narrowing definition, wrong
initialization order, unused variables.
2020-01-31 14:52:15 +08:00
Philip Hyunsu Cho
44469a0ca9
Extensible binary serialization format for DMatrix::MetaInfo ( #5187 )
* Turn xgboost::DataType into C++11 enum class
* New binary serialization format for DMatrix::MetaInfo
* Fix clang-tidy
* Fix c++ test
* Implement new format proposal
* Move helper functions to anonymous namespace; remove unneeded field
* Fix lint
* Add shape.
* Keep only roundtrip test.
* Fix test.
* various fixes
* Update data.cc
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2020-01-23 11:33:17 -08:00
Rory Mitchell
9c56480c61
Support dmatrix construction from cupy array ( #5206 )
2020-01-22 13:15:27 +13:00
Rory Mitchell
a73e25e15f
Implement slice via adapters ( #5198 )
2020-01-14 12:55:41 +13:00
Rory Mitchell
8cbcc53ccb
Remove old cudf constructor code ( #5194 )
2020-01-10 16:35:23 +13:00
Rory Mitchell
87ebfc1315
Implement cudf construction with adapters. ( #5189 )
2020-01-09 20:23:06 +13:00
Jiaming Yuan
61286c6e8f
Fix wrapping GPU ID and prevent data copying. ( #5160 )
* Removed some data copying.
* Make sure gpu_id is valid before any configuration is carried out.
2019-12-27 16:51:08 +08:00