Jiaming Yuan
ac9bfaa4f2
Handle missing values in dataframe with category dtype. ( #7331 )
* Replace -1 in pandas initializer.
* Unify `IsValid` functor.
* Mimic pandas data handling in cuDF glue code.
* Check invalid categories.
* Fix DDM sketching.
2021-10-28 03:33:54 +08:00
Jiaming Yuan
31c1e13f90
Categorical data support in CPU sketching. ( #7221 )
2021-09-17 04:37:09 +08:00
Jiaming Yuan
2942dc68e4
Fix mixed types in GPU sketching. ( #7228 )
2021-09-16 00:10:25 +08:00
Jiaming Yuan
3039dd194b
Don't estimate sketch batch size when rmm is used. ( #6807 )
2021-03-31 15:29:56 +08:00
Jiaming Yuan
886486a519
Support categorical data in GPU weighted sketching. ( #6508 )
2020-12-16 14:23:28 +08:00
Jiaming Yuan
2241563f23
Handle duplicated values in sketching. ( #6178 )
* Accumulate weights in duplicated values.
* Fix device id in iterative dmatrix.
2020-10-10 19:32:44 +08:00
Jiaming Yuan
210c131ce7
Support categorical data in GPU sketching. ( #6137 )
2020-09-21 13:53:06 +08:00
Jiaming Yuan
ee70a2380b
Unify CPU hist sketching ( #5880 )
2020-08-12 01:33:06 +08:00
Jiaming Yuan
e471056ec4
Fix sketch size calculation. ( #5898 )
2020-07-17 08:33:16 +08:00
Jiaming Yuan
dd445af56e
Cleanup on device sketch. ( #5874 )
* Remove old functions.
* Merge weighted and un-weighted into a common interface.
2020-07-14 10:15:54 +08:00
Rong Ou
06320729d4
fix device sketch with weights in external memory mode ( #5870 )
2020-07-08 08:44:07 +08:00
Jiaming Yuan
048d969be4
Implement GK sketching on GPU. ( #5846 )
* Implement GK sketching on GPU.
* Strong tests on quantile building.
* Handle sparse dataset by binary searching the column index.
* Hypothesis test on dask.
2020-07-07 12:16:21 +08:00
Jiaming Yuan
38ee514787
Implement fast number serialization routines. ( #5772 )
* Implement ryu algorithm.
* Implement integer printing.
* Full coverage roundtrip test.
2020-06-17 12:39:23 +08:00
Jiaming Yuan
3028fa6b42
Implement weighted sketching for adapter. ( #5760 )
* Bounded memory tests.
* Fixed memory estimation.
2020-06-12 06:20:39 +08:00
Jiaming Yuan
e533908922
Expose device sketching in header. ( #5747 )
2020-06-02 13:02:53 +08:00
Jiaming Yuan
29a4cfe400
Group aware GPU sketching. ( #5551 )
* Group aware GPU weighted sketching.
* Distribute group weights to each data point.
* Relax the test.
* Validate input meta info.
* Fix metainfo copy ctor.
2020-04-20 17:18:52 +08:00
Jiaming Yuan
0012f2ef93
Upgrade clang-tidy on CI. ( #5469 )
* Correct all clang-tidy errors.
* Upgrade clang-tidy to 10 on CI.
Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>
2020-04-05 04:42:29 +08:00
Rory Mitchell
13b10a6370
Device dmatrix ( #5420 )
2020-03-28 14:42:21 +13:00
Rory Mitchell
b745b7acce
Fix memory usage of device sketching ( #5407 )
2020-03-14 13:43:24 +13:00
Rory Mitchell
a38e7bd19c
Sketching from adapters ( #5365 )
* Sketching from adapters
* Add weights test
2020-03-07 21:07:58 +13:00