36 Commits

Author SHA1 Message Date
amdsc21
5e8b1842b9 fix Pointer Attr 2023-03-10 19:06:02 +01:00
amdsc21
ccce4cf7e1 finish data.cu 2023-03-10 05:00:57 +01:00
Jiaming Yuan
c6a8754c62
Define CUDA Context. (#8604)
We will transition to non-default and non-blocking CUDA stream.
2022-12-20 15:15:07 +08:00
Jiaming Yuan
d48123d23b
Fix rmm build (#7973)
- Optionally switch to c++17
- Use rmm CMake target.
- Workaround compiler errors.
- Fix GPUMetric inheritance.
- Run death tests even if it's built with RMM support.

Co-authored-by: jakirkham <jakirkham@gmail.com>
2022-06-06 20:18:32 +08:00
Jiaming Yuan
64575591d8
Use context in SetInfo. (#7687)
* Use the name `Context`.
* Pass a context object into `SetInfo`.
* Add context to proxy matrix.
* Add context to iterative DMatrix.

This is to remove the use of the default number of threads during `SetInfo` as a follow-up on
removing the global omp variable while preparing for CUDA stream semantic.  Currently, XGBoost
uses the legacy CUDA stream, we will gradually remove them in the future in favor of non-blocking streams.
2022-03-24 22:16:26 +08:00
Jiaming Yuan
98d6faefd6
Implement slope for Pseduo-Huber. (#7727)
* Add objective and metric.
* Some refactoring for CPU/GPU dispatching using linalg module.
2022-03-14 21:42:38 +08:00
Jiaming Yuan
58a6723eb1
Initial support for multioutput regression. (#7514)
* Add num target model parameter, which is configured from input labels.
* Change elementwise metric and indexing for weights.
* Add demo.
* Add tests.
2021-12-18 09:28:38 +08:00
Jiaming Yuan
5b1161bb64
Convert labels into tensor. (#7456)
* Add a new ctor to tensor for `initilizer_list`.
* Change labels from host device vector to tensor.
* Rename the field from `labels_` to `labels` since it's a public member.
2021-12-17 00:58:35 +08:00
Jiaming Yuan
d33854af1b
[Breaking] Accept multi-dim meta info. (#7405)
This PR changes base_margin into a 3-dim array, with one of them being reserved for multi-target classification. Also, a breaking change is made for binary serialization due to extra dimension along with a fix for saving the feature weights. Lastly, it unifies the prediction initialization between CPU and GPU. After this PR, the meta info setter in Python will be based on array interface.
2021-11-18 23:02:54 +08:00
Jiaming Yuan
55ee272ea8
Extend array interface to handle ndarray. (#7434)
* Extend array interface to handle ndarray.

The `ArrayInterface` class is extended to support multi-dim array inputs. Previously this
class handles only 2-dim (vector is also matrix).  This PR specifies the expected
dimension at compile-time and the array interface can perform various checks automatically
for input data. Also, adapters like CSR are more rigorous about their input.  Lastly, row
vector and column vector are handled without intervention from the caller.
2021-11-16 09:52:15 +08:00
Jiaming Yuan
d4274bc556
Fix typo. (#7433) 2021-11-15 01:28:11 +08:00
Jiaming Yuan
a13321148a
Support multi-class with base margin. (#7381)
This is already partially supported but never properly tested. So the only possible way to use it is calling `numpy.ndarray.flatten` with `base_margin` before passing it into XGBoost. This PR adds proper support
for most of the data types along with tests.
2021-11-02 13:38:00 +08:00
Jiaming Yuan
d1f00fb0b7
Stricter validation for group. (#7345) 2021-10-21 12:13:33 +08:00
Jiaming Yuan
8a84be37b8
Pass scikit learn estimator checks for regressor. (#7130)
* Check data shape.
* Check labels.
2021-08-03 18:58:20 +08:00
Jiaming Yuan
e6088366df
Export Python Interface for external memory. (#7070)
* Add Python iterator interface.
* Add tests.
* Add demo.
* Add documents.
* Handle empty dataset.
2021-07-22 15:15:53 +08:00
Jiaming Yuan
bd1f3a38f0
Rewrite sparse dmatrix using callbacks. (#7092)
- Reduce dependency on dmlc parsers and provide an interface for users to load data by themselves.
- Remove use of threaded iterator and IO queue.
- Remove `page_size`.
- Make sure the number of pages in memory is bounded.
- Make sure the cache can not be violated.
- Provide an interface for internal algorithms to process data asynchronously.
2021-07-16 12:33:31 +08:00
Jiaming Yuan
1c8fdf2218
Remove use of device_idx in dh::LaunchN. (#7063)
It's an unused parameter, removing it can make the CI log more readable.
2021-06-29 11:37:26 +08:00
Jiaming Yuan
4ee8340e79
Support column major array. (#6765) 2021-03-20 05:19:46 +08:00
Jiaming Yuan
411592a347
Enhance inplace prediction. (#6653)
* Accept array interface for csr and array.
* Accept an optional proxy dmatrix for metainfo.

This constructs an explicit `_ProxyDMatrix` type in Python.

* Remove unused doc.
* Add strict output.
2021-02-02 11:41:46 +08:00
Jiaming Yuan
80065d571e
[dask] Add DaskXGBRanker (#6576)
* Initial support for distributed LTR using dask.

* Support `qid` in libxgboost.
* Refactor `predict` and `n_features_in_`, `best_[score/iteration/ntree_limit]`
  to avoid duplicated code.
* Define `DaskXGBRanker`.

The dask ranker doesn't support group structure, instead it uses query id and
convert to group ptr internally.
2021-01-08 18:35:09 +08:00
Jiaming Yuan
b5f52f0b1b
Validate weights are positive values. (#6115) 2020-09-15 09:03:55 +08:00
Jiaming Yuan
4d99c58a5f
Feature weights (#5962) 2020-08-18 19:55:41 +08:00
Jiaming Yuan
7c2686146e
Dask device dmatrix (#5901)
* Fix softprob with empty dmatrix.
2020-07-17 13:17:43 +08:00
Jiaming Yuan
47c89775d6
Accept string for ArrayInterface constructor. (#5799) 2020-06-27 00:06:54 +08:00
fis
7c3a168ffd Revert "Accept string for ArrayInterface constructor."
This reverts commit e8ecafb8dc628f45b75b4c2844a236d27e0a6d98.
2020-06-16 20:02:35 +08:00
fis
e8ecafb8dc Accept string for ArrayInterface constructor. 2020-06-16 20:00:24 +08:00
Rory Mitchell
b47b5ac771
Use hypothesis (#5759)
* Use hypothesis

* Allow int64 array interface for groups

* Add packages to Windows CI

* Add to travis

* Make sure device index is set correctly

* Fix dask-cudf test

* appveyor
2020-06-16 12:45:59 +12:00
Jiaming Yuan
306e38ff31
Avoid including c_api.h in header files. (#5782) 2020-06-12 16:24:24 +08:00
Rory Mitchell
13b10a6370
Device dmatrix (#5420) 2020-03-28 14:42:21 +13:00
Rory Mitchell
9c56480c61
Support dmatrix construction from cupy array (#5206) 2020-01-22 13:15:27 +13:00
Rory Mitchell
87ebfc1315
Implement cudf construction with adapters. (#5189) 2020-01-09 20:23:06 +13:00
Rory Mitchell
3d04a8cc97
Use dynamic types for array interface columns instead of templates (#5108) 2019-12-21 16:08:10 +13:00
Jiaming Yuan
d30e63a0a5
Support feature names/types for cudf. (#4902)
* Implement most of the pandas procedure for cudf except for type conversion.
* Requires an array of interfaces in metainfo.
2019-09-29 15:07:51 -04:00
Jiaming Yuan
5374f52531
Complete cudf support. (#4850)
* Handles missing value.
* Accept all floating point and integer types.
* Move to cudf 9.0 API.
* Remove requirement on `null_count`.
* Arbitrary column types support.
2019-09-16 23:52:00 -04:00
Rong Ou
38ab79f889 Make HostDeviceVector single gpu only (#4773)
* Make HostDeviceVector single gpu only
2019-08-26 09:51:13 +12:00
Jiaming Yuan
9700776597 Cudf support. (#4745)
* Initial support for cudf integration.

* Add two C APIs for consuming data and metainfo.

* Add CopyFrom for SimpleCSRSource as a generic function to consume the data.

* Add FromDeviceColumnar for consuming device data.

* Add new MetaInfo::SetInfo for consuming label, weight etc.
2019-08-19 16:51:40 +12:00