Jiaming Yuan
4d665b3fb0
Restore clang tidy test. ( #8861 )
2023-03-03 13:47:04 -08:00
Rory Mitchell
69a50248b7
Fix scope of feature set pointers ( #8850 )
...
---------
Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>
2023-03-02 12:37:14 +08:00
mzzhang95
6cef9a08e9
[pyspark] Update eval_metric validation to support list of strings ( #8826 )
2023-03-02 08:24:12 +08:00
Rong Ou
7cbaee9916
Support column split in approx tree method ( #8847 )
2023-03-02 03:59:07 +08:00
Philip Hyunsu Cho
6d8afb2218
[CI] Require C++17 + CMake 3.18; Use CUDA 11.8 in CI ( #8853 )
...
* Update to C++17
* Turn off unity build
* Update CMake to 3.18
* Use MSVC 2022 + CUDA 11.8
* Re-create stack for worker images
* Allocate more disk space for Windows
* Tempiorarily disable clang-tidy
* RAPIDS now requires Python 3.10+
* Unpin cuda-python
* Use latest NCCL
* Use Ubuntu 20.04 in RMM image
* Mark failing mgpu test as xfail
2023-03-01 09:22:24 -08:00
Jiaming Yuan
d54ef56f6f
Fix cache with gc ( #8851 )
...
- Make DMatrixCache thread-safe.
- Remove the use of thread-local memory.
2023-03-01 00:39:06 +08:00
Rong Ou
d9688f93c7
Support column-split in row partitioner ( #8828 )
2023-02-26 04:43:35 +08:00
Rong Ou
a65ad0bd9c
Support column split in histogram builder ( #8811 )
2023-02-17 22:37:01 +08:00
Jiaming Yuan
c0afdb6786
Fix CPU bin compression with categorical data. ( #8809 )
...
* Fix CPU bin compression with categorical data.
* The bug causes the maximum category to be lesser than 256 or the maximum number of bins when
the input data is dense.
2023-02-16 04:20:34 +08:00
Jiaming Yuan
cce4af4acf
Initial support for quantile loss. ( #8750 )
...
- Add support for Python.
- Add objective.
2023-02-16 02:30:18 +08:00
Jiaming Yuan
282b1729da
Specify the number of threads for parallel sort. ( #8735 )
...
* Specify the number of threads for parallel sort.
- Pass context object into argsort.
- Replace macros with inline functions.
2023-02-16 00:20:19 +08:00
Jiaming Yuan
81b2ee1153
Pass DMatrix into metric for caching. ( #8790 )
2023-02-13 22:15:05 +08:00
Jiaming Yuan
31d3ec07af
Extract device algorithms. ( #8789 )
2023-02-13 20:53:53 +08:00
Jiaming Yuan
457f704e3d
Add quantile metric. ( #8761 )
2023-02-13 19:07:40 +08:00
Jiaming Yuan
d11a0044cf
Generalize prediction cache. ( #8783 )
...
* Extract most of the functionality into `DMatrixCache`.
* Move API entry to independent file to reduce dependency on `predictor.h` file.
* Add test.
---------
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
2023-02-13 12:36:43 +08:00
Rong Ou
ed91e775ec
Fix quantile tests running on multi-gpus ( #8775 )
...
* Fix quantile tests running on multi-gpus
* Run some gtests with multiple GPUs
* fix mgpu test naming
* Instruct NCCL to print extra logs
* Allocate extra space in /dev/shm to enable NCCL
* use gtest_skip to skip mgpu tests
---------
Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>
2023-02-12 17:00:26 -08:00
Jiaming Yuan
225b3158f6
Support custom metric in sklearn ranker. ( #8786 )
2023-02-12 13:14:07 +08:00
Jiaming Yuan
17b709acb9
Rename ranking utils to threading utils. ( #8785 )
2023-02-12 05:41:18 +08:00
Jiaming Yuan
8a16944664
Fix ranking with quantile dmatrix and group weight. ( #8762 )
2023-02-10 20:32:35 +08:00
Jiaming Yuan
199c421d60
Send default configuration from metric to objective. ( #8760 )
2023-02-09 20:18:07 +08:00
Jiaming Yuan
5f76edd296
Extract make metric name from ranking metric. ( #8768 )
...
- Extract the metric parsing routine from ranking.
- Add a test.
- Accept null for string view.
2023-02-09 18:30:21 +08:00
Jiaming Yuan
4ead65a28c
Increase timeout limit for linear. ( #8767 )
2023-02-09 18:20:12 +08:00
Rong Ou
cbf98cb9c6
Add Allgather to collective communicator ( #8765 )
...
* Add Allgather to collective communicator
2023-02-09 11:31:22 +08:00
Jiaming Yuan
48cefa012e
Support multiple alphas for segmented quantile. ( #8758 )
2023-02-07 17:17:59 +08:00
Jiaming Yuan
7b3d473593
[doc] Add demo for inference using individual tree. ( #8752 )
2023-02-07 04:40:18 +08:00
Jiaming Yuan
28bb01aa22
Extract optional weight. ( #8747 )
...
- Extract optional weight from coommon.h to reduce dependency on this header.
- Add test.
2023-02-07 03:11:53 +08:00
Jiaming Yuan
0f37a01dd9
Require black formatter for the python package. ( #8748 )
2023-02-07 01:53:33 +08:00
Jiaming Yuan
a2e433a089
Fix empty DMatrix with categorical features. ( #8739 )
2023-02-07 00:40:11 +08:00
Rory Mitchell
7214a45e83
Fix different number of features in gpu_hist evaluator. ( #8754 )
2023-02-06 23:15:16 +08:00
Rong Ou
66191e9926
Support cpu quantile sketch with column-wise data split ( #8742 )
2023-02-05 14:26:24 +08:00
Jiaming Yuan
c1786849e3
Use array interface for CSC matrix. ( #8672 )
...
* Use array interface for CSC matrix.
Use array interface for CSC matrix and align the interface with CSR and dense.
- Fix nthread issue in the R package DMatrix.
- Unify the behavior of handling `missing` with other inputs.
- Unify the behavior of handling `missing` around R, Python, Java, and Scala DMatrix.
- Expose `num_non_missing` to the JVM interface.
- Deprecate old CSR and CSC constructors.
2023-02-05 01:59:46 +08:00
BenEfrati
213b5602d9
Add sample_weight to eval_metric ( #8706 )
2023-02-05 00:06:38 +08:00
Philip Hyunsu Cho
dd79ab846f
[CI] Fix failing arm build ( #8751 )
...
* Always install Conda env into /opt/python; use Mamba
* Change ownership of Conda env to buildkite-agent user
* Use unique name
* Fix
2023-02-03 22:32:48 -08:00
Jiaming Yuan
0e61ba57d6
Fix GPU L1 error. ( #8749 )
2023-02-04 03:02:00 +08:00
James Lamb
0d8248ddcd
[R] discourage use of regex for fixed string comparisons ( #8736 )
2023-01-30 18:47:21 +08:00
Jiaming Yuan
1325ba9251
Support primitive types of pyarrow-backed pandas dataframe. ( #8653 )
...
Categorical data (dictionary) is not supported at the moment.
2023-01-30 17:53:29 +08:00
Jiaming Yuan
3760cede0f
Consistent use of context to specify number of threads. ( #8733 )
...
- Use context in all tests.
- Use context in R.
- Use context in C API DMatrix initialization. (0 threads is used as dft).
2023-01-30 15:25:31 +08:00
Rong Ou
8af98e30fc
Use in-memory communicator to test quantile ( #8710 )
2023-01-27 23:28:28 +08:00
James Lamb
96e6b6beba
[ci] remove unused imports in tests ( #8707 )
2023-01-25 14:10:29 +08:00
Jiaming Yuan
9fb12b20a4
Cleanup the callback module. ( #8702 )
...
- Cleanup pylint markers.
- run formatter.
- Update examples of using callback.
2023-01-22 00:13:49 +08:00
Jiaming Yuan
34eee56256
Fix compiler warnings. ( #8703 )
...
Fix warnings about signed/unsigned comparisons.
2023-01-21 15:16:23 +08:00
Jiaming Yuan
e49e0998c0
Extract CPU sampling routines. ( #8697 )
2023-01-19 23:28:18 +08:00
Jiaming Yuan
4416452f94
Return single thread from context when called inside omp region. ( #8693 )
2023-01-18 09:23:37 +08:00
Jiaming Yuan
31b9cbab3d
Make sure input numpy array is aligned. ( #8690 )
...
- use `np.require` to specify that the alignment is required.
- scipy csr as well.
- validate input pointer in `ArrayInterface`.
2023-01-18 08:12:13 +08:00
Rong Ou
78396f8a6e
Initial support for column-split cpu predictor ( #8676 )
2023-01-18 06:33:13 +08:00
Jiaming Yuan
247946a875
Cache transformed in QuantileDMatrix for efficiency. ( #8666 )
2023-01-17 06:02:40 +08:00
Jiaming Yuan
43152657d4
Extract JSON type check. ( #8677 )
...
- Reuse it in `GetMissing`.
- Add test.
2023-01-17 03:11:07 +08:00
Jiaming Yuan
d6018eb4b9
Remove all use of DeviceQuantileDMatrix. ( #8665 )
2023-01-17 00:04:10 +08:00
Jiaming Yuan
07cf3d3e53
Fix threads in DMatrix slice. ( #8667 )
2023-01-14 07:16:57 +08:00
Jiaming Yuan
e27cda7626
[CI] Skip pyspark sparse tests. ( #8675 )
2023-01-14 05:37:00 +08:00