xgboost

Author	SHA1	Message	Date
Philip Hyunsu Cho	afd03a6934	Fix build for AppleClang 11 (#9684 )	2023-10-18 09:35:59 -07:00
Jiaming Yuan	f05a23b41c	Use `weakref` instead of `id` for `DataIter` cache. (#9445 ) - Fix case where Python reuses id from freed objects. - Small optimization to column matrix with QDM by using `realloc` instead of copying data.	2023-08-10 00:40:06 +08:00
Jiaming Yuan	54029a59af	Bound the size of the histogram cache. (#9440 ) - A new histogram collection with a limit in size. - Unify histogram building logic between hist, multi-hist, and approx.	2023-08-08 03:21:26 +08:00
Rong Ou	bde1ebc209	Switch back to the GPUIDX macro (#9438 )	2023-08-04 15:14:31 +08:00
Rong Ou	c2b85ab68a	Clean up MGPU C++ tests (#9430 )	2023-08-02 14:31:18 +08:00
Jiaming Yuan	a196443a07	Implement sketching with Hessian on GPU. (#9399 ) - Prepare for implementing approx on GPU. - Unify the code path between weighted and uniform sketching on DMatrix.	2023-07-24 15:43:03 +08:00
Jiaming Yuan	04aff3af8e	Define the new `device` parameter. (#9362 )	2023-07-13 19:30:25 +08:00
Rong Ou	3632242e0b	Support column split with GPU quantile (#9370 )	2023-07-11 12:15:56 +08:00
Jiaming Yuan	d0916849a6	Remove unused weight from buffer for cat features. (#9341 )	2023-07-04 01:07:09 +08:00
Jiaming Yuan	39390cc2ee	[breaking] Remove the `predictor` param, allow fallback to prediction using `DMatrix`. (#9129 ) - A `DeviceOrd` struct is implemented to indicate the device. It will eventually replace the `gpu_id` parameter. - The `predictor` parameter is removed. - Fallback to `DMatrix` when `inplace_predict` is not available. - The heuristic for choosing a predictor is only used during training.	2023-07-03 19:23:54 +08:00
Jiaming Yuan	bc267dd729	Use ptr from `mmap` for `GHistIndexMatrix` and `ColumnMatrix`. (#9315 ) - Define a resource for holding various types of memory pointers. - Define ref vector for holding resources. - Swap the underlying resources for GHist and ColumnM. - Add documentation for current status. - s390x support is removed. It should work if you can compile XGBoost, all the old workaround code does is to get GCC to compile.	2023-06-27 19:05:46 +08:00
Jiaming Yuan	54da4b3185	Cleanup to prepare for using mmap pointer in external memory. (#9317 ) - Update SparseDMatrix comment. - Use a pointer in the bitfield. We will replace the `std::vector<bool>` in `ColumnMatrix` with bitfield. - Clean up the page source. The timer is removed as it's inaccurate once we swap the mmap pointer into the page.	2023-06-22 06:43:11 +08:00
Jiaming Yuan	ee6809e642	Use mmap for external memory. (#9282 ) - Have basic infrastructure for mmap. - Release file write handle.	2023-06-19 18:52:55 +08:00
Rong Ou	e70810be8a	Refactor device communicator to make allreduce more flexible (#9295 )	2023-06-14 03:53:03 +08:00
Jiaming Yuan	152e2fb072	Unify test helpers for creating ctx. (#9274 )	2023-06-10 03:35:22 +08:00
Jiaming Yuan	17fd3f55e9	Optimize adapter element counting on GPU. (#9209 ) - Implement a simple `IterSpan` for passing iterators with size. - Use shared memory for column size counts. - Use one thread for each sample in row count to reduce atomic operations.	2023-05-30 23:28:43 +08:00
Jiaming Yuan	08ce495b5d	Use Booster context in DMatrix. (#8896 ) - Pass context from booster to DMatrix. - Use context instead of integer for `n_threads`. - Check the consistency configuration for `max_bin`. - Test for all combinations of initialization options.	2023-04-28 21:47:14 +08:00
Jiaming Yuan	1f9a57d17b	[Breaking] Require format to be specified in input URI. (#9077 ) Previously, we use `libsvm` as default when format is not specified. However, the dmlc data parser is not particularly robust against errors, and the most common type of error is undefined format. Along with which, we will recommend users to use other data loader instead. We will continue the maintenance of the parsers as it's currently used for many internal tests including federated learning.	2023-04-28 19:45:15 +08:00
Rong Ou	a320b402a5	More refactoring to take advantage of collective aggregators (#9081 )	2023-04-26 03:36:09 +08:00
Jiaming Yuan	acc110c251	[MT-TREE] Support prediction cache and model slicing. (#8968 ) - Fix prediction range. - Support prediction cache in mt-hist. - Support model slicing. - Make the booster a Python iterable by defining `__iter__`. - Cleanup removed/deprecated parameters. - A new field in the output model `iteration_indptr` for pointing to the ranges of trees for each iteration.	2023-03-27 23:10:54 +08:00
Jiaming Yuan	5891f752c8	Rework the MAP metric. (#8931 ) - The new implementation is more strict as only binary labels are accepted. The previous implementation converts values greater than 1 to 1. - Deterministic GPU. (no atomic add). - Fix top-k handling. - Precise definition of MAP. (There are other variants on how to handle top-k). - Refactor GPU ranking tests.	2023-03-22 17:45:20 +08:00
Jiaming Yuan	a093770f36	Partitioner for multi-target tree. (#8922 )	2023-03-16 18:49:34 +08:00
Jiaming Yuan	26209a42a5	Define git attributes for renormalization. (#8921 )	2023-03-16 02:43:11 +08:00
Jiaming Yuan	8be6095ece	Implement NDCG cache. (#8893 )	2023-03-13 22:16:31 +08:00
Jiaming Yuan	46dfcc7d22	Define a new ranking parameter. (#8887 )	2023-03-09 17:46:24 +08:00
Jiaming Yuan	f236640427	Support F order for the tensor type. (#8872 ) - Add F order support for tensor and view. - Use parameter pack for automatic type cast. (avoid excessive static cast for shape).	2023-03-08 03:27:49 +08:00
Jiaming Yuan	4d665b3fb0	Restore clang tidy test. (#8861 )	2023-03-03 13:47:04 -08:00
Jiaming Yuan	cce4af4acf	Initial support for quantile loss. (#8750 ) - Add support for Python. - Add objective.	2023-02-16 02:30:18 +08:00
Jiaming Yuan	282b1729da	Specify the number of threads for parallel sort. (#8735 ) - Pass context object into argsort. - Replace macros with inline functions.	2023-02-16 00:20:19 +08:00
Jiaming Yuan	31d3ec07af	Extract device algorithms. (#8789 )	2023-02-13 20:53:53 +08:00
Jiaming Yuan	457f704e3d	Add quantile metric. (#8761 )	2023-02-13 19:07:40 +08:00
Rong Ou	ed91e775ec	Fix quantile tests running on multi-gpus (#8775 ) * Run some gtests with multiple GPUs * fix mgpu test naming * Instruct NCCL to print extra logs * Allocate extra space in /dev/shm to enable NCCL * use gtest_skip to skip mgpu tests --------- Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>	2023-02-12 17:00:26 -08:00
Jiaming Yuan	17b709acb9	Rename ranking utils to threading utils. (#8785 )	2023-02-12 05:41:18 +08:00
Jiaming Yuan	5f76edd296	Extract make metric name from ranking metric. (#8768 ) - Extract the metric parsing routine from ranking. - Add a test. - Accept null for string view.	2023-02-09 18:30:21 +08:00
Jiaming Yuan	48cefa012e	Support multiple alphas for segmented quantile. (#8758 )	2023-02-07 17:17:59 +08:00
Jiaming Yuan	28bb01aa22	Extract optional weight. (#8747 ) - Extract optional weight from common.h to reduce dependency on this header. - Add test.	2023-02-07 03:11:53 +08:00
Rong Ou	66191e9926	Support cpu quantile sketch with column-wise data split (#8742 )	2023-02-05 14:26:24 +08:00
Jiaming Yuan	c1786849e3	Use array interface for CSC matrix. (#8672 ) Use array interface for CSC matrix and align the interface with CSR and dense. - Fix nthread issue in the R package DMatrix. - Unify the behavior of handling `missing` with other inputs. - Unify the behavior of handling `missing` around R, Python, Java, and Scala DMatrix. - Expose `num_non_missing` to the JVM interface. - Deprecate old CSR and CSC constructors.	2023-02-05 01:59:46 +08:00
Jiaming Yuan	3760cede0f	Consistent use of context to specify number of threads. (#8733 ) - Use context in all tests. - Use context in R. - Use context in C API DMatrix initialization. (0 threads is used as dft).	2023-01-30 15:25:31 +08:00
Rong Ou	8af98e30fc	Use in-memory communicator to test quantile (#8710 )	2023-01-27 23:28:28 +08:00
Jiaming Yuan	4416452f94	Return single thread from context when called inside omp region. (#8693 )	2023-01-18 09:23:37 +08:00
Jiaming Yuan	43152657d4	Extract JSON type check. (#8677 ) - Reuse it in `GetMissing`. - Add test.	2023-01-17 03:11:07 +08:00
Jiaming Yuan	cfa994d57f	Multi-target support for L1 error. (#8652 ) - Add matrix support to the median function. - Iterate through each target for quantile computation.	2023-01-11 05:51:14 +08:00
Jiaming Yuan	8d545ab2a2	Implement fit stump. (#8607 )	2023-01-04 04:14:51 +08:00
Rong Ou	3ceeb8c61c	Add data split mode to DMatrix MetaInfo (#8568 )	2022-12-25 20:37:37 +08:00
Jiaming Yuan	38887a1876	Fix windows build on buildkite. (#8602 )	2022-12-16 21:12:24 +08:00
Jiaming Yuan	43a647a4dd	Fix inference with categorical feature. (#8591 )	2022-12-15 17:57:26 +08:00
Jiaming Yuan	3e26107a9c	Rename and extract `Context`. (#8528 ) * Rename `GenericParameter` to `Context`. * Rename header file to reflect the change. * Rename all references.	2022-12-07 04:58:54 +08:00
Jiaming Yuan	e3bf5565ab	Extract transform iterator. (#8498 )	2022-12-05 21:37:07 +08:00
Rong Ou	8e76f5f595	Use `DataSplitMode` to configure data loading (#8434 )	2022-11-08 16:21:50 +08:00

251 Commits