xgboost

Author	SHA1	Message	Date
Jiaming Yuan	594371e35b	Fix CPP lint. (#8807 )	2023-02-15 20:16:35 +08:00
Jiaming Yuan	31d3ec07af	Extract device algorithms. (#8789 )	2023-02-13 20:53:53 +08:00
Jiaming Yuan	d11a0044cf	Generalize prediction cache. (#8783 ) * Extract most of the functionality into `DMatrixCache`. * Move API entry to independent file to reduce dependency on `predictor.h` file. * Add test. --------- Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2023-02-13 12:36:43 +08:00
Jiaming Yuan	8a16944664	Fix ranking with quantile dmatrix and group weight. (#8762 )	2023-02-10 20:32:35 +08:00
Jiaming Yuan	a2e433a089	Fix empty DMatrix with categorical features. (#8739 )	2023-02-07 00:40:11 +08:00
Rong Ou	66191e9926	Support cpu quantile sketch with column-wise data split (#8742 )	2023-02-05 14:26:24 +08:00
Jiaming Yuan	c1786849e3	Use array interface for CSC matrix. (#8672 ) * Use array interface for CSC matrix. Use array interface for CSC matrix and align the interface with CSR and dense. - Fix nthread issue in the R package DMatrix. - Unify the behavior of handling `missing` with other inputs. - Unify the behavior of handling `missing` around R, Python, Java, and Scala DMatrix. - Expose `num_non_missing` to the JVM interface. - Deprecate old CSR and CSC constructors.	2023-02-05 01:59:46 +08:00
Jiaming Yuan	3760cede0f	Consistent use of context to specify number of threads. (#8733 ) - Use context in all tests. - Use context in R. - Use context in C API DMatrix initialization. (0 threads is used as dft).	2023-01-30 15:25:31 +08:00
Jiaming Yuan	7a068af1a3	Workaround CUDA warning. (#8696 )	2023-01-19 09:16:08 +08:00
Jiaming Yuan	31b9cbab3d	Make sure input numpy array is aligned. (#8690 ) - use `np.require` to specify that the alignment is required. - scipy csr as well. - validate input pointer in `ArrayInterface`.	2023-01-18 08:12:13 +08:00
Jiaming Yuan	07cf3d3e53	Fix threads in DMatrix slice. (#8667 )	2023-01-14 07:16:57 +08:00
Jiaming Yuan	badeff1d74	Init estimation for regression. (#8272 )	2023-01-11 02:04:56 +08:00
Rong Ou	3ceeb8c61c	Add data split mode to DMatrix MetaInfo (#8568 )	2022-12-25 20:37:37 +08:00
Jiaming Yuan	c6a8754c62	Define CUDA Context. (#8604 ) We will transition to non-default and non-blocking CUDA stream.	2022-12-20 15:15:07 +08:00
Rong Ou	15a88ceef0	Fix deprecated CUB calls in CUDA 12.0 (#8578 )	2022-12-12 17:02:30 +08:00
Jiaming Yuan	3e26107a9c	Rename and extract `Context`. (#8528 ) * Rename `GenericParameter` to `Context`. * Rename header file to reflect the change. * Rename all references.	2022-12-07 04:58:54 +08:00
Jiaming Yuan	e3bf5565ab	Extract transform iterator. (#8498 )	2022-12-05 21:37:07 +08:00
Rong Ou	78d65a1928	Initial support for column-wise data split (#8468 )	2022-12-04 01:37:51 +08:00
Jiaming Yuan	157e98edf7	Support half type from cupy. (#8487 )	2022-11-30 17:56:42 +08:00
Jiaming Yuan	addaa63732	Support null value in CUDA array interface. (#8486 ) * Support null value in CUDA array interface. - Fix for potential null value in array interface. - Fix incorrect check on mask stride. * Simple tests. * Extract mask.	2022-11-28 17:48:25 -08:00
Jiaming Yuan	3fc1046fd3	Reduce compiler warnings on CPU-only build. (#8483 )	2022-11-29 00:04:16 +08:00
Jiaming Yuan	e07245f110	Take datatable as row major input. (#8472 ) * Take datatable as row major input. Try to avoid a transform with dense table.	2022-11-24 09:20:13 +08:00
Rong Ou	8e76f5f595	Use `DataSplitMode` to configure data loading (#8434 ) * Use `DataSplitMode` to configure data loading	2022-11-08 16:21:50 +08:00
Jiaming Yuan	3ef1703553	Allow using string view to find JSON value. (#8332 ) - Allow comparison between string and string view. - Fix compiler warnings.	2022-10-13 17:10:13 +08:00
Rong Ou	668b8a0ea4	[Breaking] Switch from rabit to the collective communicator (#8257 ) * Switch from rabit to the collective communicator * fix size_t specialization * really fix size_t * try again * add include * more include * fix lint errors * remove rabit includes * fix pylint error * return dict from communicator context * fix communicator shutdown * fix dask test * reset communicator mocklist * fix distributed tests * do not save device communicator * fix jvm gpu tests * add python test for federated communicator * Update gputreeshap submodule Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>	2022-10-05 14:39:01 -08:00
Jiaming Yuan	97c3a80a34	Add C document to sphinx, fix arrow. (#8300 ) - Group C API. - Add C API sphinx doc. - Consistent use of `OptionalArg` and the parameter name `config`. - Remove call to deprecated functions in demo. - Fix some formatting errors. - Add links to c examples in the document (only visible with doxygen pages) - Fix arrow.	2022-10-05 09:52:15 +08:00
Jiaming Yuan	55cf24cc32	Obtain CSR matrix from DMatrix. (#8269 )	2022-09-29 20:41:43 +08:00
Jiaming Yuan	4056974e37	Fix sparse threshold warning. (#8268 )	2022-09-26 22:22:11 +08:00
Jiaming Yuan	3fd331f8f2	Add checks to C pointer arguments. (#8254 )	2022-09-22 19:02:22 +08:00
Jiaming Yuan	fffb1fca52	Calculate `base_score` based on input labels for mae. (#8107 ) Fit an intercept as base score for abs loss.	2022-09-20 20:53:54 +08:00
Jiaming Yuan	bdf265076d	Make `QuantileDMatrix` default to sklearn estimators. (#8220 )	2022-09-13 13:52:19 +08:00
Jiaming Yuan	441ffc017a	Copy data from Ellpack to GHist. (#8215 )	2022-09-06 23:05:49 +08:00
Jiaming Yuan	16bca5d4a1	Support CPU input for device `QuantileDMatrix`. (#8136 ) - Copy `GHistIndexMatrix` to `Ellpack` when needed.	2022-08-11 21:21:26 +08:00
Jiaming Yuan	446d536c23	Fix loading DMatrix binary in distributed env. (#8149 ) - Try to load DMatrix binary before trying to parse text input. - Remove some unmaintained code.	2022-08-10 22:53:16 +08:00
Jiaming Yuan	2c70751d1e	Implement iterative DMatrix for CPU. (#8116 )	2022-07-26 22:34:21 +08:00
Jiaming Yuan	4a4e5c7c18	Prepare gradient index for Quantile DMatrix. (#8103 ) * Prepare gradient index for Quantile DMatrix. - Implement push batch with adapter batch. - Implement `GetFvalue` for prediction.	2022-07-22 17:26:33 +08:00
Jiaming Yuan	4083440690	Small cleanups to various data types. (#8086 ) - Use `bst_bin_t` in batch param constructor. - Use `StringView` to avoid `std::string` when appropriate. - Avoid using `MetaInfo` in quantile constructor to limit the scope of parameter.	2022-07-18 22:39:36 +08:00
Jiaming Yuan	8dd96013f1	Split up column matrix initialization. (#8060 ) * Split up column matrix initialization. This PR splits the column matrix initialization into 2 steps, the first one initializes the storage while the second one does the transpose. By doing so, we can reuse the code for Quantile DMatrix.	2022-07-14 10:34:47 +08:00
Jiaming Yuan	8746f9cddf	Rename `IterativeDMatrix`. (#8045 )	2022-07-04 18:52:31 +08:00
Jiaming Yuan	f0c1b842bf	Implement sketching with adapter. (#8019 )	2022-06-23 00:03:02 +08:00
Jiaming Yuan	142a208a90	Fix compiler warnings. (#8022 ) - Remove/fix unused parameters - Remove deprecated code in rabit. - Update dmlc-core.	2022-06-22 21:29:10 +08:00
Jiaming Yuan	1a33b50a0d	Fix compiler warnings. (#7974 ) - Remove unused parameters. There are still many warnings that are not yet addressed. Currently, the warnings in dmlc-core dominate the error log. - Remove `distributed` parameter from metric. - Fixes some warnings about signed comparison.	2022-06-06 22:56:25 +08:00
Jiaming Yuan	d48123d23b	Fix rmm build (#7973 ) - Optionally switch to c++17 - Use rmm CMake target. - Workaround compiler errors. - Fix GPUMetric inheritance. - Run death tests even if it's built with RMM support. Co-authored-by: jakirkham <jakirkham@gmail.com>	2022-06-06 20:18:32 +08:00
Jiaming Yuan	18a38f7ca0	Refactor for GHistIndex. (#7923 ) * Pass sparse page as adapter, which prepares for quantile dmatrix. * Remove old external memory code like `rbegin` and extra `Init` function. * Simplify type dispatch.	2022-05-23 23:04:53 +08:00
Jiaming Yuan	765097d514	Simplify inplace-predict. (#7910 ) Pass the `X` as part of Proxy DMatrix instead of an independent `dmlc::any`.	2022-05-18 17:52:00 +08:00
Jiaming Yuan	19775ffe15	Use adapter to initialize column matrix. (#7912 )	2022-05-18 16:15:12 +08:00
Jiaming Yuan	11d65fcb21	Extract partial sum into an independent function. (#7889 )	2022-05-13 14:30:35 +08:00
Jiaming Yuan	288c52596c	Define bin type. (#7850 )	2022-04-29 19:41:39 +08:00
Jiaming Yuan	fdf533f2b9	[POC] Experimental support for l1 error. (#7812 ) Support adaptive tree, a feature supported by both sklearn and lightgbm. The tree leaf is recomputed based on residue of labels and predictions after construction. For l1 error, the optimal value is the median (50 percentile). This is marked as experimental support for the following reasons: - The value is not well defined for distributed training, where we might have empty leaves for local workers. Right now I just use the original leaf value for computing the average with other workers, which might cause significant errors. - Some follow-ups are required, for exact, pruner, and optimization for quantile function. Also, we need to calculate the initial estimation.	2022-04-26 21:41:55 +08:00
Jiaming Yuan	522636cb52	Bump version. (#7769 )	2022-03-31 06:33:22 +08:00

331 Commits