xgboost

Author	SHA1	Message	Date
Jiaming Yuan	3e26107a9c	Rename and extract `Context`. (#8528 ) * Rename `GenericParameter` to `Context`. * Rename header file to reflect the change. * Rename all references.	2022-12-07 04:58:54 +08:00
Jiaming Yuan	e3bf5565ab	Extract transform iterator. (#8498 )	2022-12-05 21:37:07 +08:00
Robert Maynard	16f96b6cfb	Work with newer thrust and libcudacxx (#8454 ) * Thrust 1.17 removes the experimental/pinned_allocator. When xgboost is brought into a large project it can be compiled against Thrust 1.17+ which don't offer this experimental allocator. To ensure that going forward xgboost works in all environments we provide a xgboost namespaced version of the pinned_allocator that previously was in Thrust.	2022-11-11 04:22:53 +08:00
Dmitry Razdoburdin	5bd849f1b5	Unify the partitioner for hist and approx. Co-authored-by: dmitry.razdoburdin <drazdobu@jfldaal005.jf.intel.com> Co-authored-by: jiamingy <jm.yuan@outlook.com>	2022-10-20 02:49:20 +08:00
Jiaming Yuan	031d66ec27	Configuration for init estimation. (#8343 ) * Configuration for init estimation. * Check whether the model needs configuration based on const attribute `ModelFitted` instead of a mutable state. * Add parameter `boost_from_average` to tell whether the user has specified base score. * Add tests.	2022-10-18 01:52:24 +08:00
Jiaming Yuan	3ef1703553	Allow using string view to find JSON value. (#8332 ) - Allow comparison between string and string view. - Fix compiler warnings.	2022-10-13 17:10:13 +08:00
Philip Hyunsu Cho	bc7a6ec603	Fix clang tidy (#8314 ) * Fix clang-tidy * Exempt clang-tidy from budget check * Move clang-tidy	2022-10-06 05:16:06 -08:00
Dmitry Razdoburdin	c24e9d712c	Dispatcher for template parameters of BuildHist Kernels (#8259 ) * Introducing Column Wise Hist Building * linting * more linting * bug fixing * Removing column sampling optimization for a while to simplify the review process. * linting * Removing unnecessary changes * Use DispatchBinType in hist_util.cc * Adding force_read_by column flag to buildhist. Adding tests for column wise buildhist. * Introducing new dispatcher for compile time flags in hist building * fixing bug with using of DispatchBinType * Fixing building * Merging with master branch Co-authored-by: dmitry.razdoburdin <drazdobu@jfldaal005.jf.intel.com> Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2022-10-06 03:02:29 -08:00
Rong Ou	668b8a0ea4	[Breaking] Switch from rabit to the collective communicator (#8257 ) * Switch from rabit to the collective communicator * fix size_t specialization * really fix size_t * try again * add include * more include * fix lint errors * remove rabit includes * fix pylint error * return dict from communicator context * fix communicator shutdown * fix dask test * reset communicator mocklist * fix distributed tests * do not save device communicator * fix jvm gpu tests * add python test for federated communicator * Update gputreeshap submodule Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>	2022-10-05 14:39:01 -08:00
Rory Mitchell	d686bf52a6	Reduce time for some multi-gpu tests (#8288 ) * Faster dask tests * Reuse AllReducer objects in tests. * Faster boost from prediction tests. * Use rmm dask fixture. * Speed up dask demo. * mypy * Format with black. * mypy * Clang-tidy Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>	2022-10-04 02:49:33 -08:00
Philip Hyunsu Cho	ca0547bb65	[CI] Use RAPIDS 22.10 (#8298 ) * [CI] Use RAPIDS 22.10 * Store CUDA and RAPIDS versions in one place * Fix * Add missing #include * Update gputreeshap submodule * Fix * Remove outdated distributed tests	2022-10-03 23:18:07 -08:00
Jiaming Yuan	55cf24cc32	Obtain CSR matrix from DMatrix. (#8269 )	2022-09-29 20:41:43 +08:00
Jiaming Yuan	6d1452074a	Remove MGPU cpp tests. (#8276 ) Co-authored-by: Hyunsu Philip Cho <chohyu01@cs.washington.edu>	2022-09-27 21:18:23 +08:00
Rory Mitchell	8f77677193	Use quantised gradients in gpu_hist histograms (#8246 )	2022-09-26 17:35:35 +02:00
Dmitry Razdoburdin	eb7bbee2c9	Optional by-column histogram build. (#8233 ) Co-authored-by: dmitry.razdoburdin <drazdobu@jfldaal005.jf.intel.com>	2022-09-22 05:16:13 +08:00
Jiaming Yuan	fffb1fca52	Calculate `base_score` based on input labels for mae. (#8107 ) Fit an intercept as base score for abs loss.	2022-09-20 20:53:54 +08:00
Rong Ou	a2686543a9	Common interface for collective communication (#8057 ) * implement broadcast for federated communicator * implement allreduce * add communicator factory * add device adapter * add device communicator to factory * add rabit communicator * add rabit communicator to the factory * add nccl device communicator * add synchronize to device communicator * add back print and getprocessorname * add python wrapper and c api * clean up types * fix non-gpu build * try to fix ci * fix std::size_t * portable string compare ignore case * c style size_t * fix lint errors * cross platform setenv * fix memory leak * fix lint errors * address review feedback * add python test for rabit communicator * fix failing gtest * use json to configure communicators * fix lint error * get rid of factories * fix cpu build * fix include * fix python import * don't export collective.py yet * skip collective communicator pytest on windows * add review feedback * update documentation * remove mpi communicator type * fix tests * shutdown the communicator separately Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu>	2022-09-12 15:21:12 -07:00
Jiaming Yuan	441ffc017a	Copy data from Ellpack to GHist. (#8215 )	2022-09-06 23:05:49 +08:00
Dmitry Razdoburdin	deae99e662	Optimization/buildhist/hist util (#8218 ) * BuildHistKernel optimization Co-authored-by: dmitry.razdoburdin <drazdobu@jfldaal005.jf.intel.com>	2022-09-02 19:39:45 +08:00
Jiaming Yuan	16bca5d4a1	Support CPU input for device `QuantileDMatrix`. (#8136 ) - Copy `GHistIndexMatrix` to `Ellpack` when needed.	2022-08-11 21:21:26 +08:00
Jiaming Yuan	bcc8679a05	Update CUDA docker image and NCCL. (#8139 )	2022-08-07 16:32:41 +08:00
Jiaming Yuan	7785d65c8a	Fix feature weights with multiple column sampling. (#8100 )	2022-07-22 20:23:05 +08:00
Rory Mitchell	1be09848a7	Refactor split valuation kernel (#8073 )	2022-07-21 15:41:50 +02:00
Tim Gates	cb40bbdadd	docs: fix simple typo, cannonical -> canonical (#8099 ) There is a small typo in src/common/partition_builder.h. Should read `canonical` rather than `cannonical`. Signed-off-by: Tim Gates <tim.gates@iress.com>	2022-07-20 21:04:50 +08:00
QuellaZhang	703261e78f	[MSVC][std:c++latest] Fix compiler error (#8093 ) Co-authored-by: QuellaZhang <zhangyi2090@163.com>	2022-07-20 15:15:39 +08:00
Jiaming Yuan	4083440690	Small cleanups to various data types. (#8086 ) - Use `bst_bin_t` in batch param constructor. - Use `StringView` to avoid `std::string` when appropriate. - Avoid using `MetaInfo` in quantile constructor to limit the scope of parameter.	2022-07-18 22:39:36 +08:00
Jiaming Yuan	8dd96013f1	Split up column matrix initialization. (#8060 ) * Split up column matrix initialization. This PR splits the column matrix initialization into 2 steps, the first one initializes the storage while the second one does the transpose. By doing so, we can reuse the code for Quantile DMatrix.	2022-07-14 10:34:47 +08:00
Rory Mitchell	bc4f802b17	Batch UpdatePosition using cudaMemcpy (#7964 )	2022-06-30 17:52:40 +02:00
Jiaming Yuan	f0c1b842bf	Implement sketching with adapter. (#8019 )	2022-06-23 00:03:02 +08:00
Jiaming Yuan	142a208a90	Fix compiler warnings. (#8022 ) - Remove/fix unused parameters - Remove deprecated code in rabit. - Update dmlc-core.	2022-06-22 21:29:10 +08:00
Jiaming Yuan	8f8bd8147a	Fix LTR with weighted Quantile DMatrix. (#7975 ) * Fix LTR with weighted Quantile DMatrix. * Better tests.	2022-06-09 01:33:41 +08:00
Jiaming Yuan	1a33b50a0d	Fix compiler warnings. (#7974 ) - Remove unused parameters. There are still many warnings that are not yet addressed. Currently, the warnings in dmlc-core dominate the error log. - Remove `distributed` parameter from metric. - Fixes some warnings about signed comparison.	2022-06-06 22:56:25 +08:00
Jiaming Yuan	b90c6d25e8	Implement `max_cat_threshold` for CPU. (#7957 )	2022-06-04 11:02:46 +08:00
Rong Ou	80339c3427	Enable distributed GPU training over Rabit (#7930 )	2022-05-31 04:09:45 +08:00
Gyeongjae Choi	cc6d57aa0d	Add minimal emscripten build support (#7954 )	2022-05-30 14:11:40 +08:00
Jiaming Yuan	18a38f7ca0	Refactor for GHistIndex. (#7923 ) * Pass sparse page as adapter, which prepares for quantile dmatrix. * Remove old external memory code like `rbegin` and extra `Init` function. * Simplify type dispatch.	2022-05-23 23:04:53 +08:00
Jiaming Yuan	19775ffe15	Use adapter to initialize column matrix. (#7912 )	2022-05-18 16:15:12 +08:00
Jiaming Yuan	4fcfd9c96e	Fix and cleanup for column matrix. (#7901 ) * Fix missed type dispatching for dense columns with missing values. * Code cleanup to reduce special cases. * Reduce memory usage.	2022-05-16 21:11:50 +08:00
Jiaming Yuan	1baad8650c	Small cleanup to Column. (#7898 ) * Define forward iterator to hide the internal state.	2022-05-15 12:39:10 +08:00
Jiaming Yuan	1b6538b4e5	[breaking] Drop single precision histogram (#7892 ) Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2022-05-13 19:54:55 +08:00
Jiaming Yuan	11d65fcb21	Extract partial sum into an independent function. (#7889 )	2022-05-13 14:30:35 +08:00
Jiaming Yuan	46e0bce212	Use maximum category in sketch. (#7853 )	2022-05-05 19:56:49 +08:00
Jiaming Yuan	317d7be6ee	Always use partition based categorical splits. (#7857 )	2022-05-03 22:30:32 +08:00
Jiaming Yuan	288c52596c	Define bin type. (#7850 )	2022-04-29 19:41:39 +08:00
Jiaming Yuan	fdf533f2b9	[POC] Experimental support for l1 error. (#7812 ) Support adaptive tree, a feature supported by both sklearn and lightgbm. The tree leaf is recomputed based on residue of labels and predictions after construction. For l1 error, the optimal value is the median (50 percentile). This is marked as experimental support for the following reasons: - The value is not well defined for distributed training, where we might have empty leaves for local workers. Right now I just use the original leaf value for computing the average with other workers, which might cause significant errors. - Some follow-ups are required, for exact, pruner, and optimization for quantile function. Also, we need to calculate the initial estimation.	2022-04-26 21:41:55 +08:00
Jiaming Yuan	6fa1afdffc	Avoid compiler warning about comparison. (#7768 )	2022-03-31 08:52:14 +08:00
Jiaming Yuan	d4796482b5	Fix failures on R hub and Win builder. (#7763 ) * Update date. * Workaround amalgamation build with clang. (SimpleDMatrix instantiation) * Workaround compiler error with driver push. * Revert autoconf requirement. * Fix model IO on 32-bit environment. (i386) * Clarify the function name.	2022-03-30 07:14:33 +08:00
Jiaming Yuan	3c9b04460a	Move `num_parallel_tree` to model parameter. (#7751 ) The size of forest should be a property of model itself instead of a training hyper-parameter.	2022-03-29 02:32:42 +08:00
Jiaming Yuan	4d81c741e9	External memory support for hist (#7531 ) * Generate column matrix from gHistIndex. * Avoid synchronization with the sparse page once the cache is written. * Cleanups: Remove member variables/functions, change the update routine to look like approx and gpu_hist. * Remove pruner.	2022-03-22 00:13:20 +08:00
Jiaming Yuan	996cc705af	Small cleanup to hist tree method. (#7735 ) * Remove special optimization using number of bins. * Remove 1-based index for column sampling. * Remove data layout. * Unify update prediction cache.	2022-03-20 03:44:55 +08:00

413 Commits