xgboost

Author	SHA1	Message	Date
Jiaming Yuan	142a208a90	Fix compiler warnings. (#8022 ) - Remove/fix unused parameters - Remove deprecated code in rabit. - Update dmlc-core.	2022-06-22 21:29:10 +08:00
Bobby Wang	e44a082620	[jvm-packages] update nccl version to 2.12.12-1 (#8015 )	2022-06-21 17:34:09 +08:00
Jiaming Yuan	4a87ea49b8	Reduce regularization for CPU gblinear. (#8013 )	2022-06-21 01:05:27 +08:00
Jiaming Yuan	d285d6ba2a	Reduce regularization in GPU gblinear test. (#8010 )	2022-06-20 23:55:12 +08:00
Jiaming Yuan	9b0eb66b78	Fix GPU driver test. (#8008 ) * Initialize the training parameter.	2022-06-20 19:37:31 +08:00
Jiaming Yuan	637e42a0c0	Use 22.04 for RMM. (#8001 ) 22.06 is not released yet.	2022-06-17 04:07:31 +08:00
Jiaming Yuan	8f8bd8147a	Fix LTR with weighted Quantile DMatrix. (#7975 ) * Fix LTR with weighted Quantile DMatrix. * Better tests.	2022-06-09 01:33:41 +08:00
Jiaming Yuan	1a33b50a0d	Fix compiler warnings. (#7974 ) - Remove unused parameters. There are still many warnings that are not yet addressed. Currently, the warnings in dmlc-core dominate the error log. - Remove `distributed` parameter from metric. - Fixes some warnings about signed comparison.	2022-06-06 22:56:25 +08:00
Jiaming Yuan	d48123d23b	Fix rmm build (#7973 ) - Optionally switch to c++17 - Use rmm CMake target. - Workaround compiler errors. - Fix GPUMetric inheritance. - Run death tests even if it's built with RMM support. Co-authored-by: jakirkham <jakirkham@gmail.com>	2022-06-06 20:18:32 +08:00
Jiaming Yuan	b90c6d25e8	Implement `max_cat_threshold` for CPU. (#7957 )	2022-06-04 11:02:46 +08:00
Jiaming Yuan	13b15e07e8	Handle formatted JSON input. (#7953 )	2022-06-01 16:20:58 +08:00
Rong Ou	80339c3427	Enable distributed GPU training over Rabit (#7930 )	2022-05-31 04:09:45 +08:00
Philip Hyunsu Cho	47224dd6d3	Use private mirror to host llvm-openmp tarballs (#7950 )	2022-05-27 14:56:59 -07:00
Jiaming Yuan	bde4f25794	Handle missing categorical value in CPU evaluator. (#7948 )	2022-05-27 14:15:47 +08:00
Philip Hyunsu Cho	2070afea02	[CI] Rotate package repository keys (#7943 )	2022-05-26 17:06:46 -07:00
Jiaming Yuan	18cbebaeb9	Unify the cat split storage for CPU. (#7937 ) * Unify the cat split storage for CPU. * Cleanup. * Workaround.	2022-05-26 04:14:40 -07:00
Jiaming Yuan	606be9e663	Handle missing values in one hot splits. (#7934 )	2022-05-24 20:48:41 +08:00
Jiaming Yuan	18a38f7ca0	Refactor for GHistIndex. (#7923 ) * Pass sparse page as adapter, which prepares for quantile dmatrix. * Remove old external memory code like `rbegin` and extra `Init` function. * Simplify type dispatch.	2022-05-23 23:04:53 +08:00
Jiaming Yuan	474366c020	Add convergence test for sparse datasets. (#7922 )	2022-05-23 18:07:26 +08:00
Jiaming Yuan	f93a727869	Address remaining mypy errors in python package. (#7914 )	2022-05-18 22:46:15 +08:00
Jiaming Yuan	765097d514	Simplify inplace-predict. (#7910 ) Pass the `X` as part of Proxy DMatrix instead of an independent `dmlc::any`.	2022-05-18 17:52:00 +08:00
Jiaming Yuan	19775ffe15	Use adapter to initialize column matrix. (#7912 )	2022-05-18 16:15:12 +08:00
Rory Mitchell	71d3b2e036	Fuse gpu_hist all-reduce calls where possible (#7867 )	2022-05-17 13:27:50 +02:00
Rong Ou	77d4a53c32	use RabitContext instead of init/finalize (#7911 )	2022-05-17 12:15:41 +08:00
Jiaming Yuan	4fcfd9c96e	Fix and cleanup for column matrix. (#7901 ) * Fix missed type dispatching for dense columns with missing values. * Code cleanup to reduce special cases. * Reduce memory usage.	2022-05-16 21:11:50 +08:00
Jiaming Yuan	1baad8650c	Small cleanup to Column. (#7898 ) * Define forward iterator to hide the internal state.	2022-05-15 12:39:10 +08:00
Jiaming Yuan	1b6538b4e5	[breaking] Drop single precision histogram (#7892 ) Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2022-05-13 19:54:55 +08:00
Jiaming Yuan	11d65fcb21	Extract partial sum into an independent function. (#7889 )	2022-05-13 14:30:35 +08:00
Jiaming Yuan	db80671d6b	Fix monotone constraint with tuple input. (#7891 )	2022-05-13 04:00:03 +08:00
Jiaming Yuan	94ca52b7b7	Fix overflow in prediction size. (#7885 )	2022-05-12 02:44:03 +08:00
Rory Mitchell	7ef54e39ec	Small refactor to categoricals (#7858 )	2022-05-05 17:47:02 +02:00
Rong Ou	14ef38b834	Initial support for federated learning (#7831 ) Federated learning plugin for xgboost: * A gRPC server to aggregate MPI-style requests (allgather, allreduce, broadcast) from federated workers. * A Rabit engine for the federated environment. * Integration test to simulate federated learning. Additional followups are needed to address GPU support, better security, and privacy, etc.	2022-05-05 21:49:22 +08:00
Jiaming Yuan	46e0bce212	Use maximum category in sketch. (#7853 )	2022-05-05 19:56:49 +08:00
Jiaming Yuan	317d7be6ee	Always use partition based categorical splits. (#7857 )	2022-05-03 22:30:32 +08:00
Rory Mitchell	90cce38236	Remove single_precision_histogram for gpu_hist (#7828 )	2022-05-03 14:53:19 +02:00
Jiaming Yuan	50d854e02e	[CI] Test with latest RAPIDS. (#7816 )	2022-04-30 11:55:10 -07:00
Bobby Wang	1b103e1f5f	[CI] make container be able to re-attached (#7848 ) When re-starting the container, it will fail in entrypoint.sh which will exit when adding an existing group or user	2022-04-29 19:00:35 -07:00
Jiaming Yuan	fdf533f2b9	[POC] Experimental support for l1 error. (#7812 ) Support adaptive tree, a feature supported by both sklearn and lightgbm. The tree leaf is recomputed based on residue of labels and predictions after construction. For l1 error, the optimal value is the median (50 percentile). This is marked as experimental support for the following reasons: - The value is not well defined for distributed training, where we might have empty leaves for local workers. Right now I just use the original leaf value for computing the average with other workers, which might cause significant errors. - Some follow-ups are required, for exact, pruner, and optimization for quantile function. Also, we need to calculate the initial estimation.	2022-04-26 21:41:55 +08:00
Jiaming Yuan	332380479b	Avoid warning in np primitive type tests. (#7833 )	2022-04-23 02:07:01 +08:00
Jiaming Yuan	c70fa502a5	Expose `feature_types` to sklearn interface. (#7821 )	2022-04-21 20:23:35 +08:00
Jiaming Yuan	52d4eda786	Deprecate `use_label_encoder` in XGBClassifier. (#7822 ) * Deprecate `use_label_encoder` in XGBClassifier. * We have removed the encoder, now prepare to remove the indicator.	2022-04-21 13:14:02 +08:00
Jiaming Yuan	fd78af404b	Drop support for deprecated CUDA architectures. (#7774 ) Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2022-03-31 21:42:23 +08:00
Philip Hyunsu Cho	e8eff3581b	[CI] Enable faulthandler to show details when 0xC0000005 error occurs (#7771 ) (#7775 )	2022-03-31 17:40:06 +08:00
Jiaming Yuan	9150fdbd4d	Support pandas nullable types. (#7760 )	2022-03-30 08:51:52 +08:00
Jiaming Yuan	a50b84244e	Cleanup configuration for constraints. (#7758 )	2022-03-29 04:22:46 +08:00
Jiaming Yuan	3c9b04460a	Move `num_parallel_tree` to model parameter. (#7751 ) The size of forest should be a property of model itself instead of a training hyper-parameter.	2022-03-29 02:32:42 +08:00
Jiaming Yuan	8b3ecfca25	Mitigate flaky tests. (#7749 ) * Skip non-increasing test with external memory when subsample is used. * Increase bin numbers for boost from prediction test. This mitigates the effect of non-deterministic partitioning.	2022-03-28 21:20:50 +08:00
Jiaming Yuan	64575591d8	Use context in `SetInfo`. (#7687 ) * Use the name `Context`. * Pass a context object into `SetInfo`. * Add context to proxy matrix. * Add context to iterative DMatrix. This is to remove the use of the default number of threads during `SetInfo` as a follow-up on removing the global omp variable while preparing for CUDA stream semantic. Currently, XGBoost uses the legacy CUDA stream, we will gradually remove them in the future in favor of non-blocking streams.	2022-03-24 22:16:26 +08:00
Jiaming Yuan	4d81c741e9	External memory support for hist (#7531 ) * Generate column matrix from gHistIndex. * Avoid synchronization with the sparse page once the cache is written. * Cleanups: Remove member variables/functions, change the update routine to look like approx and gpu_hist. * Remove pruner.	2022-03-22 00:13:20 +08:00
Jiaming Yuan	996cc705af	Small cleanup to hist tree method. (#7735 ) * Remove special optimization using number of bins. * Remove 1-based index for column sampling. * Remove data layout. * Unify update prediction cache.	2022-03-20 03:44:55 +08:00

1247 Commits