xgboost

Author	SHA1	Message	Date
Bobby Wang	03cc3b359c	[pyspark] support a list of feature column names (#8117 )	2022-08-08 17:05:27 +08:00
Jiaming Yuan	bcc8679a05	Update CUDA docker image and NCCL. (#8139 )	2022-08-07 16:32:41 +08:00
Jiaming Yuan	2cba1d9fcc	Fix compatibility with latest cupy. (#8129 ) * Fix compatibility with latest cupy. * Freeze mypy.	2022-08-01 15:24:42 +08:00
Jiaming Yuan	546de5efd2	[pyspark] Cleanup data processing. (#8088 ) - Use numpy stack for handling list of arrays. - Reuse concat function from dask. - Prepare for `QuantileDMatrix`. - Remove unused code. - Use iterator for prediction to avoid initializing xgboost model	2022-07-26 15:00:52 +08:00
Jiaming Yuan	3970e4e6bb	Move pylint helper from dmlc-core. (#8101 ) * Move pylint helper from dmlc-core. - Move the helper into the XGBoost ci_build. - Run it with multiprocessing. * Fix original test.	2022-07-23 08:12:37 +08:00
Jiaming Yuan	8bdea72688	[Python] Require black and isort for new Python files. (#8096 ) * [Python] Require black and isort for new Python files. - Require black and isort for spark and dask module. These files are relatively new and are more conform to the black formatter. We will convert the rest of the library as we move forward. Other libraries including dask/distributed and optuna use the same formatting style and have a more strict standard. The black formatter is indeed quite nice, automating it can help us unify the code style. - Gather Python checks into a single script.	2022-07-20 10:25:24 +08:00
Jiaming Yuan	dae7a41baa	Update Python requirement to >=3.8. (#8071 ) Additional changes: - Use mamba for CPU test on Jenkins. - Cleanup CPU test dependencies. - Restore some of the modin tests	2022-07-14 18:01:47 +08:00
WeichenXu	176fec8789	PySpark XGBoost integration (#8020 ) Co-authored-by: Hyunsu Cho <chohyu01@cs.washington.edu> Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com>	2022-07-13 13:11:18 +08:00
Bobby Wang	e44a082620	[jvm-packages] update nccl version to 2.12.12-1 (#8015 )	2022-06-21 17:34:09 +08:00
Jiaming Yuan	637e42a0c0	Use 22.04 for RMM. (#8001 ) 22.06 is not released yet.	2022-06-17 04:07:31 +08:00
Jiaming Yuan	d48123d23b	Fix rmm build (#7973 ) - Optionally switch to c++17 - Use rmm CMake target. - Workaround compiler errors. - Fix GPUMetric inheritance. - Run death tests even if it's built with RMM support. Co-authored-by: jakirkham <jakirkham@gmail.com>	2022-06-06 20:18:32 +08:00
Philip Hyunsu Cho	47224dd6d3	Use private mirror to host llvm-openmp tarballs (#7950 )	2022-05-27 14:56:59 -07:00
Philip Hyunsu Cho	2070afea02	[CI] Rotate package repository keys (#7943 )	2022-05-26 17:06:46 -07:00
Jiaming Yuan	50d854e02e	[CI] Test with latest RAPIDS. (#7816 )	2022-04-30 11:55:10 -07:00
Bobby Wang	1b103e1f5f	[CI] make container be able to re-attached (#7848 ) When re-starting the container, it will fail in entrypoint.sh which will exit when adding an existing group or user	2022-04-29 19:00:35 -07:00
Jiaming Yuan	fd78af404b	Drop support for deprecated CUDA architectures. (#7774 ) Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2022-03-31 21:42:23 +08:00
Philip Hyunsu Cho	e8eff3581b	[CI] Enable faulthandler to show details when 0xC0000005 error occurs (#7771 ) (#7775 )	2022-03-31 17:40:06 +08:00
Xiaochang Wu	613ec36c5a	Support building SimpleDMatrix from Arrow data format (#7512 ) * Integrate with Arrow C data API. * Support Arrow dataset. * Support Arrow table. Co-authored-by: Xiaochang Wu <xiaochang.wu@intel.com> Co-authored-by: Jiaming Yuan <jm.yuan@outlook.com> Co-authored-by: Zhang Zhang <zhang.zhang@intel.com>	2022-03-15 13:25:19 +08:00
Philip Hyunsu Cho	1b25dd59f9	Use CUDA 11 in clang-tidy (#7701 ) * Show command args when clang-tidy fails * Add option to specify CUDA args * Use clang-tidy 11 * [CI] Use CUDA 11	2022-02-24 15:15:07 -08:00
Philip Hyunsu Cho	0149f81a5a	[CI] Fix S3 upload (#7662 )	2022-02-16 01:35:27 -08:00
Philip Hyunsu Cho	34a238ca98	[CI] Clean up Python wheel build pipeline (#7626 ) * [CI] Always upload artifacts to [branch_name]/ * [CI] Move detailed setup inside build_python_wheels.sh * Fix typo	2022-02-03 00:55:44 -08:00
Philip Hyunsu Cho	f6e6d0b2c0	[CI] Build Python wheels for MacOS (x86_64 and arm64) (#7621 ) * Build Python wheels for OSX (x86_64 and arm64) * Use Conda's libomp when running Python tests * fix * Add comment to explain CIBW_TARGET_OSX_ARM64 * Update release script * Add comments in build_python_wheels.sh * Document wheel pipeline	2022-02-02 17:35:48 -08:00
Jiaming Yuan	9f20a3315e	Test with latest numpy. (#7573 )	2022-01-19 00:46:23 +08:00
Jiaming Yuan	a1bcd33a3b	[breaking] Change internal model serialization to UBJSON. (#7556 ) * Use typed array for models. * Change the memory snapshot format. * Add new C API for saving to raw format.	2022-01-16 02:11:53 +08:00
Bobby Wang	e8c1eb99e4	[jvm-package] Clean up the legacy gpu support tests (#7523 )	2021-12-21 09:15:51 +08:00
Jiaming Yuan	820e1c01ef	Fix macos package upload. (#7475 ) * Split up the tests.	2021-11-24 03:43:49 +08:00
Philip Hyunsu Cho	2adf222fb2	[CI] CI cost saving (#7407 ) * [CI] Drop CUDA 10.1; Require 11.0 * Change NCCL version * Use CUDA 10.1 for clang-tidy, for now * Remove JDK 11 and 12 * Fix NCCL version * Don't require 11.0 just yet, until clang-tidy is fixed * Skip MultiClassesSerializationTest.GpuHist	2021-11-17 21:02:20 -08:00
Jiaming Yuan	3b0b74fa94	[doc] Use RTD theme. (#7346 )	2021-10-19 23:49:19 -07:00
Jiaming Yuan	ca17f8a5fc	Dispatch thrust versions and upgrade rmm. (#7254 ) Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2021-09-25 03:43:23 +08:00
Bobby Wang	0ee11dac77	[jvm-packages][xgboost4j-gpu] Support GPU dataframe and `DeviceQuantileDMatrix` (#7195 ) Following classes are added to support dataframe in java binding: - `Column` is an abstract type for a single column in tabular data. - `ColumnBatch` is an abstract type for dataframe. - `CuDFColumn` is an implementation of `Column` that consumes cuDF column - `CudfColumnBatch` is an implementation of `ColumnBatch` that consumes cuDF dataframe. - `DeviceQuantileDMatrix` is the interface for quantized data. The Java implementation mimics the Python interface and uses `__cuda_array_interface__` protocol for memory indexing. One difference is on JVM package, the data batch is staged on the host as java iterators cannot be reset. Co-authored-by: jiamingy <jm.yuan@outlook.com>	2021-09-24 14:25:00 +08:00
Philip Hyunsu Cho	3060f0b562	[CI] Automatically build GPU-enabled R package for Windows (#7185 ) * [CI] Automatically build GPU-enabled R package for Windows * Update Jenkinsfile-win64 * Build R package for the release branch only * Update install doc	2021-08-25 02:11:01 -07:00
Philip Hyunsu Cho	d04312b9c0	[CI] Fix hanging Python setup in Windows CI (#7186 )	2021-08-24 22:03:51 -07:00
Philip Hyunsu Cho	f1a4a1ac95	[CI] Upgrade build image to CentOS 7 + GCC 8; require CUDA 10.1 and later (#7141 )	2021-07-29 10:54:33 -07:00
Jiaming Yuan	345796825f	Optional find dependency in installed cmake config. (#7099 ) * Find dependency only when xgboost is built as static library. * Resolve msvc warning. * Add test for linking shared library.	2021-07-11 17:20:55 +08:00
Jiaming Yuan	f937f514aa	Remove lz4 compression with external memory. (#7076 )	2021-07-06 14:46:43 +08:00
Philip Hyunsu Cho	b2d300e727	[CI] Upgrade to CMake 3.14 (#7060 ) * [CI] Upgrade to CMake 3.14 * Add FATAL_ERROR directive, for users with CMake 2.x	2021-06-24 18:07:24 -07:00
Jiaming Yuan	86715e4cd4	Support categorical data for dask functional interface and DQM. (#7043 ) * Support categorical data for dask functional interface and DQM. * Implement categorical data support for GPU GK-merge. * Add support for dask functional interface. * Add support for DQM. * Get newer cupy.	2021-06-18 13:06:52 +08:00
Jiaming Yuan	dcd84b3979	[CI] Configure RAPIDS, dask, modin (#7033 ) Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>	2021-06-18 10:27:51 +08:00
Jiaming Yuan	7beb2f7fae	Hide symbols in CI build + hide symbols for C and CUDA (#6798 ) * Hide symbols in CI build. * Hide symbols for other languages.	2021-06-04 02:35:46 +08:00
Philip Hyunsu Cho	05db6a6c29	[CI] Upgrade cuDF and RMM to 21.06 nightly (#7012 ) * [CI] Upgrade cuDF and RMM to 21.06 nightly * Trim outdated test cases * Pin Dask version to 2021.05.0 for now	2021-06-02 11:59:30 -07:00
Jiaming Yuan	ee4f51a631	Support for all primitive types from array. (#7003 ) * Change C API name. * Test for all primitive types from array. * Add native support for CPU 128 float. * Convert boolean and float16 in Python. * Fix dask version for now.	2021-06-01 08:34:48 +08:00
Jiaming Yuan	29d6a5e2b8	[CI] Move appveyor tests to action (#6986 ) * Drop support for VS14, use VS15 instead. * Drop support for mingw. * Remove debug build. * Split up jvm tests. * Split up Python tests.	2021-05-27 04:49:45 +08:00
Philip Hyunsu Cho	c6d87e5e18	[CI] Remove stray build artifact to avoid error in artifact packaging (#6994 )	2021-05-25 19:48:27 +08:00
Philip Hyunsu Cho	90cd724be1	[CI] Fix CI/CD pipeline broken by latest auditwheel (4.0.0) (#6951 )	2021-05-10 22:43:15 -07:00
Jiaming Yuan	34df1f588b	Reduce Travis environment setup time. (#6912 ) * Remove unused r from travis. * Don't update homebrew. * Don't install indirect/unused dependencies like libgit2, wget, openssl. * Move graphviz installation to conda.	2021-04-30 09:02:40 +08:00
Philip Hyunsu Cho	ea7a6a0321	[CI] Pack R package tarball with pre-built xgboost.so (with GPU support) (#6827 ) * Add scripts for packaging R package with GPU-enabled libxgboost.so * [CI] Automatically build R package tarball * Add comments * Don't build tarball for pull requests * Update the installation doc	2021-04-07 21:15:34 -07:00
Philip Hyunsu Cho	366f3cb9d8	Add use_rmm flag to global configuration (#6656 ) * Ensure RMM is 0.18 or later * Add use_rmm flag to global configuration * Modify XGBCachingDeviceAllocatorImpl to skip CUB when use_rmm=True * Update the demo * [CI] Pin NumPy to 1.19.4, since NumPy 1.19.5 doesn't work with latest Shap	2021-03-09 14:53:05 -08:00
Jiaming Yuan	4656b09d5d	[breaking] Add prediction function for DMatrix and use inplace predict for dask. (#6668 ) * Add a new API function for predicting on `DMatrix`. This function aligns with rest of the `XGBoosterPredictFrom` functions on semantic of function arguments. Purge `ntree_limit` from libxgboost, use iteration instead. * [dask] Use `inplace_predict` by default for dask sklearn models. * [dask] Run prediction shape inference on worker instead of client. The breaking change is in the Python sklearn `apply` function, I made it to be consistent with other prediction functions where `best_iteration` is used by default.	2021-02-08 18:26:32 +08:00
Philip Hyunsu Cho	55ee2bd77f	[CI] Add ARM64 test to Jenkins pipeline (#6643 ) * Add ARM64 test to Jenkins pipeline * Check for bundled libgomp * Use a separate test suite for ARM64 * Ensure that x86 jobs don't run on ARM workers	2021-01-27 21:51:17 +09:00
Jiaming Yuan	610ee632cc	[Breaking] Rename `data` to `X` in `predict_proba`. (#6555 ) New Scikit-Learn version uses keyword argument, and `X` is the predefined keyword. * Use pip to install latest Python graphviz on Windows CI.	2020-12-28 21:36:03 +08:00

210 Commits