* [CI] Add RMM as an optional dependency
* Replace caching allocator with pool allocator from RMM
* Revert "Replace caching allocator with pool allocator from RMM" (reverts commit e15845d4e72e890c2babe31a988b26503a7d9038)
* Use rmm::mr::get_default_resource()
* Try setting default resource (doesn't work yet)
* Allocate pool_mr in the heap
* Prevent leaking pool_mr handle
* Separate EXPECT_DEATH() into a separate test suite suffixed DeathTest
* Turn off death tests for RMM
* Address reviewer's feedback
* Prevent leaking of cuda_mr
* Fix Jenkinsfile syntax
* Remove unnecessary function in Jenkinsfile
* [CI] Install NCCL into RMM container
* Run Python tests
* Try building with RMM, CUDA 10.0
* Do not use RMM for CUDA 10.0 target
* Actually test for test_rmm flag
* Fix TestPythonGPU
* Use CNMeM allocator, since pool allocator doesn't yet support multi-GPU
* Use 10.0 container to build RMM-enabled XGBoost
* Revert "Use 10.0 container to build RMM-enabled XGBoost" (reverts commit 789021fa31112e25b683aef39fff375403060141)
* Fix Jenkinsfile
* [CI] Assign larger /dev/shm to NCCL
* Use 10.2 artifact to run multi-GPU Python tests
* Add CUDA 10.0 -> 11.0 cross-version test; remove CUDA 10.0 target
* Rename Conda env rmm_test -> gpu_test
* Use env var to opt into CNMeM pool for C++ tests
* Use identical CUDA version for RMM builds and tests
* Use Pytest fixtures to enable RMM pool in Python tests
* Move RMM to plugin/CMakeLists.txt; use PLUGIN_RMM (see the configure sketch after this list)
* Use per-device MR; use command arg in gtest
* Set CMake prefix path to use Conda env
* Use 0.15 nightly version of RMM
* Remove unnecessary header
* Fix a unit test when cudf is missing
* Add RMM demos
* Remove print()
* Use HostDeviceVector in GPU predictor
* Simplify pytest setup; use LocalCUDACluster fixture
* Address reviewers' comments

Co-authored-by: Hyunsu Cho <chohyu01@cs.wasshington.edu>
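For context, the RMM plugin referenced above is enabled at configure time and found through the Conda environment built by the Dockerfile below. A minimal sketch of such a configure step, assuming the `gpu_test` environment is active and the build runs from an `xgboost` source checkout (this is an illustration, not the exact Jenkins invocation):

```bash
# Sketch: configure XGBoost with the RMM plugin.
# Assumes the gpu_test Conda env (with rmm and cudatoolkit) is activated,
# so $CONDA_PREFIX lets CMake locate RMM.
mkdir -p build && cd build
cmake .. -GNinja \
    -DUSE_CUDA=ON \
    -DUSE_NCCL=ON \
    -DPLUGIN_RMM=ON \
    -DCMAKE_PREFIX_PATH=$CONDA_PREFIX
ninja
```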
ARG CUDA_VERSION
FROM nvidia/cuda:$CUDA_VERSION-devel-ubuntu16.04
ARG CUDA_VERSION

# Environment
ENV DEBIAN_FRONTEND noninteractive
SHELL ["/bin/bash", "-c"]   # Use Bash as shell

# Install all basic requirements
RUN \
    apt-get update && \
    apt-get install -y wget unzip bzip2 libgomp1 build-essential ninja-build git && \
    # Python
    wget -O Miniconda3.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
    bash Miniconda3.sh -b -p /opt/python && \
    # CMake
    wget -nv -nc https://cmake.org/files/v3.13/cmake-3.13.0-Linux-x86_64.sh --no-check-certificate && \
    bash cmake-3.13.0-Linux-x86_64.sh --skip-license --prefix=/usr

# NCCL2 (License: https://docs.nvidia.com/deeplearning/sdk/nccl-sla/index.html)
RUN \
    export CUDA_SHORT=`echo $CUDA_VERSION | egrep -o '[0-9]+\.[0-9]'` && \
    export NCCL_VERSION=2.7.5-1 && \
    apt-get update && \
    apt-get install -y --allow-downgrades --allow-change-held-packages libnccl2=${NCCL_VERSION}+cuda${CUDA_SHORT} libnccl-dev=${NCCL_VERSION}+cuda${CUDA_SHORT}

ENV PATH=/opt/python/bin:$PATH

# Create new Conda environment with RMM
RUN \
    conda create -n gpu_test -c nvidia -c rapidsai-nightly -c rapidsai -c conda-forge -c defaults \
        python=3.7 rmm=0.15* cudatoolkit=$CUDA_VERSION

ENV GOSU_VERSION 1.10

# Install lightweight sudo (not bound to TTY)
RUN set -ex; \
    wget -O /usr/local/bin/gosu "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-amd64" && \
    chmod +x /usr/local/bin/gosu && \
    gosu nobody true

# Default entry-point to use if running locally
# It will preserve attributes of created files
COPY entrypoint.sh /scripts/

WORKDIR /workspace
ENTRYPOINT ["/scripts/entrypoint.sh"]
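For local use, a hedged sketch of building and running this image follows; the `Dockerfile.rmm` file name and the `xgboost-ci-rmm` tag are assumptions, and `CUDA_VERSION` must match a published `nvidia/cuda` devel tag:

```bash
# Build the CI image for a given CUDA toolkit version (e.g. 10.2).
docker build --build-arg CUDA_VERSION=10.2 -t xgboost-ci-rmm -f Dockerfile.rmm .

# Run with GPUs and a larger /dev/shm for NCCL (per the commit history),
# mounting the source tree at /workspace so entrypoint.sh can preserve file ownership.
docker run --rm --gpus all --shm-size=4g -v "$PWD":/workspace xgboost-ci-rmm bash
```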