Docker Installation

Install Docker Engine and nvidia-docker on the relevant platforms.

Problem: nvidia-smi in the container shows CUDA 11 when the host CUDA version is 11.

One of the primary functions of nvidia-docker is to inject all of the NVIDIA driver libraries from the host into the container so that the container can use the GPUs properly. One of these libraries is libcuda.so. This is one of the reasons nvidia-smi inside the container reports the driver version of your host, together with the CUDA version that driver supports, rather than the CUDA toolkit version installed in the image.
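To see the two versions side by side, a small helper like the one below can be run inside the container. This is only a sketch: report_cuda_versions is a made-up name, and it assumes the image sets the CUDA_VERSION environment variable, as the official nvidia/cuda images do.

```shell
# Print the CUDA version shipped in the image next to the driver version
# that nvidia-smi reports. Meant to be run inside the container.
report_cuda_versions() {
    # CUDA_VERSION is set by the official nvidia/cuda images (assumption:
    # other images may leave it unset, in which case "unknown" is printed).
    echo "image CUDA toolkit: ${CUDA_VERSION:-unknown}"
    if command -v nvidia-smi >/dev/null 2>&1; then
        # The driver version comes from the host libraries injected by nvidia-docker.
        echo "host driver: $(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    else
        echo "host driver: nvidia-smi not found"
    fi
}
```

Calling report_cuda_versions in a container built from a CUDA 10.2 image on a host with a newer driver should show the image's toolkit version on the first line and the host's driver version on the second.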

Differences between the NVIDIA Docker images: base/runtime/devel

Example Dockerfile template

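FROM nvidia/cuda:10.2-cudnn7-runtime-ubuntu18.04

# Put Miniconda on PATH for the later build steps and for the running container
ENV PATH="/root/miniconda3/bin:${PATH}"

# System packages; clean the apt cache to keep the layer small
RUN apt-get update \
    && apt-get install -y htop python3-dev wget \
    && rm -rf /var/lib/apt/lists/*

# Install Miniconda silently (-b) into /root/miniconda3
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
    && mkdir -p /root/.conda \
    && sh Miniconda3-latest-Linux-x86_64.sh -b \
    && rm -f Miniconda3-latest-Linux-x86_64.sh

RUN conda create -y -n env_name python=3.7

# Copy the project (including requirements.txt) into the image
COPY . /home/

# Install the Python dependencies inside the conda environment
RUN /bin/bash -c "cd /home/ \
    && source activate env_name \
    && pip install -i https://pypi.tuna.tsinghua.edu.cn/simple -r requirements.txt"

WORKDIR /home/
# Default to an interactive shell when no command is given
CMD ["/bin/bash"]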

Build your customized image from the Dockerfile

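$ docker build -t nameOfYourImage .

# Options must come before the image name; anything after it is passed to the container as the command.
# -v      mount a host directory into the container (both paths absolute)
# -p      publish a container port on the host
# --rm    remove the container automatically after it exits
# --gpus  expose the host GPUs to the container (all, or specific devices)
$ docker run -it --rm --gpus all \
           -v /absolute/path/on/host:/absolute/path/in/container \
           -p HOST_PORT:CONTAINER_PORT \
           nameOfYourImage /bin/bash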

docker run combines two commands, docker create and docker start: it first creates a container from the image, then runs /bin/bash in that container. Type exit to stop the container, or press Ctrl + P followed by Ctrl + Q to detach from the container’s terminal and keep the container running.
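The two steps can also be written out explicitly. The helper below is only an illustrative sketch of roughly what docker run -it does; run_in_two_steps is a made-up name, and the real docker run handles more details than this (for example pulling a missing image).

```shell
# Roughly what `docker run -it IMAGE /bin/bash` does, as two explicit steps.
run_in_two_steps() {
    image="$1"
    # Step 1: create a stopped container; docker create prints the new container ID.
    container_id="$(docker create -it "$image" /bin/bash)" || return 1
    # Step 2: start it and attach the terminal (-a attaches output, -i keeps STDIN open).
    docker start -ai "$container_id"
}
```

For example, run_in_two_steps myimage:latest should end up in the same interactive shell as docker run -it myimage:latest /bin/bash.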

$ docker ps

This lists the running containers; add the -a option to include the exited containers as well.
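When the list gets long, the output can be narrowed down. As a small sketch (list_exited is a made-up helper name), the following shows only the exited containers with their ID, image and status:

```shell
# List only the exited containers, one per line: ID, image, status.
list_exited() {
    docker ps -a --filter "status=exited" --format "{{.ID}}  {{.Image}}  {{.Status}}"
}
```

The IDs printed in the first column are the ones to pass to docker start.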

$ docker start CONTAINERID

$ docker attach CONTAINERID
$ docker exec -ti CONTAINERID /bin/bash  # exec and attach both require a running container

With the container ID shown by docker ps, we can restart a stopped container and then attach to it using the commands above.
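Those steps can be combined into one helper. This is a minimal sketch under the assumption that the container has /bin/bash installed; open_shell is a made-up name.

```shell
# Open a shell in a container, starting the container first if it is stopped.
open_shell() {
    container_id="$1"
    # docker inspect prints "true" or "false" for the container's running state.
    running="$(docker inspect -f '{{.State.Running}}' "$container_id")" || return 1
    if [ "$running" != "true" ]; then
        docker start "$container_id" >/dev/null || return 1
    fi
    # exec starts a new shell process, so exiting it does not stop the container.
    docker exec -it "$container_id" /bin/bash
}
```

Usage: open_shell CONTAINERID. Unlike docker attach, which connects to the container's main process, the shell opened by docker exec can be closed with exit while the container keeps running.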

Saving and loading the Docker image

Some servers or clusters have no internet access, so we can package the image we need into an archive and upload it to the remote server.

$ docker commit CONTAINERID myimage:latest  # commit the container's changes to an image

$ docker save myimage:latest | gzip > myimage_latest.tar.gz

$ docker image load -i myimage_latest.tar.gz
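Since the archive is copied between machines, it can be worth checking that it arrived intact before loading it. The sketch below assumes GNU sha256sum is available on both machines; export_image and import_image are made-up helper names.

```shell
# On the machine with internet access: save the image and record a checksum.
export_image() {
    image="$1"
    archive="$2"
    # Note: without `set -o pipefail` only gzip's exit status is checked here.
    docker save "$image" | gzip > "$archive" || return 1
    sha256sum "$archive" > "$archive.sha256"
}

# On the offline server: verify the checksum, then load the image.
import_image() {
    archive="$1"
    # The checksum file stores the archive path as given to export_image,
    # so run this from a directory where that path resolves.
    sha256sum -c "$archive.sha256" >/dev/null || return 1
    docker load -i "$archive"
}
```

For example, run export_image myimage:latest myimage_latest.tar.gz locally, copy both myimage_latest.tar.gz and myimage_latest.tar.gz.sha256 to the server, and run import_image myimage_latest.tar.gz there.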

Source

https://www.youtube.com/watch?v=0qG_0CPQhpg