Container for Deep Learning Environment
containerizing and deploying deep learning environments (e.g., Docker, Apptainer)
Preface
Some time ago, I struggled with the cumbersome process of configuring a deep learning environment on Compute Canada. The process involved numerous compatibility issues and conflicting dependencies, which made it time-consuming and frustrating. Recently, I spent some time learning Apptainer, the container platform recommended on Compute Canada, and wrote this blog to note down the key points.
Introduction
Container technologies, including platforms like Docker and Apptainer, provide a method to package, distribute, and run applications in a standardized environment, ensuring consistency across different computing environments. (from ChatGPT)
- Definition: Containers are lightweight, portable units that package an application and its dependencies, libraries, and configuration files, allowing the application to run reliably in different computing environments.
- Isolation: They provide a level of isolation between applications, meaning multiple containers can run on the same host without interfering with each other.
- Efficiency: Containers share the host operating system’s kernel, making them more efficient than traditional virtual machines, which require a full OS stack.
Comparison of Concepts between Docker and Apptainer
Concepts in Docker
Docker is one of the most popular containerization platforms that simplifies the process of creating, deploying, and managing containers.
Key Features:
- Docker Images: Read-only templates used to create containers. An image includes the application code, libraries, and dependencies.
- Docker Hub: A cloud-based registry where developers can share and distribute Docker images.
- Docker Compose: A tool for defining and managing multi-container applications with a single YAML configuration file.
- Docker Swarm: Native clustering and orchestration solution for managing a group of Docker engines as a single virtual host.
Differences between Docker Image and Docker Container
Key differences:
Aspect | Docker Image | Docker Container |
---|---|---|
Definition | A read-only template for creating containers | A running instance of a Docker image |
State | Immutable | Mutable during runtime |
Function | Blueprint or recipe | Execution of the image as a running environment |
Storage | Stored in Docker registries | Lives in memory when running |
Purpose | To provide the environment setup for containers | To run applications in an isolated environment |
Lifecycle | Static and reusable | Dynamic and has a start/stop/remove lifecycle |
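As a quick hands-on illustration of this relationship (assuming Docker is already installed; the hello-world image is just a convenient example):
# pull a read-only image from Docker Hub
docker pull hello-world
# create and run a container, i.e., a running instance of that image
docker run --name my-hello hello-world
# list containers (including stopped ones) and local images
docker ps -a
docker images
# removing the container leaves the image untouched and reusable
docker rm my-hello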
Concepts in Apptainer
Apptainer is specifically designed for high-performance computing (HPC) and scientific workloads, allowing users to create and run containers in environments where Docker might not be suitable.
Key Features:
- User-Focused: Unlike Docker, which requires root privileges to manage containers, Apptainer allows users to create and run containers without needing elevated permissions, making it safer for multi-user environments.
- Integration with HPC: Apptainer is optimized for HPC environments, enabling users to leverage existing tools and resources.
- Image Formats: Supports different image formats, allowing for flexibility in how images are created and used.
- Simplicity: Focuses on simplicity and ease of use, making it suitable for researchers and scientists who may not have extensive experience with containerization.
Comparisons between Docker and Apptainer
There are some differences between the concepts in Docker and Apptainer.
Feature | Docker | Apptainer (Singularity) |
---|---|---|
Image | Composed of multiple layers (each representing an incremental build change) with a Union File System | A single, read-only *.sif file |
Container | Mutable; changes can persist in the container | Immutable by default, which eases reproducibility |
Target Audience | Developers, DevOps, microservices | Researchers, HPC environments |
Security | Requires root privileges, runs as root | User-level execution, no root required |
Image Format | Layered images (multi-step build) | Single-file .sif images for portability |
Container Creation | Built with Dockerfile , mutable containers | Can use Docker images, runs immutable containers |
Ecosystem | Large ecosystem, Docker Hub, Kubernetes | Focused on HPC, integrates with cluster schedulers |
Performance | Small overhead, optimized for microservices | Low overhead, optimized for HPC performance |
Portability | Portable across environments with Docker installed | Highly portable, especially in HPC environments |
Reproducibility | Good for general use, less suited for exact replication | Designed for exact scientific reproducibility |
Generally speaking, Apptainer is a secure alternative to Docker, and it is adopted on many scientific computing clusters, such as those of the Digital Research Alliance of Canada. This is because Docker images are not considered secure on shared systems: they provide a means to gain root access to the host they are running on.
Docker (with Nvidia-Container-Toolkit)
Remark: Since version 19.03, Docker has supported NVIDIA GPUs natively, so we no longer need to install nvidia-docker separately. We should use Docker + Nvidia-Container-Toolkit (documentation).
Installation
Install NVIDIA GPU driver & CUDA
# Step 1: Prepare Your System
sudo apt update && sudo apt upgrade -y
sudo apt-get purge nvidia* -y
# Step 2: Install NVIDIA GPU Driver
sudo add-apt-repository ppa:graphics-drivers/ppa -y
sudo apt update
sudo ubuntu-drivers autoinstall
sudo reboot
# Verify the installation
nvidia-smi
# Step 3: Install CUDA Toolkit
# Replace with the appropriate download link for your CUDA version
CUDA_VERSION=12.6.0
# NOTE: the exact .deb filename from NVIDIA also encodes a driver version
# (e.g., cuda-repo-ubuntu2204-12-6-local_12.6.0-<driver>-1_amd64.deb); copy the exact link from the CUDA download page
wget https://developer.download.nvidia.com/compute/cuda/${CUDA_VERSION}/local_installers/cuda-repo-ubuntu2204-${CUDA_VERSION}-local_${CUDA_VERSION}-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-${CUDA_VERSION}-local_${CUDA_VERSION}-1_amd64.deb
# register the repository key as instructed by the installer output
sudo cp /var/cuda-repo-ubuntu2204-${CUDA_VERSION}-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt update
sudo apt install cuda -y
sudo reboot
# Step 4: Set Environment Variables
echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
# Step 5: Verify CUDA Installation
nvcc --version
Install Docker
# Step 1: Update Your System
sudo apt update && sudo apt upgrade -y
# Step 2: Install Required Packages
sudo apt install apt-transport-https ca-certificates curl software-properties-common -y
# Step 3: Add Docker’s Official GPG Key
# (apt-key is deprecated on newer Ubuntu releases; Docker's current docs use a keyring under /etc/apt/keyrings, but this approach still works on many systems)
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
# Step 4: Add Docker’s Official Repository
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
# Step 5: Update Package Index Again
sudo apt update
# Step 6: Install Docker
sudo apt install docker-ce -y
# Step 7: Start and Enable Docker
sudo systemctl start docker
sudo systemctl enable docker
# Step 8: Verify Docker Installation
docker --version
# Step 9: (Optional) Run Docker as Non-Root User
sudo usermod -aG docker $USER
newgrp docker
# Verify Docker without sudo
docker run hello-world
Install Nvidia-Container-Toolkit
References: install guide for Nvidia-Container-Toolkit
Configure the production repository:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# [Optional] configure the repository to use experimental packages:
sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Update the packages list from the repository:
sudo apt-get update
# Install the NVIDIA Container Toolkit packages:
sudo apt-get install -y nvidia-container-toolkit
Configure Docker as the container runtime
# Configure the container runtime by using the `nvidia-ctk` command:
sudo nvidia-ctk runtime configure --runtime=docker
The nvidia-ctk command modifies the /etc/docker/daemon.json
file on the host. The file is updated so that Docker can use the NVIDIA Container Runtime.
Restart the Docker daemon:
sudo systemctl restart docker
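To check that GPU passthrough works end-to-end, a quick test is to run nvidia-smi inside a throwaway CUDA container (any nvidia/cuda tag compatible with your driver works; the tag below is the one used later in this post):
# should print the same GPU table as running nvidia-smi on the host
sudo docker run --rm --gpus all nvidia/cuda:12.6.0-cudnn-runtime-ubuntu24.04 nvidia-smi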
[Rootless mode]
To configure the container runtime for Docker running in Rootless mode, follow these steps:
Configure the container runtime by using the nvidia-ctk
command:
nvidia-ctk runtime configure --runtime=docker --config=$HOME/.config/docker/daemon.json
# Restart the Rootless Docker daemon:
systemctl --user restart docker
# Configure /etc/nvidia-container-runtime/config.toml by using the `sudo nvidia-ctk` command:
sudo nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place
Create Images and Containers
Pull Image from remote
- Pull an Image from remote
sudo docker pull nvidia/cuda:12.6.0-cudnn-runtime-ubuntu24.04
Build with Dockerfile
A template of Dockerfile
:
# Dockerfile
# Use the NVIDIA CUDA runtime image
FROM nvidia/cuda:12.6.0-cudnn-runtime-ubuntu24.04
# Set the working directory inside the container
WORKDIR /workspace
# WORKDIR is not an environment variable, so spell out the paths explicitly
ENV PROJECT_DIR=/workspace/project # remember to `docker run -v <host_path>:<container_path>`
ENV VENV_DIR=/workspace/venv
# update and upgrade
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
wget \
curl \
git \
vim \
ca-certificates \
python3 \
python3-pip \
python3-dev \
python3-virtualenv \
zsh \
&& rm -rf /var/lib/apt/lists/*
# Configure zsh
RUN sh -c "$(wget https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh -O -)" "" --unattended
RUN chsh -s $(which zsh)
RUN git clone https://github.com/caiogondim/bullet-train.zsh.git ~/.oh-my-zsh/themes/bullet-train.zsh
RUN cp ~/.oh-my-zsh/themes/bullet-train.zsh/bullet-train.zsh-theme ~/.oh-my-zsh/themes
RUN sed -i 's/robbyrussell/bullet-train/g' ~/.zshrc
RUN sed -i '$a\# use command-not-found package\n[[ -a "/etc/zsh_command_not_found" ]] && \. /etc/zsh_command_not_found\n' ~/.zshrc
# (no need to RUN zsh at build time; it becomes the interactive shell at runtime)
# create python virtual environment
RUN ln -s /usr/bin/python3 /usr/bin/python
RUN python -m virtualenv $VENV_DIR
## rather than "source xxx/activate", activate venv by setting $PATH
ENV PATH="$VENV_DIR/bin:$PATH"
RUN pip install --upgrade pip
# requirements.txt must be available at build time (the project volume is only mounted at runtime)
COPY requirements.txt /workspace/requirements.txt
RUN pip install -r /workspace/requirements.txt
RUN pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126
# Set environment variables for CUDA
ENV CUDA_HOME=/usr/local/cuda
ENV PATH=$CUDA_HOME/bin:$PATH
ENV LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
# Default parameters that can be overridden from the Command Line Interface (CLI).
# (Only the last CMD in a Dockerfile takes effect; both forms are shown for illustration.)
CMD ["python", "--version"] # exec form (preferred)
CMD python --version # shell form
# Default command that CANNOT be overridden by CLI arguments
ENTRYPOINT ["python"]
CMD ["main.py"]
### Will always run `python`
### `docker run <Image_name>` will execute `python main.py`
### `docker run <Image_name> other.py` will execute `python other.py`
Build the image from the Dockerfile:
# note the trailing '.' which sets the build context
sudo docker build -t [Img_Name][:[Tag]] -f ./Dockerfile .
How to set up the Dockerfile for future use when running your own program?
We should set CMD and ENTRYPOINT appropriately. The following table summarizes their differences. In general, the arguments in CMD can be overridden by the <args> provided when running docker run <Image_name> <args>, but those in ENTRYPOINT CANNOT be overridden by <args>.
Feature | CMD | ENTRYPOINT |
---|---|---|
Purpose | Sets default commands and parameters for running the container. | Configures a container to run as an executable with a specified command. |
Overriding | The command specified in CMD can be easily overridden by arguments provided to docker run . | The command defined in ENTRYPOINT is not easily overridden, but it can be extended with additional arguments. |
Forms | Can be specified in shell form or exec form. | Typically specified in exec form (recommended), but can also be in shell form. |
Default Behavior | If you specify both CMD and ENTRYPOINT , CMD provides default arguments to ENTRYPOINT . | The command defined will always run when the container starts. |
Use Case | Useful for providing default commands or options for applications where you want to allow flexibility in overriding them. | Useful for applications that need to run a specific command or script, ensuring that the container behaves consistently. |
Example | CMD ["python", "app.py"] sets default arguments. | ENTRYPOINT ["python"] ensures that the container always runs Python, even if a different command is specified. |
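For instance, with the ENTRYPOINT ["python"] / CMD ["main.py"] pair from the Dockerfile template above, the runtime behaviour is roughly the following (the image name is a placeholder):
# CMD supplies the default argument to ENTRYPOINT
docker run my-image                 # executes: python main.py
# extra CLI arguments replace CMD, but ENTRYPOINT still applies
docker run my-image other.py        # executes: python other.py
# bypassing ENTRYPOINT requires --entrypoint
docker run -it --entrypoint /bin/bash my-image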
Delete an Image or Container
# for Containers
sudo docker rm <container_id>
# for Images
sudo docker rmi [Img_Name][:[Tag]]
Run a Container based on the Image
- Directly create a Container from Image (pull if not exist & run):
# (first search `nvidia/cuda:12.6.0-cudnn-runtime-ubuntu24.04` at local; if not exists, pull from remote)
sudo docker run -it --gpus=all --name=ubuntu2404-dl --env NVIDIA_DISABLE_REQUIRE=1 nvidia/cuda:12.6.0-cudnn-runtime-ubuntu24.04 bash
A detailed description of common options for docker run:
Option | Description | Example |
---|---|---|
-it | Runs the container interactively, attaching a terminal (-i for interactive, -t for TTY). | docker run -it ubuntu /bin/bash — Runs an interactive Ubuntu container with a bash shell. |
-d | Runs the container in detached mode (in the background). | docker run -d nginx — Runs the NGINX web server in the background. |
--name | Assigns a name to the container. | docker run --name my_container ubuntu — Names the container “my_container”. |
-p | Publishes ports from the container to the host. Maps container ports to host ports (host_port:container_port ). | docker run -p 8080:80 nginx — Maps port 80 inside the container to port 8080 on the host. |
-v | Mounts a volume or directory from the host to the container (<host_path>:<container_path> ). | docker run -v $(pwd):/app ubuntu — Mounts the current directory to /app in the container. |
-e | Sets environment variables inside the container. | docker run -e MYSQL_ROOT_PASSWORD=my_password mysql — Sets the MySQL root password as an environment variable. |
--rm | Automatically removes the container when it exits. | docker run --rm ubuntu — Automatically removes the container after it stops. |
--network | Specifies the network mode for the container (e.g., bridge , host , none ). | docker run --network host nginx — Uses the host’s network stack. |
--restart | Configures the restart policy (e.g., no , on-failure , always ). | docker run --restart always nginx — Ensures the container restarts automatically if it stops. |
-w | Sets the working directory inside the container. | docker run -w /app node — Sets the working directory inside the container to /app . |
--link | Links two containers so that they can communicate. | docker run --link db_container:db app_container — Links db_container to app_container and sets an alias db for the linked container. |
--cpus | Limits the number of CPUs available to the container. | docker run --cpus=2 ubuntu — Limits the container to use 2 CPUs. |
--memory | Limits the memory available to the container (e.g., 512m , 1g ). | docker run --memory=512m ubuntu — Limits the container to 512 MB of memory. |
--privileged | Grants extended privileges to the container (useful for hardware access or running Docker-in-Docker). | docker run --privileged ubuntu — Runs the container with full privileges. |
-u | Runs the container as a specific user. | docker run -u 1001 ubuntu — Runs the container with user ID 1001. |
--env-file | Loads environment variables from a file. | docker run --env-file ./env.list ubuntu — Loads environment variables from the env.list file. |
--device | Adds a host device to the container (e.g., hardware devices). | docker run --device /dev/sda:/dev/xvda ubuntu — Adds a device from the host to the container. |
--entrypoint | Overwrites the default entrypoint of the image with a custom command. | docker run --entrypoint /bin/bash ubuntu — Runs /bin/bash as the entrypoint instead of the default. |
--log-driver | Specifies the log driver for the container (e.g., json-file , syslog , none ). | docker run --log-driver syslog ubuntu — Uses syslog as the log driver. |
--cap-add | Adds Linux capabilities to the container (e.g., NET_ADMIN , SYS_TIME ). | docker run --cap-add=NET_ADMIN ubuntu — Grants the container the NET_ADMIN capability for network administration tasks. |
--gpus | Allocates GPUs to the container (useful for machine learning or GPU-intensive tasks). | docker run --gpus all nvidia/cuda:latest — Allocates all available GPUs to the container. |
--detach-keys | Specifies a key sequence to detach from the container. | docker run --detach-keys="ctrl-x" ubuntu — Uses Ctrl+X to detach from the container. |
- Or start an existing container, and enter it with bash:
# start the container in the background
sudo docker start <container_id>
# enter the running container interactively
sudo docker exec -it <container_id> bash
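Putting the most common options together, a typical interactive deep-learning session based on the image built above might look like the following sketch (names, paths, and resource limits are placeholders to adapt):
sudo docker run -it --rm \
  --gpus=all \
  --name=dl-dev \
  -v /path/to/your/project:/workspace/project \
  -w /workspace/project \
  --cpus=8 --memory=16g \
  [Img_Name][:[Tag]] zsh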
Modify, Commit and Push your own Image
# run a Container with interactive mode
sudo docker run -it --gpus=all --name=<container_name> --env NVIDIA_DISABLE_REQUIRE=1 [Img_Name][:[Tag]] bash
##### container since here: #####
# install something for your environment
pip install xxx
sudo docker commit -m="[Msg]" -a="[Author_Info]" <container_id> lijiaqiisai/ubuntu24.04-cuda-python3:v0.1
This will create a new Image named lijiaqiisai/ubuntu24.04-cuda-python3:v0.1.
You can create an additional tag for this Image with
sudo docker tag lijiaqiisai/ubuntu24.04-cuda-python3:v0.1 lijiaqiisai/ubuntu24.04-cuda-python3:latest
You can push it to your Docker Hub with
sudo docker push lijiaqiisai/ubuntu24.04-cuda-python3:latest
sudo docker push lijiaqiisai/ubuntu24.04-cuda-python3:v0.1
Remark: differences between docker commit
and docker tag
Feature | docker commit | docker tag |
---|---|---|
Purpose | Create a new image from a container | Add a new tag to an existing image |
Operation | Saves changes made in a container | Renames or versions an existing image |
Creates a layer | Yes | No |
Syntax | docker commit <CONTAINER> <IMAGE> | docker tag <SOURCE_IMAGE> <TARGET_IMAGE> |
Execute your program with Docker Container
Create a new Container for your program:
- Manually
# adding '--rm' will remove it after closing the container
sudo docker run -it --rm --gpus=all --env NVIDIA_DISABLE_REQUIRE=1 -v /[Local_Project_Path]:/workspace/project [Img_Name][:[Tag]] zsh
cd ${PROJECT_DIR}
python main.py
- Automatically on Container Start
(Option 1): Configure CMD
in the Dockerfile
# add this line to the Dockerfile
CMD ["python", "/workspace/project/train.py"]
then run
sudo docker run -it --rm --gpus=all --env NVIDIA_DISABLE_REQUIRE=1 -v /[Local_Project_Path]:/workspace/project [Img_Name][:[Tag]]
(Option 2): Pass Script as a Command
# wrap the compound command in a shell; `cd ... && ...` is not itself an executable
sudo docker run -it --rm --gpus=all --env NVIDIA_DISABLE_REQUIRE=1 -v /[Local_Project_Path]:/workspace/project [Img_Name][:[Tag]] sh -c "cd /workspace/project && python main.py"
Launch an existing Container for your program:
docker exec runs a command in an existing (running) Container rather than creating a new one:
# GPU access and volume mounts are fixed when the container is created with `docker run`;
# `docker exec` only accepts options such as -it, -e, -w, -u
sudo docker exec -it <Container_id_or_name> <your_program>
Running Code via JupyterLab
# assumes JupyterLab is installed in the image (e.g., via `pip install jupyterlab`)
sudo docker run -it -p 8888:8888 --rm --gpus=all --env NVIDIA_DISABLE_REQUIRE=1 -v /[Local_Project_Path]:/workspace/project [Img_Name][:[Tag]] jupyter lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root --notebook-dir=/workspace/project
then you can access JupyterLab at http://localhost:8888 (use the token printed in the container logs)
Run a container on clusters with Slurm:
# Load any required modules (e.g., CUDA, Docker)
module load docker # if needed to load Docker via module system
PROJECT_DIR="[Local_Project_Path]"
DOCKER_IMAGE="[Img_Name][:[Tag]]"
# Run Docker with GPU support using --gpus flag
srun docker run --gpus all \
-v ${PROJECT_DIR}:/workspace/project \
${DOCKER_IMAGE} \
python /workspace/project/train.py
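The srun line above would normally live inside an sbatch script; a minimal sketch might look like this (job name, resources, and module names depend entirely on your cluster and are assumptions here):
#!/bin/bash
#SBATCH --job-name=dl-train
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=02:00:00
module load docker   # only if your cluster exposes Docker via modules
PROJECT_DIR="[Local_Project_Path]"
DOCKER_IMAGE="[Img_Name][:[Tag]]"
srun docker run --gpus all \
    -v ${PROJECT_DIR}:/workspace/project \
    ${DOCKER_IMAGE} \
    python /workspace/project/train.py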
Apptainer
The concepts in Apptainer (previously known as Singularity) are simpler. The main runtime concept is the Container, which is what we call each virtual instance that we run.
Installation
References: install apptainer
- Install unprivileged from pre-built binaries
curl -s https://raw.githubusercontent.com/apptainer/apptainer/main/tools/install-unprivileged.sh | \
bash -s - install-dir
- Install Debian packages
Pre-built Debian packages are only available on GitHub and only for the amd64 architecture.
For the non-setuid
installation use these commands:
sudo apt update
sudo apt install -y wget
cd /tmp
wget https://github.com/apptainer/apptainer/releases/download/v1.3.4/apptainer_1.3.4_amd64.deb
sudo apt install -y ./apptainer_1.3.4_amd64.deb
For the setuid
installation do above commands first and then these:
wget https://github.com/apptainer/apptainer/releases/download/v1.3.4/apptainer-suid_1.3.4_amd64.deb
sudo dpkg -i ./apptainer-suid_1.3.4_amd64.deb
- Install Ubuntu packages
# First, on Ubuntu based containers install software-properties-common package to obtain add-apt-repository command.
# On Ubuntu Desktop/Server derived systems skip this step.
sudo apt update
sudo apt install -y software-properties-common
# For the non-setuid installation use these commands:
sudo add-apt-repository -y ppa:apptainer/ppa
sudo apt update
sudo apt install -y apptainer
# For the setuid installation do above commands first and then these:
sudo add-apt-repository -y ppa:apptainer/ppa
sudo apt update
sudo apt install -y apptainer-suid
Local Apptainer Container Images can be built in two modes:
- generate a read-only *.sif file
- create a mutable (read & write) sandbox directory with the --sandbox option, which can be modified later
Create Apptainer Image into .sif
(read-only)
Build a *.sif
Image with apptainer build
command:
Options for apptainer build
:
- --nv: inject host Nvidia libraries during build for the post and test sections;
- --nvccli: use nvidia-container-cli for GPU setup (experimental);
- --bind src[:dest[:opts]] or -B src[:dest[:opts]]: a user-bind path specification. The spec has the format src[:dest[:opts]], where src and dest are paths outside (on the host machine) and inside (in the container). If dest is not given, it is set equal to src. Mount options (opts) may be specified as ro (read-only) or rw (read/write, the default). Multiple bind paths can be given as a comma-separated list;
- -f, --fakeroot: build with the appearance of running as root (default when building from a definition file unprivileged);
- --sandbox: will be introduced in the next section
The Apptainer Image file *.sif
is read-only and portable.
Download from Docker Hub
apptainer build dl-env.sif docker://lijiaqiisai/ubuntu24.04-cuda-python3:cuda12.6.0-python3.12.3
Download from other Library API Registries
apptainer build/pull dl-env.sif [Library]://[Image_name]
Build from an Apptainer definition file (analogous to a Dockerfile)
References: https://apptainer.org/docs/user/1.0/definition_files.html
Headers in Apptainer definition file
From a remote Registry (e.g., Docker Hub)
# from Docker Hub
Bootstrap: docker
From: lijiaqiisai/ubuntu24.04-cuda-python3:cuda12.6.0-py3.12.3
or from a local Image
Bootstrap: localimage
From: <Old_Image>.sif
Fingerprints: 12045C8C0B1004D058DE4BEDA20C27EE7FF7BA84,22045C8C0B1004D058DE4BEDA20C27EE7FF7BA84
Sections in Apptainer definition file
We explain the sections by the order of execution:
flowchart LR
A[header] --> B[%arguments];
B --> C[%setup];
C --> D[%files];
D --> E[%post];
E --> F[%test];
F --> G[%environment]
G --> H[%startscript]
G --> I[%runscript]
G --> J[%labels]
G --> K[%help]
\(\downarrow\) %arguments
: define custom arguments or flags that can be passed when building. The variables defined in %arguments
can be accessed in %setup
, %post
\(\downarrow\) %setup
: commands that are executed first on the host system, outside of the container, after the base OS has been installed. Within this section, the container file system can be referenced via the environment variable $APPTAINER_ROOTFS.
Warning:
Be careful with the commands in the %setup section, since the operations are performed on the host system.
\(\downarrow\) %files
: allows to copy files into the container with greater safety than using the %setup
section
%files [from <stage>]
<source> [<destination>]
...
\(\downarrow\) %post
: download packages, install software and libraries, write configuration files, create new directories, etc.;
apt-get update && apt-get install -y netcat
NOW=`date`
echo "export NOW=\"${NOW}\"" >> $APPTAINER_ENVIRONMENT
- Please note that the above commands also set an environmental variable
NOW
at build time. The value of this variable cannot be anticipated, and therefore cannot be set during the%environment
section. For situations like this, the$APPTAINER_ENVIRONMENT
variable is provided. Redirecting text to this variable will cause it to be written to a file called/.singularity.d/env/91-environment.sh
that will be sourced at runtime.
- Priority: environment variables set in the %post section through $APPTAINER_ENVIRONMENT take precedence over those added via %environment.
\(\downarrow\) %test
: runs at the very end of the build process to validate the container using a method of your choice.
- can be executed with
apptainer test *.sif
; - build with
--notest
option to build a container without running the%test
section, likesudo apptainer build --notest my_container.sif my_container.def
;
\(\downarrow\) %environment
: allows you to define environment variables that will be set at runtime.
- Note: variables in the %environment section are not made available at build time. This means that if you need the same variables during the build process, you should also define them in your %post section.
- during build: the %environment section is written to a file in the container metadata directory. This file is not sourced.
- during runtime: the file in the container metadata directory is sourced.
\(\downarrow\) %startscript
: the contents of %startscript will be executed when running apptainer instance start xxx. It runs only once, when the instance is started.
\(\downarrow\) %runscript
: the contents of %runscript will be executed when running apptainer run xxx, i.e., every time the container is run.
- $*: the options passed to the container at runtime, as a single string
- $@: the options passed to the container at runtime, as a quoted array
\(\downarrow\) %labels
: is used to add metadata to the file /.singularity.d/labels.json
within your container.
- To inspect the labels, use
apptainer inspect <Image>.sif
\(\downarrow\) %help
: any text in this section is transcribed into a metadata file in the container during the build.
TIP
The lifecycle of environmental variables defined in different sections
Place of environmental variables definition: | Available build-time | Available runtime | Notes |
---|---|---|---|
In %arguments | ✓ | ✘ | ONLY for building process in %setup or %post sections |
In %post : set by export xxx | ✓ | ✘ | |
In %post : set by $APPTAINER_ENVIRONMENT | ✓ | ✓ | Will save to /.singularity.d/env/91-environment.sh and will be sourced during runtime |
In %environment | ✘ | ✓ | lower priority compared to setting through $APPTAINER_ENVIRONMENT in %post |
Two examples
Base example:
The following *.def
file will pull from nvidia/cuda and just install the latest Python.
# apptaner-base.def
# Header
# from Docker Hub
Bootstrap: docker
From: nvidia/cuda:{{ CUDA_VERSION }}-cudnn-runtime-ubuntu{{ OS_VERSION }}
Stage: build
# sections
%arguments
CUDA_VERSION=12.6.0
OS_VERSION=24.04
%setup
%files
%post
# update, upgrade and set timezone
export TZ=America/Toronto
export DEBIAN_FRONTEND=noninteractive
apt-get update && apt-get upgrade -y --no-install-recommends
apt-get -y install tzdata
ln -fs /usr/share/zoneinfo/${TZ} /etc/localtime
dpkg-reconfigure -f noninteractive tzdata
export DEBIAN_FRONTEND=dialog
### Fix the warning "Key is stored in legacy trusted.gpg keyring (/etc/apt/trusted.gpg), see the DEPRECATION section in apt-key(8) for details."
# cd /etc/apt && cp trusted.gpg trusted.gpg.d && cd -
# Install softwares
apt-get install -y --no-install-recommends python3 python3-pip python3-virtualenv
ln -s /usr/bin/python3 /usr/bin/python
export VENV_DIR=/workspace/venv
python -m virtualenv $VENV_DIR
### hold these variables after building process
NOW=`date`
echo "export NOW=\"${NOW}\"" >> $APPTAINER_ENVIRONMENT
echo "export VENV_DIR=\"${VENV_DIR}\"" >> $APPTAINER_ENVIRONMENT
echo "export CUDA_VERSION=\"${CUDA_VERSION}\"" >> $APPTAINER_ENVIRONMENT
echo "export OS_VERSION=\"${OS_VERSION}\"" >> $APPTAINER_ENVIRONMENT
%test
grep -q NAME=\"Ubuntu\" /etc/os-release
if [ $? -eq 0 ]; then
echo "Container base is Ubuntu as expected."
else
echo "Container base is not Ubuntu."
exit 1
fi
%environment
export LISTEN_PORT=12345
export LC_ALL=C
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export PATH=${VENV_DIR}/bin:$PATH
# will be executed when executing 'apptainer instance start xxx.sif`
%startscript
echo "Container was created $NOW"
echo "Ubuntu version: $OS_VERSION"
echo "CUDA version: $CUDA_VERSION"
echo "Virtual Env path: $VENV_DIR"
# will be executed when executing 'apptainer run xxx.sif`
%runscript
echo "Container was created $NOW"
echo "Ubuntu version: $OS_VERSION"
echo "CUDA version: $CUDA_VERSION"
echo "Virtual Env path: $VENV_DIR"
echo "Arguments received: $*"
### Execute the following args with the default program "python"
### --> `apptainer run <Image>.sif main.py`
echo "Executing '$VENV_DIR/bin/python $*'"
exec $VENV_DIR/bin/python "$@"
%labels
Author lijiaqi
%help
This is a .def file to create a python environment based on `nvidia/cuda` docker image.
then build with
sudo apptainer build --nv --build-arg CUDA_VERSION=12.6.0 apptainer-base.sif apptainer-base.def
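Once built, the image can be exercised right away. Since %runscript forwards any arguments to the Python interpreter in the virtual environment, you can, for example, check the interpreter and run a script from a bound host directory (paths are placeholders):
# run an arbitrary command in the container
apptainer exec --nv apptainer-base.sif python --version
# run a script through %runscript, binding the project directory first
apptainer run --nv -B /path/to/project:/workspace/project apptainer-base.sif /workspace/project/main.py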
Detailed example:
a .def file that builds from my customized Image ubuntu24.04-cuda-python3 on Docker Hub; the modification is installing torch into it.
# apptainer-detailed.def
# Header
# from Docker Hub
Bootstrap: docker
From: lijiaqiisai/ubuntu24.04-cuda-python3:cuda{{ CUDA_VERSION }}-python3.12.3
Stage: build
%arguments
CUDA_VERSION=12.6.0
%setup
# copy from Host path to Container path
%files
requirements.txt
%post
NOW=`date`
echo "export NOW=\"${NOW}\"" >> $APPTAINER_ENVIRONMENT
$VENV_DIR/bin/pip install -r requirements.txt
%test
grep -q NAME=\"Ubuntu\" /etc/os-release
if [ $? -eq 0 ]; then
echo "Container base is Ubuntu as expected."
else
echo "Container base is not Ubuntu."
exit 1
fi
%environment
export LISTEN_PORT=12345
export LC_ALL=C
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export PATH=${VENV_DIR}/bin:$PATH
%startscript
echo "Container was created $NOW"
echo "Ubuntu version: $OS_VERSION"
echo "CUDA version: $CUDA_VERSION"
echo "Virtual Env path: $VENV_DIR"
%runscript
echo "Container was created $NOW"
echo "Ubuntu version: $OS_VERSION"
echo "CUDA version: $CUDA_VERSION"
echo "Virtual Env path: $VENV_DIR"
echo "Arguments received: $*"
### Execute the following args with the default program "python"
### --> `apptainer run <Image>.sif main.py`
echo "Executing '$VENV_DIR/bin/python $*'"
exec $VENV_DIR/bin/python "$@"
%labels
Author lijiaqi
%help
This is a .def file to create a python environment based on `lijiaqiisai/ubuntu24.04-cuda-python3` docker image.
then build with
sudo apptainer build --nv --build-arg CUDA_VERSION=12.6.0 apptainer-detailed.sif apptainer-detailed.def
It will follow the requirements.txt to build the Image file.
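A quick way to confirm that the resulting image really contains a GPU-enabled torch (assuming torch is listed in requirements.txt) is to query it through apptainer exec:
apptainer exec --nv apptainer-detailed.sif python -c "import torch; print(torch.__version__, torch.cuda.is_available())"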
Build/Pack from an existing sandbox
directory
sudo apptainer build <Image_name>.sif <SANDBOX_FOLDER>
What is the sandbox
directory <SANDBOX_FOLDER>
in the above example? Let’s move to next step.
Create Apptainer Image sandbox directory (then *.sif
) with --sandbox
option
By using the option --sandbox
when using apptainer build --sandbox <SANDBOX_FOLDER> <Image_name>.sif/<URL>
, we can create a mutable sandbox directory that we can customize to build our own virtual environment:
WARNING
It’s possible to create a sandbox without root privileges, but to ensure proper file permissions it is recommended to do so as root.
# create a sandbox directory
## from an existing *.sif
sudo apptainer build --nv --sandbox <SANDBOX_FOLDER>/ <Local>.sif
## from a remote docker hub
sudo apptainer build --nv --sandbox <SANDBOX_FOLDER>/ docker://<Image>:[Tag]
### e.g.
sudo apptainer build --nv --sandbox <SANDBOX_FOLDER>/ docker://lijiaqiisai/ubuntu24.04-cuda-python3:latest
Enter the sandbox container:
sudo apptainer shell --writable <SANDBOX_FOLDER>/
WARNING
Please note that with
--writable
mode, the NVIDIA (--nv) files may not be bound.
Install software:
# (optional) Install Python
apt-get update && apt-get -y upgrade
apt-get -y install python3 python3-pip python3-virtualenv
# (optional) fix the issue from "nvidia/cuda" docker hub
# cd /etc/apt && cp trusted.gpg trusted.gpg.d && cd -
# (optional) Install torch==2.4.1
pip install torch==2.4.1 torchvision
Tip
Or, if you want to install Python packages from a requirements.txt, remember to bind its path into the Container:
sudo apptainer shell --writable -B <host_path>/requirements.txt:<container_path>/requirements.txt <SANDBOX_FOLDER>/
# then, inside the container shell:
pip install -r <container_path>/requirements.txt
After installing packages, you may want to pack the Sandbox Directory into a *.sif
Image file
# exit the container
exit
# package into .sif file
sudo apptainer build --nv <SANDBOX_ENV>.sif <SANDBOX_FOLDER>/
# (optional) clean the directories
rm -rf <SANDBOX_FOLDER>/
Discovery
I found that the Apptainer Image file
*.sif
seems smaller than the Docker Image when containing the same packages. I am not sure; this still needs to be verified.
Run a Container
After building a portable *.sif Image file, you can use it for production elsewhere. There are several ways to run an instance based on your Apptainer Image file:
Command: apptainer run
executes the default runscripts defined in the container (e.g., see %runscript
in apptainer.def
)
# `apptainer run`if `%runscript` has been defined for this `*.sif` Image file (e.g., in the `apptainer.def` file)
apptainer run --nv dl-env.sif
Command: apptainer exec
followed by a specified command or program to execute within the container
# `apptainer exec` followed by your program
apptainer exec --nv dl-env.sif <your_program>
## e.g.,
apptainer exec --nv dl-env.sif python main.py
# can also run a specific shell; in this case, equivalent to `apptainer shell`
apptainer exec --nv dl-env.sif /bin/bash
Tip
Here is a comparison between apptainer run and apptainer exec:

Feature | apptainer run | apptainer exec |
---|---|---|
Purpose | Executes the default runscript defined in the container | Executes a specified command or program within the container |
Runscript | Requires that the container has a defined runscript; will fail if none exists | Does not rely on a runscript; you specify the command to run |
Usage Scenario | Ideal for running pre-configured applications or workflows | Useful for debugging, testing, or running specific commands |
Command Syntax | apptainer run <container.sif> | apptainer exec <container.sif> <command> |
Additional Options | --nv for GPU support | --nv for GPU support, --bind/-B for binding host directories |
Example | apptainer run example.sif | apptainer exec example.sif python script.py |
Interactivity | Does not support an interactive shell unless the runscript allows it | Can launch an interactive shell (e.g., apptainer exec example.sif /bin/bash) |
Flexibility | Less flexible; limited to the runscript defined in the image | More flexible; allows execution of any command or script |
Error Handling | Fails if no runscript is defined | Fails if the specified command is not found or fails to execute |
Command: apptainer shell
runs the container in interactive mode.
Command: apptainer instance
apptainer instance [options] <command>
allows users to create, start, stop, and remove instances of containers, providing a way to run multiple instances of the same container image concurrently.
Some options:
-
start
: Start a new container instance (running in the background)
apptainer instance start <Image_name>.sif <instance_name>
-
stop
: Stop a running container instance.
# stop a specific instance
apptainer instance stop <instance_name>
-
list
: List all currently running instances.
apptainer instance list
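A running instance can also be addressed through the instance:// URI, which lets you execute commands against it or open a shell inside it, e.g.:
# start an instance, use it, then stop it
apptainer instance start --nv dl-env.sif dl
apptainer exec instance://dl python main.py
apptainer shell instance://dl
apptainer instance stop dl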
(Personal) Some notes for running Apptainer on Compute Canada
References:
This part provides some best practices for using Apptainer
on Compute Canada.
apptainer exec --nv dl-env.sif my_script.sh
When using the run, shell, instance, exec
commands on Compute Canada:
- Always use one of the -C, -c, or -e options:
  - -C: hides filesystems, PID, IPC, and environment;
  - -c: use a minimal /dev; shared-with-host directories (e.g., /tmp) will appear empty unless explicitly bind mounted;
  - -e: clean the environment before running the container;
- Always use the -W dir option, with dir being a path to a real directory that you have write access to.
  - In sbatch scripts, set -W $SLURM_TMPDIR
- When using NVIDIA GPUs, use --nv to expose the NVIDIA hardware to the container.
- When access to host directories is needed, bind mount the top-level directories of those filesystems, or the desired directories themselves.
  - useful bind mounts: -B /home -B /project -B /scratch
# a general example
apptainer exec -C --nv -W $SLURM_TMPDIR -B /home -B /project -B /scratch dl-env.sif my_program
# for commands with `srun`, e.g., an MPI program
srun apptainer run dl-env.sif /path/to/your/mpi-program
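Putting these recommendations together, a sketch of an sbatch script for a GPU job on Compute Canada could look like this (the account name, resources, and paths are placeholders you must adapt):
#!/bin/bash
#SBATCH --account=def-yourpi
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32G
#SBATCH --time=03:00:00
module load apptainer
apptainer exec -C --nv \
    -W $SLURM_TMPDIR \
    -B /home -B /project -B /scratch \
    dl-env.sif \
    python /path/to/your/project/main.py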
Summary
A quick overview and comparison between Docker and Apptainer
Here is a take-away summary of two platforms:
Feature | Docker | Apptainer (previously-known as Singularity) |
---|---|---|
Image | ✓ | ✘ |
Container | ✓ | ✓ (what we will run is a Container) |
Build Image by pull | docker pull <Image>:[Tag] | SIF Image: apptainer pull/build <Image>.sif docker://<Image>:[Tag] or Sandbox: apptainer build --sandbox <SANDBOX_DIR> docker://<Image>:[Tag] |
Build Image from an existing Image | N/A (needs several steps: docker run to create a Container from an Image –> make changes –> commit to a new Image with docker commit <Container_id> <New_Image>:<Tag>) | apptainer build <New>.sif <Old>.sif (meaningless, just like a copy) or apptainer build --sandbox <SANDBOX_DIR> <Old>.sif |
Build Image by a definition file | docker build -t/--tag [Img_Name][:[Tag]] . or docker build -t/--tag [Img_Name][:[Tag]] -f/--file ./Dockerfile | SIF Image: apptainer build *.sif apptainer.def or Sandbox: apptainer build --sandbox <SANDBOX_DIR> apptainer.def |
Start a New Container (instance) | docker run --name=<Container_name> <Image_name> | apptainer instance start *.sif <instance_name> |
Enter the Shell of a running Container | docker exec -it <container_id_or_name> /bin/bash | apptainer shell instance://<instance_name> |
Start a new Container as Interactive mode (combine the above two steps into one) | docker run -it --name=<container_name> <Image> | apptainer shell <Image>.sif |
Start an existing instance | docker start <instance_id_or_name> (should already exist) | N/A (since Apptainer instances will NOT be saved) |
Start an existing instance with Interactive mode (one-step) | docker start -it <instance_id_or_name> (should already exist) | N/A (since Apptainer instances will NOT be saved) |
Stop a running instance | docker stop <instance_id_or_name> | apptainer instance stop <instance> |
Pack into a new Image after modification | docker commit -m="xxx" -a="xx" <Container_id> <Image>:<Tag> | apptainer build --nv <New>.sif <SANDBOX_DIR> |
Check running Container instances | docker ps (or docker ps -a for all including stopped ones) | apptainer instance list |
Check existing Images | docker images | N/A (just need to check *.sif files) |
Run an application program | default CMD command with docker run --rm <Image> or speific program with docker run --rm <Image> <your_program> | default %runscript with apptainer run *.sif or specific program with apptainer exec *.sif <your_program> |
Mount/Bind a directory (useful for running programs) | -v <host_path>:<container_path> | --bind/-B host_path[:container_path[:options]] |
Use Nvidia GPUs in Container (useful for running programs) | --gpus=xxx | --nv |
set workspace folder (useful for running programs) | -w=<dir_on_container> | N/A (no such a concept) |
Core Commands
It is noteworthy that some command names overlap between Docker and Apptainer, but their functions can differ. Here is a summary of common commands within each platform:
Docker
- Build
  - docker pull: pull an existing Image from a remote registry (e.g., Docker Hub)
  - docker build: build an Image from a Dockerfile
- Run
  - docker run: create a Container from an Image. It executes the command determined by CMD and ENTRYPOINT in the Dockerfile;
    - docker run -it: interactive mode. Analogous to apptainer shell <Image>.sif
    - docker run -d: detached mode (background). Analogous to apptainer instance start <Image>.sif <container_name>
  - docker exec: executes a command in an existing running Container and will not create a new one: docker exec <container_id_or_name> <your_program>
  - docker start: start an existing Container (it may have been stopped in the background)
  - docker stop: stop a running Container (useful if it is not running in interactive mode and you cannot close it with exit)
Apptainer
- Build
  - apptainer pull: pull an existing Image from a remote registry, like docker://xxx (Docker Hub)
  - apptainer build: build (1) a .sif Image file or (2) a Sandbox directory (with --sandbox), from either (a) a remote registry (e.g., Docker Hub) or (b) an existing local *.sif file; case (1)+(a) aligns with the behaviour of apptainer pull [Local].sif docker://xxx
- Run
  - apptainer run: only for executing the default command defined in %runscript in apptainer.def
  - apptainer exec: launch a Container instance to execute a program
  - apptainer shell: launch a Container instance in interactive mode
  - apptainer instance: operations on Container instances
    - apptainer instance start: start a Container instance
    - apptainer instance stop: stop a Container instance
    - apptainer instance list: list all running Container instances