A container is basically an operating system within a file: by including all the operating system support files, software inside of it can run (almost) anywhere. This is great for things like clusters, where the operating system has to be managed very conservatively yet users have all sorts of bleeding-edge needs.
The downside is that it’s another thing to understand and manage. Luckily, most of the time containers for the software already exist, and using them is not much harder than other shell scripting.
What are containers?¶
As stated above, the basic idea is that software is packaged into a
container which contains essentially the entire operating system. This
is done via an image definition file (.def), which is itself
interesting because it contains a script that builds the whole image
automatically, making it reproducible and shareable. The image itself
is the data which contains the operating system and software.
During runtime, the root file system
/ is used from inside the
image and other file systems (
/home, etc.) can be
brought into the container through bind mounts. Effectively, the
programs in the container are run in an environment mostly defined by
the container image, but the programs can read and write specific
files on Triton, that is, all the data you need to operate on. For
example, the home directory typically comes from Triton.
This sounds complicated, but in practice it is not too hard once you
see an example and can copy the commands to run. For images managed
by the Triton admins themselves, this is easy thanks to the
singularity_wrapper tool we have written for Triton. You can also
run singularity on Triton without the wrapper, but you may need to
bind /scratch yourself to access your data.
The hardest part of using containers is keeping track of files inside
vs outside: You specify a command that gets run inside the container
image. It mostly accesses files inside the image, but it can access
files outside if you bind-mount them in. If you ever get confused, use
singularity shell (see below) to enter the container and see
what is going on.
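To make the inside/outside distinction concrete, here is a small sketch. It assumes an image file called my_image.sif (a hypothetical name) and that the image contains /etc/os-release, as most Linux images do. The commands are skipped when singularity or the image file is not available.

```shell
# Hypothetical image name; replace with your own .sif file.
IMAGE="${IMAGE:-my_image.sif}"

if command -v singularity >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    # Read from inside the image: this describes the container's
    # operating system, not the host's.
    singularity exec "$IMAGE" cat /etc/os-release

    # /scratch lives outside the image, so it is only visible
    # once it has been bind-mounted in.
    singularity exec --bind /scratch "$IMAGE" ls /scratch
else
    echo "singularity or $IMAGE not found, nothing was run"
fi
```

Comparing the first command's output with `cat /etc/os-release` on the host is a quick way to confirm which side of the boundary a file comes from.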
Docker is the most commonly talked about container runtime, but most clusters use Singularity. The following table should make the reasons clear:
Docker | Singularity
Designed for infrastructure deployment | Designed for scientific computing
Operating system service |
In practice, gives root access to whole system | Does not give or need extra permissions to the system
Images stored in layers in hidden operating system locations, opaquely managed through some commands | One image is one file
Docker is still a standard image format, and there are ways to convert images between the formats. In practice, if you can use Docker, you can also use Singularity by converting your image (commands on this page) and running it by copying other commands on this page.
Singularity with Triton’s pre-created modules¶
Some of the Triton modules automatically activate a Singularity image.
On Triton, you just need to load the proper module. This will set
some environment variables and enable the use of
singularity_wrapper (to see how it works, check
While the image itself is read-only, remember that
/l etc. are not. If you edit/remove files in
these locations within the image, that will happen outside the image as well.
singularity_wrapper is written so that when you load a module written
for a singularity image, all the important options are already handled
for you. It has three basic commands:
singularity_wrapper shell [SHELL] - Gives the user a shell within the image (specify [SHELL] to say which shell you want).
singularity_wrapper exec CMD - Executes a program within the image.
singularity_wrapper run PARAMETERS - Runs the singularity image. What this means depends on the image in question: each image defines a “run command” which does something. If you don’t know what this is, use the first two instead.
Under the hood,
singularity_wrapper does this:
Choosing appropriate image based on module version
Binding of basic paths (
Loading of system libraries within images (if needed) (e.g.
Setting working directory within image (if needed)
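The steps above can be pictured with a small sketch. This is not the real singularity_wrapper: the function name, the EXAMPLE_IMAGE variable, and the choice of bind paths are assumptions made for illustration. It only prints the kind of command such a wrapper might assemble, using the standard --bind and --pwd options.

```shell
# Illustrative only: NOT the actual singularity_wrapper.
# EXAMPLE_IMAGE is a hypothetical variable standing in for however
# the loaded module points at its image file.
EXAMPLE_IMAGE="${EXAMPLE_IMAGE:-/path/to/module_image.sif}"

sketch_wrapper_exec() {
    # Bind the data filesystems, keep the current working directory,
    # and pass the user's command through unchanged.
    echo singularity exec \
        --bind /m,/l,/scratch \
        --pwd "$PWD" \
        "$EXAMPLE_IMAGE" "$@"
}

# Print the command that would stand in for:
#   singularity_wrapper exec python --version
sketch_wrapper_exec python --version
```

Printing the command instead of running it keeps the sketch harmless on machines without Singularity.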
This section describes using Singularity directly, with you managing the image file and running it.
Convert a Docker image to a Singularity image¶
If you have a Docker image, it has to be on a registry somewhere
(since Docker images don’t exist as standalone files). You can pull it
and convert it to a .sif file in one step (remember to change to a
scratch folder with plenty of space first):
$ cd $WRKDIR
$ singularity build IMAGE_OUTPUT.sif docker://GROUP/IMAGE_NAME:VERSION
This will store the Docker layers in
which can result in running out of quota in your home folder.
In a situation like this, you can then clean the cache with:
singularity cache clean
You can also use another folder for your Singularity cache by setting
the SINGULARITY_CACHEDIR environment variable. For example, you can
set it to a subfolder of your $WRKDIR:
export SINGULARITY_CACHEDIR=$WRKDIR/singularity_cache
mkdir $SINGULARITY_CACHEDIR
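If you put this in a script that runs more than once, a slightly more defensive variant may be convenient. The sketch below is one way to do it: the fallback to a temporary directory when WRKDIR is unset is an assumption added only so the example runs outside Triton.

```shell
# WRKDIR is set on Triton; elsewhere fall back to a temporary
# directory (an assumption, so that the sketch runs anywhere).
WRKDIR="${WRKDIR:-$(mktemp -d)}"

export SINGULARITY_CACHEDIR="$WRKDIR/singularity_cache"

# -p: do not fail if the directory already exists
mkdir -p "$SINGULARITY_CACHEDIR"

# Show how much space the cache currently takes
du -sh "$SINGULARITY_CACHEDIR"
```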
Create your own image¶
See the Singularity docs on this.
You create a Singularity definition file
NAME.def, and then:
$ singularity build IMAGE_OUTPUT.sif NAME.def
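As a rough illustration of what such a definition file can look like, the sketch below writes a minimal one from the shell. The base image (ubuntu:20.04), the installed package, and the file name example.def are assumptions chosen for the example; the Singularity docs describe the full format.

```shell
# Write a minimal definition file. The contents are illustrative
# assumptions, not a recommended setup.
cat > example.def <<'EOF'
Bootstrap: docker
From: ubuntu:20.04

%post
    # Runs once at build time, inside the image
    export DEBIAN_FRONTEND=noninteractive
    apt-get update
    apt-get install -y python3

%runscript
    # Runs on "singularity run"
    exec python3 "$@"
EOF

# Then build it as shown above. Depending on how Singularity is
# installed, this may need root or the --fakeroot option:
#   singularity build example.sif example.def
```

Keeping the definition file under version control gives you a way to rebuild the image from scratch if it is ever lost.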
These are the “raw” singularity commands. If you use these, you have
to configure the images and bind mounts yourself (which is otherwise
done by singularity_wrapper). If you
NAME on a singularity module, you will get hints about what happens.
singularity shell IMAGE_FILE.sif will start a shell inside of the image. This is great for understanding what the image does.
singularity exec IMAGE_FILE.sif COMMAND will run COMMAND inside of the image. This is how you would script it for batch jobs, etc.
singularity run IMAGE_FILE.sif is a lot like exec, but will run some pre-configured command (defined as part of the image definition). This might be useful when using a pre-made image. If you make an image executable, you can do this by running the image directly:
The extra arguments
--bind=/m,/l,/scratch will make the important Triton data filesystems available inside of the container. $HOME is bound by default. You may want to add $PWD for your current working directory.
--nv provides GPU access (though sometimes more is needed).
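Binding a path that does not exist on the current machine can make Singularity fail or warn, so it can help to build the --bind list from the directories that are actually present. The helper below is a hypothetical sketch (existing_binds is not part of Singularity); it only prints the resulting command, with IMAGE_FILE.sif and COMMAND left as placeholders.

```shell
# Join the directories that exist on this machine into a
# comma-separated list suitable for --bind.
existing_binds() {
    list=""
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            # Add a comma only when the list is already non-empty
            list="${list:+$list,}$dir"
        fi
    done
    echo "$list"
}

BINDS=$(existing_binds /m /l /scratch "$PWD")

# IMAGE_FILE.sif and COMMAND are placeholders, as in the list above.
echo "singularity exec --bind $BINDS IMAGE_FILE.sif COMMAND"
```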
Batch script using singularity
#!/bin/bash
#SBATCH --mem=10G
#SBATCH --cpus-per-task=4

# We would run `python /path/to/software/in-image.py $WRKDIR/my-input-file`, so instead we run this inside the image.
srun singularity exec --bind /scratch YOUR_IMAGE.sif python /path/to/software/in-image.py $WRKDIR/my-input-file
Writable container image that can be updated
Sometimes, it is too much work to completely define an image before
building it: it is more convenient to incrementally update it, just
like your own computer. You can make a writable image directory using
singularity build --sandbox, and then when you run it you can make
permanent changes to it by running with --writable. You could, for
example, pull an Ubuntu image and then slowly install things in it.
But note these disadvantages:
The image isn’t reproducible: you don’t have the definition file to make it, so if it gets messed up you can’t go back. Being able to delete and reproduce is very useful.
There isn’t an efficient, single-file image: instead, there are tens of thousands of files in a directory. You get the problems of many small files. If you run this many times, use singularity build SINGLE_FILE.sif WRITEABLE_DIRECTORY_IMAGE/ to convert it to a single file.
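Put together, the sandbox workflow might look like the sketch below. The directory and file names are hypothetical, ubuntu:20.04 is used only as an example base, and the flag is spelled --writable in Singularity's command line. Installing system packages inside the sandbox may additionally need root or --fakeroot, depending on the installation. Nothing is run when singularity is not available.

```shell
# Hypothetical names; ubuntu:20.04 is only an example base image.
SANDBOX="my_sandbox"
FINAL="my_image.sif"

if command -v singularity >/dev/null 2>&1; then
    # 1. Create a writable, directory-based image
    singularity build --sandbox "$SANDBOX" docker://ubuntu:20.04

    # 2. Make a permanent change inside it
    singularity exec --writable "$SANDBOX" touch /opt/installed-something

    # 3. Freeze the result into a single read-only file
    singularity build "$FINAL" "$SANDBOX"
else
    echo "singularity not found, nothing was run"
fi
```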
MPI in singularity
The Serpent code is a hybrid MPI/OpenMP particle following code. It can be installed into a container using the definition file sss2.def, which creates a container based on Ubuntu 20.04. In the build process, Singularity clones the Serpent source code and installs the required compilers and libraries, including the MPI library, into the container. The data files needed by Serpent are also included in the container, and a Python environment with useful tools is installed as well. Finally, the Serpent code is compiled, the executable binaries are saved, and the source code is removed.
The container can be directly used with the Triton queue system
assuming the datafiles are stored in the user home folder. The file
can be used as an example. If scratch is used, please add
/scratch after “exec” in the file.
The key observations to make:
mpirun is called on Triton and launches multiple Singularity containers (one for each MPI task). Each container directly launches the sss2 executable. Each container can run multiple OpenMP threads of Serpent.
The openMPI library (v. 4.0.3) shipping with Ubuntu 20.04 seems to be compatible with the Triton module
The Ubuntu MPI library binds all the threads to the same CPU. This is avoided by passing the parameter --bind-to none to mpirun.
The infiniband is made available by the mpirun parameter
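A job script following these observations might look roughly like the sketch below, which writes the script to a file and checks its syntax. Several details are assumptions: the image name sss2.sif, the input file my_input, the resource requests, the bare module name openmpi, and Serpent's -omp option for the thread count. Any extra mpirun parameters your site needs are not included.

```shell
# Write a hypothetical job script; quoting 'EOF' keeps the
# variables unexpanded until the job actually runs.
cat > serpent_mpi_job.sh <<'EOF'
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=2
#SBATCH --mem-per-cpu=2G
#SBATCH --time=00:30:00

# Assumption: the exact module name and version are site-specific.
module load openmpi

# mpirun runs on the host and starts one container per MPI task;
# --bind-to none avoids pinning all threads to the same CPU.
mpirun --bind-to none \
    singularity exec sss2.sif \
    sss2 -omp "$SLURM_CPUS_PER_TASK" my_input
EOF

# Check for shell syntax errors before submitting with sbatch.
bash -n serpent_mpi_job.sh
```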
Singularity documentation: https://docs.sylabs.io/
Singularity docs on building a container: https://docs.sylabs.io/guides/latest/user-guide/build_a_container.html
Singularity documentation from Sigma2 (Norway): https://documentation.sigma2.no/software/containers.html