Summary and Setup
This lesson provides an introduction to using software containers with the goal of using them to effect reproducible computational environments. Such environments are useful for ensuring reproducible research outputs, for example.
This lesson will introduce both Docker and Singularity as tools for running containers. Singularity is particularly suited to running containers on infrastructure where users don’t have administrative privileges, for example shared infrastructure such as High Performance Computing (HPC) clusters.
Prerequisites
- You should have basic familiarity with using a command shell, and the lesson text will at times request that you “open a shell window”, with an assumption that you know what this means.
  - Under Linux or macOS it is assumed that you will access a bash shell (usually the default), using your Terminal application.
  - Under Windows, PowerShell and Git Bash should allow you to use the Unix instructions. We will also try to give command variants for Windows cmd.exe.
- The lessons will sometimes request that you use a text editor to create or edit files in particular directories. It is assumed that you either have an editor that you know how to use that runs within the working directory of your shell window (e.g. nano), or that if you use a graphical editor, that you can use it to read and write files into the working directory of your shell.
- You will need access to a local or remote platform with Singularity pre-installed and accessible to you as a user (i.e. no administrator/root access required).
  - If you are attending a taught version of this material, it is expected that the course organisers will provide access to a platform (e.g. an institutional HPC cluster) that you can use for these sections of the material.
  - You should be familiar with the basic commands for running jobs on the HPC platform you will be using.
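A quick way to confirm these prerequisites from a shell window is sketched below. It is only a sketch: the helper name report_tool is hypothetical, the list of tools is an assumption based on the prerequisites above, and on some HPC platforms Singularity may only become available after site-specific setup steps.

```shell
# report_tool is a hypothetical helper, not part of the lesson materials.
# It prints whether the named command can be found on your PATH.
report_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: NOT found"
  fi
}

# Tools mentioned in the prerequisites; adjust the list for your platform.
for tool in bash nano singularity; do
  report_tool "$tool"
done
```

A "NOT found" line for singularity on your own laptop is expected if you plan to use a remote platform; run the same check after logging in there.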
Target audience
This lesson on the use of containers is intended to be relevant to a wide range of researchers, as well as existing and prospective technical professionals. It is intended as a beginner level course that is suitable for people who have no experience of containers.
We are aiming to help people who want to develop their knowledge of container tooling to help improve reproducibility and support their research work, or that of individuals or teams they are working with.
Access to CREATE HPC
You will need an account on the CREATE HPC. If you do not already have one, please request one following the instructions on the e-Research docs pages.
Website accounts to create
Please seek help at the start of the lesson if you have not been able to establish a website account on:
- The Docker Hub. We will use the Docker Hub to download pre-built container images, and for you to upload and download container images that you create, as explained in the relevant lesson episodes.
Files to download
Download the container-intro.zip file. This file can alternatively be downloaded from the files directory in the singularity-introduction GitHub repository.
Move the downloaded file to your Desktop and unzip it. It should unzip to a folder called container-intro.
Software to install
Docker’s installation experience has steadily improved; however, situations will arise in which installing Docker on your computer may not be straightforward unless you have a large amount of technical experience. Workshops try to have helpers on hand who have worked their way through the install process, but do be prepared for some troubleshooting.
In most cases, you will need to have administrator rights on the computer in order to install the Docker software. If you are using a computer managed by your organisation and do not have administrator rights, you may be able to get your organisation’s IT staff to install Docker for you. Alternatively your IT support staff may be able to give you remote access to a server that can run Docker commands.
Please try to install the appropriate software from the list below depending on the operating system that your computer is running. Do let the workshop organisers know as early as possible if you are unable to install Docker using these instructions, as there may be other options available.
Verify Installation
To quickly check that the Docker client and server are both working, run the following command in a new terminal or ssh session:
docker version
OUTPUT
Client:
 Version:           20.10.2
 API version:       1.41
 Go version:        go1.13.8
 Git commit:        20.10.2-0ubuntu2
 Built:             Tue Mar 2 05:52:27 2021
 OS/Arch:           linux/arm64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.2
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.8
  Git commit:       20.10.2-0ubuntu2
  Built:            Tue Mar 2 05:45:16 2021
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.4.4-0ubuntu1
  GitCommit:
 runc:
  Version:          1.0.0~rc95-0ubuntu1~21.04.1
  GitCommit:
 docker-init:
  Version:          0.19.0
  GitCommit:
The above output shows a successful installation and will vary based on your system. The important part is that the “Client” and the “Server” parts are both working and return information. It is beyond the scope of this document to debug installation problems, but common errors include the user not belonging to the docker group and forgetting to start a new terminal or ssh session.
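The two common errors mentioned above can be told apart with a short shell check. This is a minimal sketch for Linux installations, not a full troubleshooting guide: the function names has_group and docker_diagnose are hypothetical, and the messages are only suggestions about where to look next.

```shell
# has_group is a hypothetical helper: it returns 0 if group name $1
# appears in the space-separated list of group names given as $2.
has_group() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# docker_diagnose is also hypothetical. It prints one line describing
# what it found and always returns 0, so it is safe to run anywhere.
docker_diagnose() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker client not found on PATH"
  elif docker version >/dev/null 2>&1; then
    echo "client and server both responded"
  elif has_group docker "$(id -nG)"; then
    echo "client found but server did not respond; is the Docker service running?"
  else
    echo "client found, but this session is not in the docker group; try a new terminal or ssh session"
  fi
  return 0
}

docker_diagnose
```

The group check uses id -nG, which lists the groups of the current session; a group added after you logged in will not appear until you start a new terminal or ssh session.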
Warning: Git Bash
If you are using Git Bash as your terminal on Windows, be aware that some of the commands in this lesson may fail. Git Bash automatically rewrites any paths you specify at the command line into Windows versions of those paths, and this will confuse the Docker container you are trying to use. For example, if you enter the command:
docker run alpine cat /etc/os-release
Git Bash will change the /etc/os-release path to C:\etc\os-release\ before passing the command to the Docker container and the container will report an error. If you want to use Git Bash then you can request that this path translation does not take place by adding an extra / to the start of the path, i.e. the command would become:
docker run alpine cat //etc/os-release
This should suppress the path translation functionality in Git Bash.
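If you find yourself adding the extra / often, the rule can be wrapped in a small shell function. The sketch below is illustrative only: gitbash_safe_path is a hypothetical name, not a feature of Git Bash or Docker, and it simply applies the extra leading / described above.

```shell
# gitbash_safe_path is a hypothetical helper, not a Docker or Git Bash feature.
# It adds the extra leading "/" described above to an absolute Unix-style path
# and leaves other arguments (and paths already starting with "//") unchanged.
gitbash_safe_path() {
  case "$1" in
    //*) printf '%s\n' "$1" ;;
    /*)  printf '/%s\n' "$1" ;;
    *)   printf '%s\n' "$1" ;;
  esac
}

gitbash_safe_path /etc/os-release   # prints //etc/os-release

# Possible use with the example command from the text (requires Docker):
#   docker run alpine cat "$(gitbash_safe_path /etc/os-release)"
```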
A quick tutorial on copy/pasting file contents from episodes of the lesson
Let’s say you want to copy text off the lesson website and paste it into a file named myfile in the current working directory of a shell window. This can be achieved in many ways, depending on your computer’s operating system, but here are some routes I have found to work:
- macOS and Linux: you are likely to have the nano editor installed, which provides you with a very straightforward way to create such a file. Just run nano myfile, then paste text into the shell window, and press control+x to exit: you will be prompted whether you want to save changes to the file, and you can type y to say “yes”.
- Microsoft Windows running cmd.exe shells:
  - del myfile to remove myfile if it already existed;
  - copy con myfile to mean what’s typed in your shell window is copied into myfile;
  - paste the text you want within myfile into the shell window;
  - type control+z and then press enter to finish copying content into myfile and return to your shell;
  - you can run the command type myfile to check the content of that file, as a double-check.
- Microsoft Windows running PowerShell:
  - The cmd.exe method probably works, but another is to paste your file contents into a so-called “here-string” between @' and '@ as in this example that follows (the “>” is the prompt indicator):
> @'
Some hypothetical
file content that is
split over many
lines.
'@ | Set-Content myfile -encoding ascii
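On macOS and Linux, a here-document offers a close analogue of the PowerShell here-string, without opening an editor. The sketch below assumes a bash-compatible shell; the file content is the same hypothetical text, and the EOF marker is an arbitrary choice.

```shell
# Write everything between the two EOF markers into myfile.
# Quoting the first marker ('EOF') stops the shell from expanding
# $variables or backticks inside the pasted text.
cat > myfile <<'EOF'
Some hypothetical
file content that is
split over many
lines.
EOF

# Double-check the result, as with "type myfile" on Windows.
cat myfile
```

Note that this overwrites myfile if it already exists, and that the closing EOF must sit at the very start of its own line.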