This lesson is being piloted (Beta version)

Introduction to Docker

Introduction

Overview

Teaching: 5 min
Exercises: 5 min
Questions
  • What are containers?

Objectives
  • Understand the basics of images and containers.

Documentation

The official Docker documentation and tutorial can be found on the Docker website. It is thorough and worth revisiting routinely, but the emphasis of this introduction is on using Docker, not on how Docker itself works.

A note up front: Docker's syntax is very similar to that of the Git and Linux command line tools, so if you are familiar with them then most of Docker should feel somewhat natural (though you should still read the docs!).

Docker logo

Docker Images and Containers

It is still important to know what Docker is and what its components are. Docker images are executables that bundle together all necessary components for an application or an environment. Docker containers are the runtime instances of images — they are images with a state and run as native Linux processes.

Importantly, containers share the host machine’s OS kernel and so don’t require an OS per application. As discrete processes, containers take up only as much memory as necessary, making them very lightweight and fast to spin up.

It is also worth noting that as images are executables that produce containers, the same image can create multiple container instances that are running simultaneously as different processes. If you think about other executables that can be run in multiple processes on your machine this is perhaps not too surprising.
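For example, once an image is available locally (images are pulled in the next episode; the container names here are arbitrary), you can see the same image backing multiple simultaneously running containers:

```shell
# Start two detached containers from the same image
docker run --rm -d --name example-1 debian:bullseye sleep 300
docker run --rm -d --name example-2 debian:bullseye sleep 300

# Both appear as separate running processes of the same image
docker ps --filter=name=example

# Stop (and, thanks to --rm, remove) both containers
docker stop example-1 example-2
```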

Docker structure

Key Points

  • Images are stacks of compressed file system layers that act as templates for containers.

  • Containers are runtime instantiations of images — images with state that run as native processes.


Pulling Images

Overview

Teaching: 10 min
Exercises: 5 min
Questions
  • How are images downloaded?

  • How are images distinguished?

Objectives
  • Pull images from Docker Hub image registry

  • List local images

  • Introduce image tags

Docker Hub

Much like GitHub allows for web hosting and searching for code, the Docker Hub image registry allows the same for Docker images. Hosting and building of images is free for public repositories and allows for downloading images as they are needed. Additionally, through integrations with GitHub and Bitbucket, Docker Hub repositories can be linked against Git repositories so that automated builds of Dockerfiles on Docker Hub will be triggered by pushes to repositories.
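You can also query Docker Hub directly from the command line. As a quick sketch (the exact results will vary over time):

```shell
# Search Docker Hub for images matching "python", limiting the results
docker search --limit 5 python
```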

Pulling Images

To begin with we’re going to pull down the Docker image we’re going to be working in for the tutorial

docker pull matthewfeickert/intro-to-docker

and then list the images that we have available to us locally

docker images

If you have many images and want to get information on a particular one you can apply a filter, such as the repository name

docker images matthewfeickert/intro-to-docker
REPOSITORY                        TAG          IMAGE ID        CREATED        SIZE
matthewfeickert/intro-to-docker   latest       9602bb3f01a4    11 hours ago   1.54GB

or more explicitly

docker images --filter=reference="matthewfeickert/intro-to-docker"
REPOSITORY                        TAG          IMAGE ID        CREATED        SIZE
matthewfeickert/intro-to-docker   latest       9602bb3f01a4    11 hours ago   1.54GB

You can see here that there is the TAG field associated with the matthewfeickert/intro-to-docker image. Tags are a way of further specifying different versions of the same image. As an example, let’s pull the bullseye release tag of the Debian image.

docker pull debian:bullseye
docker images debian
bullseye: Pulling from library/debian
<some numbers>: Pull complete
Digest: sha256:<the relevant SHA hash>
Status: Downloaded newer image for debian:bullseye
docker.io/library/debian:bullseye

REPOSITORY   TAG        IMAGE ID       CREATED        SIZE
debian       bullseye   f776cfb21b5e   5 days ago     124MB

Additionally, there might be times where the same image has different tags. For example, we can pull the bootcamp-2021 tag of the matthewfeickert/intro-to-docker image, but when we inspect it we see that it is the same image as the one we already pulled.

docker pull matthewfeickert/intro-to-docker:bootcamp-2021
docker images matthewfeickert/intro-to-docker
REPOSITORY                        TAG             IMAGE ID       CREATED        SIZE
matthewfeickert/intro-to-docker   bootcamp-2021   64708e04f3a9   11 hours ago   1.57GB
matthewfeickert/intro-to-docker   latest          64708e04f3a9   11 hours ago   1.57GB

Pulling Python

Pull the image for Python 3.9 and then list all python images along with the matthewfeickert/intro-to-docker image

Solution

docker pull python:3.9
docker images --filter=reference="matthewfeickert/intro-to-docker" --filter=reference="python"
REPOSITORY                        TAG                 IMAGE ID       CREATED         SIZE
matthewfeickert/intro-to-docker   bootcamp-2021       64708e04f3a9   11 hours ago    1.57GB
matthewfeickert/intro-to-docker   latest              64708e04f3a9   11 hours ago    1.57GB
python                            3.9                 e2d7fd224b9c   4 days ago      912MB

Key Points

  • Pull images with docker pull

  • List images with docker images

  • Image tags distinguish releases or versions and are appended to the image name with a colon


Running Containers

Overview

Teaching: 15 min
Exercises: 5 min
Questions
  • How are containers run?

  • How do you monitor containers?

  • How are containers exited?

  • How are containers restarted?

Objectives
  • Run containers

  • Understand container state

  • Stop and restart containers

To use a Docker image as a particular instance on a host machine you run it as a container. You can run in either a detached or foreground (interactive) mode.

Run the image we pulled as an interactive container

docker run -ti matthewfeickert/intro-to-docker:latest /bin/bash

You are now inside the container in an interactive bash session. Check the file directory

pwd
/home/docker/data

and check the hostname to see that you are not on your local host system

hostname
<generated hostname>

Further, check /etc/os-release to see that you are actually inside a release of Debian (given the Docker Library’s Python image Dockerfile choices)

cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Monitoring Containers

Open up a new terminal tab on the host machine and list the containers that are currently running

docker ps
CONTAINER ID        IMAGE         COMMAND             CREATED             STATUS              PORTS               NAMES
<generated id>      <image:tag>   "/bin/bash"         n minutes ago       Up n minutes                            <generated name>

Notice that your container has been given some randomly generated name. To make the name more helpful, rename the running container

docker rename <CONTAINER ID> my-example

and then verify it has been renamed

docker ps
CONTAINER ID        IMAGE         COMMAND             CREATED             STATUS              PORTS               NAMES
<generated id>      <image:tag>   "/bin/bash"         n minutes ago       Up n minutes                            my-example

Renaming by name

You can also identify containers to rename by their current name

docker rename <NAME> my-example

Exiting and restarting containers

As a test, create a file in your container by printing the current datetime into a file

date > test.txt

In the container exit at the command line

exit

You are returned to your shell. If you list the containers you will notice that none are running

docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

but you can see all containers that have been run and not removed with

docker ps -a
CONTAINER ID        IMAGE         COMMAND             CREATED            STATUS                     PORTS               NAMES
<generated id>      <image:tag>   "/bin/bash"         n minutes ago      Exited (0) t seconds ago                       my-example

To restart your exited Docker container start it again and then attach it to your shell

docker start <CONTAINER ID>
docker attach <CONTAINER ID>

Starting and attaching by name

You can also start and attach containers by their name

docker start <NAME>
docker attach <NAME>

Notice that your working directory is still /home/docker/data, and then check that your test.txt still exists with the datetime you printed into it

ls *.txt
test.txt
cat test.txt
Sun Oct 17 16:51:51 UTC 2021

So this shows us that we can exit Docker containers for arbitrary lengths of time and then return to our working environment inside of them as desired.
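A related sketch: instead of typing exit inside the container, you can also stop a running container from a host terminal with docker stop (using the my-example name from above):

```shell
# From a host terminal: stop the running container by name
docker stop my-example

# Later: start it again and reattach your shell
docker start my-example
docker attach my-example
```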

Clean up a container

If you want a container to be cleaned up — that is deleted — after you exit it then run with the --rm option flag

docker run --rm -ti <IMAGE> /bin/bash

Key Points

  • Run containers with docker run

  • Monitor containers with docker ps

  • Exit interactive sessions just as you would a shell

  • Restart stopped containers with docker start


File I/O with Containers

Overview

Teaching: 15 min
Exercises: 5 min
Questions
  • How do containers interact with my local file system?

Objectives
  • Better understand I/O with containers

Copying

Copying files between the local host and Docker containers is possible. On your local host find a file that you want to transfer to the container and then

touch io_example.txt
# If on Mac may need to do: chmod a+x io_example.txt
echo "This was written on local host" > io_example.txt
docker cp io_example.txt <CONTAINER ID>:/home/docker/data/

and then from the container check and modify it in some way

pwd
ls io_example.txt
cat io_example.txt
echo "This was written inside Docker" >> io_example.txt
/home/docker/data
io_example.txt
This was written on local host

and then on the local host copy the file out of the container

docker cp <CONTAINER ID>:/home/docker/data/io_example.txt .

and verify if you want that the file has been modified as you wanted

cat io_example.txt
This was written on local host
This was written inside Docker

Volume mounting

What is more common and arguably more useful is to mount volumes to containers with the -v flag. This allows for direct access to the host file system inside of the container and for container processes to write directly to the host file system.

docker run -v <path on host>:<path in container> <image>

For example, to mount your current working directory on your local machine to the data directory in the example container

docker run --rm -ti -v $PWD:/home/docker/data matthewfeickert/intro-to-docker:latest

From inside the container you can ls to see the contents of your directory on your local machine

ls

and yet you are still inside the container

pwd
/home/docker/data

You can also see that any files created in this path in the container persist upon exit

touch created_inside.txt
exit
ls *.txt
created_inside.txt

This I/O allows for Docker images to be used for specific tasks that may be difficult to do with only the tools or software installed on the local host machine. For example, debugging problems that arise with cross-platform software, or just having a specific version of software perform a task (e.g., using Python 2 when you don’t want it on your machine, or using a specific release of TeX Live when you aren’t ready to update your system release).
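As a sketch of this pattern, you can mount a directory and run a single non-interactive command in the container, which exits and cleans itself up when the command finishes:

```shell
# Run a one-off command in the container against the mounted
# working directory, with no interactive session
docker run --rm -v $PWD:/home/docker/data matthewfeickert/intro-to-docker:latest \
    python --version
```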

Flag choices

What will be the result of running the following command?

docker run --rm -v $PWD:/home/docker/data matthewfeickert/intro-to-docker:latest

Solution

Outwardly it would appear that there is no effect! You are returned to your starting terminal. However, something did happen. Look again at the flags: --rm -v … but no -ti for an interactive session. So the container got spun up by docker run, wasn’t given any command and so executed its default /bin/bash, wasn’t attached to an interactive session and so finished, and then cleaned itself up with --rm.

Running Jupyter from a Docker Container

You can run a Jupyter server from inside of your Docker container. First run a container (with Jupyter installed) while exposing the container’s internal port 8888 with the -p flag

docker run --rm -ti -p 8888:8888 matthewfeickert/intro-to-docker:latest /bin/bash

Then start a Jupyter server with the server listening on all IPs

jupyter lab --allow-root --no-browser --ip 0.0.0.0

though for your convenience the example container has been configured with these default settings, so you can just run

jupyter lab

Finally, copy and paste the following with the generated token from the server as <token> into your web browser on your local host machine

http://localhost:8888/?token=<token>

You now have access to Jupyter running on your Docker container.
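The -p flag maps <host port>:<container port>, so if port 8888 is already in use on your host you can pick a different host port (8889 here is an arbitrary choice):

```shell
# Forward host port 8889 to the container's internal port 8888
docker run --rm -ti -p 8889:8888 matthewfeickert/intro-to-docker:latest
# Jupyter is then reachable at http://localhost:8889/?token=<token>
```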

Key Points

  • Learn how docker cp works

  • Learn about volume mounts

  • Show port forwarding of applications


Writing Dockerfiles and Building Images

Overview

Teaching: 20 min
Exercises: 10 min
Questions
  • How are Dockerfiles written?

  • How are Docker images built?

Objectives
  • Write simple Dockerfiles

  • Build a Docker image from a Dockerfile

Docker images are built through the Docker engine by reading the instructions from a Dockerfile. These text-based documents provide the instructions, through an API similar to Linux operating system commands, to execute commands during the build. The Dockerfile for the example image used in this lesson provides some simple extensions of the official Python 3.9 Docker image based on the Debian Bullseye OS (python:3.9-bullseye).

As a very simple example of extending the example image into a new image, create a Dockerfile on your local machine

touch Dockerfile

and then write in it the Docker engine instructions to add cowsay and scikit-learn to the environment

# Dockerfile
FROM matthewfeickert/intro-to-docker:latest

USER root

RUN apt-get -qq -y update && \
    apt-get -qq -y upgrade && \
    apt-get -qq -y install cowsay && \
    apt-get -y autoclean && \
    apt-get -y autoremove && \
    rm -rf /var/lib/apt/lists/* && \
    ln -s /usr/games/cowsay /usr/bin/cowsay

RUN python -m pip install --no-cache-dir --quiet scikit-learn

USER docker

Dockerfile layers

Each RUN command in a Dockerfile creates a new layer in the Docker image. In general, each layer should try to do one job, and the fewer layers in an image the easier it is to compress. When trying to upload and download images on demand, the smaller the size the better.
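As an illustrative sketch, compare splitting related commands across RUN instructions with combining them: files removed in a later layer still occupy space in the earlier layers, so cleanup should happen in the same RUN that created the files.

```dockerfile
# Three layers: the apt cache removed in the third RUN still
# occupies space inside the first layer
RUN apt-get -qq -y update
RUN apt-get -qq -y install cowsay
RUN rm -rf /var/lib/apt/lists/*

# One layer: install and cleanup happen together
RUN apt-get -qq -y update && \
    apt-get -qq -y install cowsay && \
    rm -rf /var/lib/apt/lists/*
```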

Don’t run as root

By default Docker containers run as root. This is a bad idea and a security concern. Instead, set up a default user (like docker in the example) and, if needed, give the user greater privileges.

Then build an image from the Dockerfile and tag it with a human readable name

docker build . -f Dockerfile -t extend-example:latest

You can now run the image as a container and verify for yourself that your additions exist

docker run --rm -ti extend-example:latest /bin/bash
which cowsay
cowsay "Hello from Docker"
pip list | grep scikit
python -c 'import sklearn as sk; print(sk)'
/usr/bin/cowsay
 ___________________
< Hello from Docker >
 -------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

scikit-learn        1.0
<module 'sklearn' from '/usr/local/lib/python3.9/site-packages/sklearn/__init__.py'>

Build context matters!

In the docker build command the first argument we passed, ., was the “build context”. The build context is the set of files that docker build is aware of when it starts the build. The . meant that the build context was set to the contents of the current working directory. Importantly, the entire build context is sent to the Docker daemon at build time, which means that any and all files in the build context will be copied and sent. Given this, you want to make sure that your build context doesn’t contain any unnecessarily large files, or use a .dockerignore file to hide them.
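A hypothetical .dockerignore (placed next to the Dockerfile; the patterns here are just examples) that keeps large or irrelevant files out of the build context might look like:

```
# .dockerignore
.git
__pycache__/
data/
*.h5
```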

Beyond the basics

ARGs and ENVs

Even though the Dockerfile is a set of specific build instructions to the Docker engine, it can still be scripted to give greater flexibility by using the Dockerfile ARG instruction and the --build-arg flag to define build-time variables to be evaluated. For example, consider an example Dockerfile with a configurable FROM image

# Dockerfile.arg-py3
# Make the base image configurable
ARG BASE_IMAGE=python:3.9-bullseye
FROM ${BASE_IMAGE}

USER root

RUN apt-get -qq -y update && \
    apt-get -qq -y upgrade && \
    apt-get -y autoclean && \
    apt-get -y autoremove && \
    rm -rf /var/lib/apt/lists/*

RUN python -m pip --no-cache-dir install --upgrade pip setuptools wheel && \
    python -m pip --no-cache-dir install --quiet scikit-learn

# Create user "docker"
RUN useradd -m docker && \
    cp /root/.bashrc /home/docker/ && \
    mkdir /home/docker/data && \
    chown -R --from=root docker /home/docker

WORKDIR /home/docker/data
USER docker

and then build it using the python:3.8 image as the FROM image

docker build -f Dockerfile.arg-py3 --build-arg BASE_IMAGE=python:3.8 -t arg-example:py-38 .

which you can check has Python 3.8 and not Python 3.9

docker run --rm -ti arg-example:py-38 /bin/bash
which python
python --version
/usr/local/bin/python3

Python 3.8.11

Default ARG values

Setting the value of the ARG inside of the Dockerfile allows for default values to be used if no --build-arg is passed to docker build.
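For example, building the same Dockerfile without any --build-arg falls back to the default python:3.9-bullseye base image (the py-39 tag here is an arbitrary choice):

```shell
# No --build-arg given, so BASE_IMAGE defaults to python:3.9-bullseye
docker build -f Dockerfile.arg-py3 -t arg-example:py-39 .
```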

ENV variables are similar to ARG variables, except that they persist past the build stage and are still accessible in the container runtime. Think of using ENV in a similar manner to how you would use export in Bash.

# Dockerfile.arg-py3
# Make the base image configurable
ARG BASE_IMAGE=python:3.9-bullseye
FROM ${BASE_IMAGE}

USER root

RUN apt-get -qq -y update && \
    apt-get -qq -y upgrade && \
    apt-get -y autoclean && \
    apt-get -y autoremove && \
    rm -rf /var/lib/apt/lists/*

RUN python -m pip --no-cache-dir install --upgrade pip setuptools wheel && \
    python -m pip --no-cache-dir install --quiet scikit-learn

# Create user "docker"
RUN useradd -m docker && \
    cp /root/.bashrc /home/docker/ && \
    mkdir /home/docker/data && \
    chown -R --from=root docker /home/docker

# Use C.UTF-8 locale to avoid issues with ASCII encoding
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8

ENV HOME /home/docker
WORKDIR ${HOME}/data
USER docker

Then build the image and run a container from it

docker build -f Dockerfile.arg-py3 --build-arg BASE_IMAGE=python:3.8 -t arg-example:latest .
docker run --rm -ti arg-example:latest /bin/bash
echo $HOME
echo $LC_ALL
/home/docker

C.UTF-8

whereas for

docker run --rm -ti arg-example:py-38 /bin/bash
echo $HOME
echo $LC_ALL
/home/docker


Tags

In the examples so far the built image has been tagged with a single tag (e.g. latest). However, tags are simply arbitrary labels meant to help identify images and images can have multiple tags. New tags can be specified in the docker build command by giving the -t flag multiple times or they can be specified after an image is built by using docker tag.

docker tag <SOURCE_IMAGE[:TAG]> <TARGET_IMAGE[:TAG]>
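For example, to apply two tags in a single build (the v1 tag here is an arbitrary choice):

```shell
# Give the image multiple tags at build time
docker build -f Dockerfile -t extend-example:latest -t extend-example:v1 .
```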

Add your own tag

Using docker tag add a new tag to the image you built.

Solution

docker images extend-example
docker tag extend-example:latest extend-example:my-tag
docker images extend-example
REPOSITORY          TAG                 IMAGE ID            CREATED            SIZE
extend-example      latest              b571a34f63b9        t seconds ago      1.59GB

REPOSITORY          TAG                 IMAGE ID            CREATED            SIZE
extend-example      latest              b571a34f63b9        t seconds ago      1.59GB
extend-example      my-tag              b571a34f63b9        t seconds ago      1.59GB

Tags are labels

Note how the image ID didn’t change for the two tags: they are the same object. Tags are simply convenient human readable labels.

COPY

Docker also gives you the ability to copy external files into a Docker image during the build with the COPY Dockerfile instruction, which copies a target file from the host file system into the Docker image file system

COPY <path on host> <path in Docker image>

For example, if there is a file called requirements.txt in the same directory as the build is executed from

touch requirements.txt

with contents

cat requirements.txt
scikit-learn==1.0

then this could be copied into the Docker image of the previous example during the build and then used (and then removed as it is no longer needed) with the following

# Dockerfile.arg-py3
# Make the base image configurable
ARG BASE_IMAGE=python:3.9-bullseye
FROM ${BASE_IMAGE}

USER root

RUN apt-get -qq -y update && \
    apt-get -qq -y upgrade && \
    apt-get -y autoclean && \
    apt-get -y autoremove && \
    rm -rf /var/lib/apt/lists/*

COPY requirements.txt requirements.txt

RUN python -m pip --no-cache-dir install --upgrade pip setuptools wheel && \
    python -m pip --no-cache-dir install --requirement requirements.txt && \
    rm requirements.txt

# Create user "docker"
RUN useradd -m docker && \
    cp /root/.bashrc /home/docker/ && \
    mkdir /home/docker/data && \
    chown -R --from=root docker /home/docker

# Use C.UTF-8 locale to avoid issues with ASCII encoding
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8

ENV HOME /home/docker
WORKDIR ${HOME}/data
USER docker

Then build the image

docker build -f Dockerfile.arg-py3 --build-arg BASE_IMAGE=python:3.8 -t arg-example:latest .

For very complex scripts, or files that come from some remote source, COPY offers a straightforward way to bring them into the Docker build.

Using COPY

Write a Bash script that installs the tree utility and then use COPY to add it to your Dockerfile and then run it during the build

Solution

touch install_tree.sh
#install_tree.sh
#!/usr/bin/env bash

set -e

apt-get update -y
apt-get install -y tree
# Dockerfile.arg-py3
# Make the base image configurable
ARG BASE_IMAGE=python:3.9-bullseye
FROM ${BASE_IMAGE}

USER root

RUN apt-get -qq -y update && \
    apt-get -qq -y upgrade && \
    apt-get -y autoclean && \
    apt-get -y autoremove && \
    rm -rf /var/lib/apt/lists/*

COPY requirements.txt requirements.txt

RUN python -m pip --no-cache-dir install --upgrade pip setuptools wheel && \
    python -m pip --no-cache-dir install --requirement requirements.txt && \
    rm requirements.txt

COPY install_tree.sh install_tree.sh
RUN bash install_tree.sh && \
    rm install_tree.sh

# Create user "docker"
RUN useradd -m docker && \
    cp /root/.bashrc /home/docker/ && \
    mkdir /home/docker/data && \
    chown -R --from=root docker /home/docker

# Use C.UTF-8 locale to avoid issues with ASCII encoding
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8

ENV HOME /home/docker
WORKDIR ${HOME}/data
USER docker

Then build the image and run a container from it

docker build -f Dockerfile.arg-py3 --build-arg BASE_IMAGE=python:3.8 -t arg-example:latest .
docker run --rm -ti arg-example:latest /bin/bash
docker run --rm -ti arg-example:latest /bin/bash
cd
which tree
tree
/usr/bin/tree

.
└── data

1 directory, 0 files

Key Points

  • Dockerfiles are written as text file commands to the Docker engine

  • Docker images are built with docker build

  • Build-time variables can be defined with ARG and set with --build-arg

  • ENV arguments persist into the container runtime

  • Docker images can have multiple tags associated to them

  • Docker images can use COPY to copy files into them during build


Removal of Containers and Images

Overview

Teaching: 5 min
Exercises: 5 min
Questions
  • How do you cleanup old containers?

  • How do you delete images?

Objectives
  • Learn how to cleanup after Docker

You can clean up/remove a container with docker rm

docker rm <CONTAINER ID>

Remove old containers

Start an instance of the tutorial container, exit it, and then remove it with docker rm

Solution

docker run matthewfeickert/intro-to-docker:latest
docker ps -a
docker rm <CONTAINER ID>
docker ps -a
CONTAINER ID        IMAGE         COMMAND             CREATED            STATUS                     PORTS               NAMES
<generated id>      <image:tag>   "/bin/bash"         n seconds ago      Exited (0) t seconds ago                       <name>

<generated id>

CONTAINER ID        IMAGE         COMMAND             CREATED            STATUS                     PORTS               NAMES

You can remove an image from your computer entirely with docker rmi

docker rmi <IMAGE ID>

Remove an image

Pull down the Python 2.7 image from Docker Hub and then delete it.

Solution

docker pull python:2.7
docker images python
docker rmi <IMAGE ID>
docker images python
2.7: Pulling from library/python
<some numbers>: Pull complete
<some numbers>: Pull complete
<some numbers>: Pull complete
<some numbers>: Pull complete
<some numbers>: Pull complete
<some numbers>: Pull complete
<some numbers>: Pull complete
<some numbers>: Pull complete
Digest: sha256:<the relevant SHA hash>
Status: Downloaded newer image for python:2.7
docker.io/library/python:2.7

REPOSITORY   TAG                 IMAGE ID       CREATED         SIZE
python       3.8                 79372a158581   4 days ago      909MB
python       3.9                 e2d7fd224b9c   4 days ago      912MB
python       3.9-bullseye        e2d7fd224b9c   4 days ago      912MB
python       2.7                 68e7be49c28c   18 months ago   902MB

Untagged: python@sha256:<the relevant SHA hash>
Deleted: sha256:<layer SHA hash>
Deleted: sha256:<layer SHA hash>
Deleted: sha256:<layer SHA hash>
Deleted: sha256:<layer SHA hash>
Deleted: sha256:<layer SHA hash>
Deleted: sha256:<layer SHA hash>
Deleted: sha256:<layer SHA hash>
Deleted: sha256:<layer SHA hash>
Deleted: sha256:<layer SHA hash>
Deleted: sha256:<layer SHA hash>

REPOSITORY   TAG                 IMAGE ID       CREATED         SIZE
python       3.8                 79372a158581   4 days ago      909MB
python       3.9                 e2d7fd224b9c   4 days ago      912MB
python       3.9-bullseye        e2d7fd224b9c   4 days ago      912MB

Helpful cleanup commands

What is helpful is to have Docker detect and remove unwanted images and containers for you. This can be done with prune, which depending on the context will remove different things.

  • docker container prune removes all stopped containers, which is helpful to clean up forgotten stopped containers.
  • docker image prune removes all unused or dangling images (images that do not have a tag). This is helpful for cleaning up after builds.
  • docker system prune removes all stopped containers, dangling images, and dangling build caches. This is very helpful for cleaning up everything all at once.
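Each prune command asks for confirmation before removing anything; pass --force to skip the prompt (handy in scripts):

```shell
# Remove all stopped containers without a confirmation prompt
docker container prune --force

# Remove dangling images without a confirmation prompt
docker image prune --force
```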

Worth cleaning up to save disk

Docker images are just lots of compressed archives, and they can take up lots of disk space pretty fast. You can monitor the total disk space being used by Docker with docker system df

docker system df
TYPE                TOTAL               ACTIVE              SIZE                RECLAIMABLE
Images              n                   0                   X.YGB               X.YGB (100%)
Containers          0                   0                   0B                  0B
Local Volumes       m                   0                   A.BkB               A.BkB (100%)
Build Cache         0                   0                   0B                  0B

and for more detailed information you can use the -v verbose flag

docker system df -v

Key Points

  • Remove containers with docker rm

  • Remove images with docker rmi

  • Perform faster cleanup with docker container prune, docker image prune, and docker system prune


Coffee break

Overview

Teaching: 0 min
Exercises: 15 min
Questions
  • Coffee or tea?

Objectives
  • Refresh your mental faculties with coffee and conversation

coffee nyan-whale

Key Points

  • Breaks are helpful in the service of learning


Using CMD and ENTRYPOINT in Dockerfiles

Overview

Teaching: 20 min
Exercises: 10 min
Questions
  • How are default commands set in Dockerfiles?

Objectives
  • Learn how and when to use CMD

  • Learn how and when to use ENTRYPOINT

So far, every time we’ve run a Docker container we’ve typed

docker run --rm -ti <IMAGE>:<TAG> <command>

like

docker run --rm -ti python:3.9 /bin/bash

Running this dumps us into a Bash session

printenv | grep SHELL
SHELL=/bin/bash

However, if no /bin/bash is given then you are placed inside the Python 3.9 REPL.

docker run --rm -ti python:3.9
Python 3.9.7 (default, Oct 13 2021, 09:00:49)
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

These are very different behaviors, so let’s understand what is happening.

The Python 3.9 Docker image has a default command that runs when the container is executed, which is specified in the Dockerfile with CMD.

# Dockerfile.defaults
# Make the base image configurable
ARG BASE_IMAGE=python:3.9-bullseye
FROM ${BASE_IMAGE}

USER root

RUN apt-get -qq -y update && \
    apt-get -qq -y upgrade && \
    apt-get -y autoclean && \
    apt-get -y autoremove && \
    rm -rf /var/lib/apt/lists/*

# Create user "docker"
RUN useradd -m docker && \
    cp /root/.bashrc /home/docker/ && \
    mkdir /home/docker/data && \
    chown -R --from=root docker /home/docker

ENV HOME /home/docker
WORKDIR ${HOME}/data
USER docker

CMD ["/bin/bash"]

Then build the image

docker build -f Dockerfile.defaults -t defaults-example:latest --compress .

Now running

docker run --rm -ti defaults-example:latest

again drops you into a Bash shell as specified by CMD. As has already been seen, CMD can be overridden by giving a command after the image

docker run --rm -ti defaults-example:latest python

The ENTRYPOINT builder instruction allows you to define a command, or commands, that are always run at the “entry” to the Docker container. If an ENTRYPOINT has been defined then CMD provides optional inputs to the ENTRYPOINT.

# entrypoint.sh
#!/usr/bin/env bash

set -e

function main() {
    if [[ $# -eq 0 ]]; then
        printf "\nHello, World!\n"
    else
        printf "\nHello, %s!\n" "${1}"
    fi
}

main "$@"

# Dockerfile.defaults
# Make the base image configurable
ARG BASE_IMAGE=python:3.9-bullseye
FROM ${BASE_IMAGE}

USER root

RUN apt-get -qq -y update && \
    apt-get -qq -y upgrade && \
    apt-get -y autoclean && \
    apt-get -y autoremove && \
    rm -rf /var/lib/apt/lists/*

# Create user "docker"
RUN useradd -m docker && \
    cp /root/.bashrc /home/docker/ && \
    mkdir /home/docker/data && \
    chown -R --from=root docker /home/docker

ENV HOME /home/docker
WORKDIR ${HOME}/data
USER docker

COPY entrypoint.sh $HOME/entrypoint.sh
ENTRYPOINT ["/bin/bash", "/home/docker/entrypoint.sh"]
CMD ["Docker"]

Then build the image

docker build -f Dockerfile.defaults -t defaults-example:latest --compress .

So now

docker run --rm -ti defaults-example:latest

Hello, Docker!
docker@2a99ffabb512:~/data$

Applied ENTRYPOINT and CMD

What will be the output of

docker run --rm -ti defaults-example:latest $USER

and why?

Solution


Hello, <your user name>!
docker@2a99ffabb512:~/data$

$USER is evaluated by your local shell first, and the result overrides the default CMD to be passed to entrypoint.sh

Key Points

  • CMD provide defaults for an executing container

  • CMD can provide options for ENTRYPOINT

  • ENTRYPOINT allows you to configure commands that will always run for an executing container


Build Docker Images with GitLab CI

Overview

Teaching: 20 min
Exercises: 10 min
Questions
  • How can CI be used to build Docker images?

Objectives
  • Build Docker images using GitLab CI

  • Deploy Docker images to GitLab Registry

  • Pull Docker images from GitLab Registry

We know how to build Docker images, but it would be better to be able to build many images quickly and not on our own computer. CI can help here.

First, create a new repository in GitLab. You can call it whatever you like, but as an example we’ll use “build-with-ci-example”.

Clone the repo

Clone the repo down to your local machine and navigate into it.

Solution

Get the repo URL from the project GitLab webpage

git clone <repo URL>
cd build-with-ci-example

Add a feature branch

Add a new feature branch to the repo for adding CI

Solution

git checkout -b feature/add-CI

Add a Dockerfile

Add an example Dockerfile that uses ENTRYPOINT

Solution

Write and commit a Dockerfile like

# Make the base image configurable
ARG BASE_IMAGE=python:3.7
FROM ${BASE_IMAGE}

USER root

RUN apt-get -qq -y update && \
    apt-get -qq -y upgrade && \
    apt-get -y autoclean && \
    apt-get -y autoremove && \
    rm -rf /var/lib/apt/lists/*

# Create user "docker"
RUN useradd -m docker && \
    cp /root/.bashrc /home/docker/ && \
    mkdir /home/docker/data && \
    chown -R --from=root docker /home/docker

ENV HOME /home/docker
WORKDIR ${HOME}/data
USER docker

COPY entrypoint.sh $HOME/entrypoint.sh
ENTRYPOINT ["/bin/bash", "/home/docker/entrypoint.sh"]
CMD ["Docker"]

Add an entry point script

Write and commit a Bash script to be run as ENTRYPOINT

Solution

Make a file named entrypoint.sh that contains

#!/usr/bin/env bash

set -e

function main() {
    if [[ $# -eq 0 ]]; then
        printf "\nHello, World!\n"
    else
        printf "\nHello, %s!\n" "${1}"
    fi
}

main "$@"

# Drop into an interactive shell once the greeting finishes
/bin/bash

Required GitLab YAML

To build images using GitLab CI jobs, the kaniko tool from Google is used. Kaniko jobs on CERN’s Enterprise Edition of GitLab expect some “boilerplate” YAML to run properly:

stages:
  - build

.build_template:
  stage: build
  image:
    # Use CERN version of the Kaniko image
    name: gitlab-registry.cern.ch/ci-tools/docker-image-builder
    entrypoint: [""]
  variables:
    DOCKERFILE: <Dockerfile path>
    BUILD_ARG_1: <argument to the Dockerfile>
    TAG: latest
    # Use single quotes to escape colon
    DESTINATION: '${CI_REGISTRY_IMAGE}:${TAG}'
  before_script:
    # Prepare Kaniko configuration file
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > /kaniko/.docker/config.json
  script:
    - printenv
    # Build and push the image from the given Dockerfile
    # See https://docs.gitlab.com/ee/ci/variables/predefined_variables.html#variables-reference for available variables
    - /kaniko/executor --context $CI_PROJECT_DIR
      --dockerfile ${DOCKERFILE}
      --build-arg ${BUILD_ARG_1}
      --destination ${DESTINATION}
  only:
    # Only build if the generating files change
    changes:
      - ${DOCKERFILE}
      - entrypoint.sh
  except:
    - tags
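The before_script step builds a JSON credentials file that Kaniko reads in order to push to the registry. As a sanity check, the same echo can be run locally with dummy values (in a real job the CI_REGISTRY* variables are supplied automatically by GitLab) and validated with Python’s json.tool:

```shell
# Dummy stand-ins for the variables GitLab CI provides automatically
CI_REGISTRY="gitlab-registry.cern.ch"
CI_REGISTRY_USER="dummy-user"
CI_REGISTRY_PASSWORD="dummy-pass"

# Same command as the CI before_script, writing locally instead of /kaniko/.docker/
echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > config.json

# Confirm the file is well-formed JSON
python3 -m json.tool config.json
```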

Python 3.7 default build

Revise this to build from the Python 3.7 image for the Dockerfile just written

Solution

Make a file named .gitlab-ci.yml that contains

stages:
  - build

.build_template:
  stage: build
  image:
    # Use CERN version of the Kaniko image
    name: gitlab-registry.cern.ch/ci-tools/docker-image-builder
    entrypoint: [""]
  variables:
    DOCKERFILE: Dockerfile
    BUILD_ARG_1: BASE_IMAGE=python:3.7
    TAG: latest
    # Use single quotes to escape colon
    DESTINATION: '${CI_REGISTRY_IMAGE}:${TAG}'
  before_script:
    # Prepare Kaniko configuration file
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > /kaniko/.docker/config.json
  script:
    - printenv
    # Build and push the image from the given Dockerfile
    # See https://docs.gitlab.com/ee/ci/variables/predefined_variables.html#variables-reference for available variables
    - /kaniko/executor --context $CI_PROJECT_DIR
      --dockerfile ${DOCKERFILE}
      --build-arg ${BUILD_ARG_1}
      --destination ${DESTINATION}
  only:
    # Only build if the generating files change
    changes:
      - ${DOCKERFILE}
      - entrypoint.sh
  except:
    - tags

Let’s now add two types of jobs: validation jobs that run on MRs and deployment jobs that run on master

stages:
  - build

.build_template:
  stage: build
  image:
    # Use CERN version of the Kaniko image
    name: gitlab-registry.cern.ch/ci-tools/docker-image-builder
    entrypoint: [""]
  variables:
    DOCKERFILE: Dockerfile
    BUILD_ARG_1: BASE_IMAGE=python:3.7
    TAG: latest
    # Use single quotes to escape colon
    DESTINATION: '${CI_REGISTRY_IMAGE}:${TAG}'
  before_script:
    # Prepare Kaniko configuration file
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"username\":\"$CI_REGISTRY_USER\",\"password\":\"$CI_REGISTRY_PASSWORD\"}}}" > /kaniko/.docker/config.json
  script:
    - printenv
    # Build and push the image from the given Dockerfile
    # See https://docs.gitlab.com/ee/ci/variables/predefined_variables.html#variables-reference for available variables
    - /kaniko/executor --context $CI_PROJECT_DIR
      --dockerfile ${DOCKERFILE}
      --build-arg ${BUILD_ARG_1}
      --destination ${DESTINATION}
  only:
    # Only build if the generating files change
    changes:
      - ${DOCKERFILE}
      - entrypoint.sh
  except:
    - tags

.validate_template:
  extends: .build_template
  except:
    refs:
      - master

.deploy_template:
  extends: .build_template
  only:
    refs:
      - master

# Validation jobs
validate python 3.7:
  extends: .validate_template
  variables:
    TAG: validate-latest

# Deploy jobs
deploy python 3.7:
  extends: .deploy_template
  variables:
    TAG: latest

Python 3.8 jobs

What needs to be added to build python 3.8 images for both validation jobs and deployment jobs?

Solution

Just more jobs that have a different BASE_IMAGE variable


# ...
# Same as the above
# ...

# Validation jobs
validate python 3.7:
  extends: .validate_template
  variables:
    TAG: validate-latest

validate python 3.8:
  extends: .validate_template
  variables:
    BUILD_ARG_1: BASE_IMAGE=python:3.8
    TAG: validate-py-3.8

# Deploy jobs
deploy python 3.7:
  extends: .deploy_template
  variables:
    TAG: latest

deploy python 3.8:
  extends: .deploy_template
  variables:
    BUILD_ARG_1: BASE_IMAGE=python:3.8
    TAG: py-3.8

Run pipeline and build

Now add and commit the CI YAML, push it, and make an MR

Solution

git add .gitlab-ci.yml
git commit
git push -u origin feature/add-CI
# visit https://gitlab.cern.ch/<your user name here>/build-with-ci-example/merge_requests/new?merge_request%5Bsource_branch%5D=feature%2Fadd-CI

In the GitLab UI check the pipeline and the validate jobs and see that different Python 3 versions are being run. Once they finish, merge the MR and then watch the deploy jobs. When the jobs finish navigate to your GitLab Registry tab in your GitLab project UI and click on the link named <user name>/<build-with-ci-example> under Container Registry. Notice there are 4 container tags. Click on the icon next to the py-3.8 tag to copy its full registry name into your clipboard. This can be used to pull the image from your GitLab registry.

Pull your image from GitLab Registry

Pull your CI-built Python 3.8 image using its full registry name and run it!

Solution

docker pull gitlab-registry.cern.ch/<user name>/build-with-ci-example:py-3.8
docker run --rm -ti gitlab-registry.cern.ch/<user name>/build-with-ci-example:py-3.8
python3 --version
Python 3.8.9

Summary

You can now build Docker images automatically with GitLab CI and deploy them to your project’s registry. Awesome!

Key Points

  • CI can be used to build images automatically


Challenge Examples

Overview

Teaching: 0 min
Exercises: 20 min
Questions
  • How to do a few more things?

Objectives
  • How to run with SSH and not use cp

Get SSH credentials in a container without cp

Get SSH credentials inside a container without using cp

Solution

Mount multiple volumes

docker run --rm -ti \
  -w /home/atlas/Bootcamp \
  -v $PWD:/home/atlas/Bootcamp \
  -v $HOME/.ssh:/home/atlas/.ssh \
  -v $HOME/.gitconfig:/home/atlas/.gitconfig \
  atlas/analysisbase:21.2.85-centos7

Key Points

  • Containers are extensible: volume mounts can provide credentials and configuration at runtime