Docker hero? Check yourself!

#ci/cd #docker #dockerfile #layers #python #secrets #security
Shem Tov Fisher
August 31, 2023

A story of one Dockerfile

The story below follows the step-by-step evolution of one Dockerfile. You will face a few real-life situations in which the Docker build process was not trivial. We will touch on performance, security, and workflow optimization.

When you read these use cases, ask yourself about each one: What would I do? What is the best practice here? Do I see any issues? 

For each situation, I provide an answer, which, in my experience, uncovers the best practice to solve the challenge within the Dockerfile. You’re invited to compare our solutions!

Challenge 1

A junior DevOps engineer, let’s call him James, joined a team to support Python developers. His first task was to dockerize a Python application. He found out that the developers were already using this Dockerfile:

FROM python:3.7
COPY ./requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8081 8082 8083
CMD ["./app.py"]

James started checking whether he could optimize it. He saw that the COPY instruction was used twice. Reasoning that each instruction in a Dockerfile adds a layer, James decided he could reduce the number of layers. Thus, he rewrote the Dockerfile this way:

FROM python:3.7
COPY . .
RUN pip install -r requirements.txt
EXPOSE 8081 8082 8083
CMD ["./app.py"]

There is one layer fewer now. Also, having fewer instructions makes the file easier to read.

He is a Docker Hero, isn’t he?
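
One way to check James's reasoning is to look at what the build cache keys on. For a COPY step, the cache is invalidated when the contents of the copied files change. The sketch below only mimics that rule with a checksum in plain shell (it does not run Docker; the helper names `req_key` and `ctx_key` and the sample files are made up for illustration). It compares the original two-step layout with James's single `COPY . .` after a code-only change.

```shell
#!/bin/sh
# Simulation only: a checksum stands in for the cache key of a COPY step.
ctx=$(mktemp -d)
printf 'requests\n' > "$ctx/requirements.txt"
printf 'print("v1")\n' > "$ctx/app.py"

# Hypothetical helpers: what each COPY variant places before "pip install".
req_key() { cksum < "$ctx/requirements.txt"; }                    # COPY ./requirements.txt .
ctx_key() { cat "$ctx/requirements.txt" "$ctx/app.py" | cksum; }  # COPY . .

req_before=$(req_key); ctx_before=$(ctx_key)
printf 'print("v2")\n' > "$ctx/app.py"    # a developer edits code only
req_after=$(req_key); ctx_after=$(ctx_key)

if [ "$req_before" = "$req_after" ]; then
  echo "original Dockerfile: the pip install layer can be reused"
fi
if [ "$ctx_before" != "$ctx_after" ]; then
  echo "James's Dockerfile: pip install runs again"
fi
rm -rf "$ctx"
```

Both messages are printed: the requirements file is unchanged, while the whole build context is not.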

Challenge 2

When reviewing the repo, James noticed that the requirements file contained a private token:

> cat requirements.txt
git+https://__token__:glpat-4HeUio8dHeaWnNg6@gitlab.com/acme-apps/config-repo.git
…

James was unpleasantly surprised by this security breach. 

At first, he only wanted to fix this vulnerability in the public Docker image, so he updated the Dockerfile this way:

RUN pip install -r requirements.txt
RUN rm requirements.txt

However, he still wasn’t happy about having a plaintext secret stored in Git. So he did the following:

He replaced the token with a placeholder this way:

git+https://__token__:GIT_TOKEN@gitlab.com/acme-apps/config-repo.git

Then he stored the token in a vault, pulled it in the CI/CD build stage, and replaced the placeholder with a sed command before building the image:

TOKEN=$(<pull token from a vault>)
sed -i "s/GIT_TOKEN/${TOKEN}/" requirements.txt
docker build .

He is a Docker Hero, isn’t he? The Docker image is secure now, isn’t it?
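
A hint for the first fix: an image is a stack of layers, and a later `RUN rm` only records a deletion marker on top of the layer that already holds the file. The sketch below imitates this with two tar archives in plain shell, without Docker. The directory names and the `EXAMPLE-TOKEN` value are invented for the illustration; the `.wh.` prefix is the whiteout convention of the OCI image layer format.

```shell
#!/bin/sh
# Simulation only: two tar archives stand in for two image layers.
work=$(mktemp -d)
mkdir "$work/layer1" "$work/layer2"

# Layer 1: the COPY step brings in the file with the token (made-up value).
printf 'git+https://__token__:EXAMPLE-TOKEN@gitlab.com/acme-apps/config-repo.git\n' \
  > "$work/layer1/requirements.txt"
# Layer 2: "RUN rm requirements.txt" is stored as a whiteout marker.
: > "$work/layer2/.wh.requirements.txt"

tar -cf "$work/layer1.tar" -C "$work/layer1" requirements.txt
tar -cf "$work/layer2.tar" -C "$work/layer2" .wh.requirements.txt

# Anyone who can pull the image can still unpack the earlier layer.
leaks=$(tar -xOf "$work/layer1.tar" requirements.txt | grep -c 'EXAMPLE-TOKEN')
echo "lines with the token still readable in layer 1: $leaks"
rm -rf "$work"
```

With a real image, a comparable check would be to unpack the output of `docker save` and search the layer archives. The same reasoning applies to the sed approach, since the substituted requirements.txt is still copied into a layer.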

Challenge 3

As an alternative solution, James used build arguments.

He added this instruction at the beginning of the Dockerfile:

ARG GITLAB_TOKEN

and changed the requirements file accordingly:

git+https://__token__:${GITLAB_TOKEN}@gitlab.com/acme-apps/config-repo.git

And the CI command is now:

docker build --build-arg GITLAB_TOKEN=$(<pull the token from a vault>) .

No secret is stored in Git or in the image. James is calm and happy now.

Finally, he is the Docker Hero, isn’t he? Is the Docker image finally secure?
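
For comparison, here is a sketch of a different technique, BuildKit secret mounts, which are intended to expose a secret to a single RUN step instead of passing it as a build argument. It assumes a Docker version with BuildKit enabled, a pip version that expands `${GITLAB_TOKEN}` in requirements files, and a `GITLAB_TOKEN` environment variable already populated by the CI job. The secret id `gitlab_token` and the temporary directory are made-up names. The script only writes the files and prints the build command rather than running it.

```shell
#!/bin/sh
# Sketch: generate a Dockerfile that reads the token from a BuildKit secret mount.
ctx=$(mktemp -d)

# requirements.txt keeps the placeholder; no real token is written to disk.
cat > "$ctx/requirements.txt" <<'EOF'
git+https://__token__:${GITLAB_TOKEN}@gitlab.com/acme-apps/config-repo.git
EOF

cat > "$ctx/Dockerfile" <<'EOF'
# syntax=docker/dockerfile:1
FROM python:3.7
COPY ./requirements.txt .
# The secret file is available under /run/secrets for this step only.
RUN --mount=type=secret,id=gitlab_token \
    GITLAB_TOKEN="$(cat /run/secrets/gitlab_token)" pip install -r requirements.txt
COPY . .
EXPOSE 8081 8082 8083
CMD ["./app.py"]
EOF

# Command the CI job would run (printed here, not executed).
echo "cd $ctx && docker build --secret id=gitlab_token,env=GITLAB_TOKEN ."
secret_steps=$(grep -c -- '--mount=type=secret' "$ctx/Dockerfile")
echo "RUN steps using a secret mount: $secret_steps"
```

The generated files are left in the temporary directory so they can be inspected. Note that the ARG instruction is gone entirely in this variant.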

Challenge 4

After all these security adventures, James wants to keep the image as secure as possible. He looked at the COPY . . instruction and started to fear that unnecessary files might be copied into the image, sensitive ones among them.

He asked the developers to copy only the required files. They changed the Dockerfile, and now it looks like this:

COPY app.py convertor.py input.csv shared/ images/ sql/ .

It’s much more secure now, isn’t it?
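
Before answering, it helps to recall how COPY treats a directory source: the contents of the directory are copied, not the directory itself. The sketch below imitates that rule with `cp` in plain shell, without Docker. The file names inside `shared/`, `images/`, and `sql/` are invented for the illustration.

```shell
#!/bin/sh
# Simulation only: "cp -R dir/. dest/" stands in for how COPY handles "dir/".
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir "$src/shared" "$src/images" "$src/sql"
printf 'shared helpers\n' > "$src/shared/utils.py"
printf 'shared readme\n'  > "$src/shared/README"
printf 'logo\n'           > "$src/images/logo.png"
printf 'select 1;\n'      > "$src/sql/init.sql"
printf 'sql readme\n'     > "$src/sql/README"    # same file name as in shared/

for dir in shared images sql; do
  cp -R "$src/$dir/." "$dest/"    # contents only; the directory name is lost
done

if [ -d "$dest/shared" ]; then shared_kept=yes; else shared_kept=no; fi
readme=$(cat "$dest/README")
echo "shared/ directory present in the destination: $shared_kept"
echo "README now contains: $readme"
rm -rf "$src" "$dest"
```

In this simulation the directory structure is flattened and the two README files collide, with the last copy winning. An application that imports from `shared/` or reads from `sql/` would no longer find those paths.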

Challenge 5

James is still keen on his idea of reducing the number of redundant layers. He remembered that the EXPOSE instruction in a Dockerfile does not actually publish the ports and serves mainly as documentation.

“It serves equally well as documentation when it is commented out,” he thought.

So he updated the Dockerfile this way:

# EXPOSE 8081 8082 8083

You probably already have a hunch that James is not the Docker Hero, but what could be wrong?
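
A useful fact for this one: according to Docker's documentation, only RUN, COPY, and ADD add filesystem layers, while instructions such as EXPOSE and CMD only record metadata in the image configuration. The sketch below counts both kinds in the developers' original Dockerfile with awk. The temporary file and the variable names are illustrative.

```shell
#!/bin/sh
# Sketch: classify the instructions of the original Dockerfile.
dockerfile=$(mktemp)
cat > "$dockerfile" <<'EOF'
FROM python:3.7
COPY ./requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8081 8082 8083
CMD ["./app.py"]
EOF

# Instructions that add a filesystem layer on top of the base image.
fs_layers=$(awk 'toupper($1) ~ /^(RUN|COPY|ADD)$/ {n++} END {print n+0}' "$dockerfile")
# Instructions that only change the image configuration (metadata).
metadata_only=$(awk 'toupper($1) ~ /^(EXPOSE|CMD)$/ {n++} END {print n+0}' "$dockerfile")

echo "filesystem layers added: $fs_layers"
echo "metadata-only instructions: $metadata_only"
rm -f "$dockerfile"
```

The EXPOSE metadata is also what `docker run -P` reads to decide which ports to publish, whereas a comment leaves no trace in the image.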

Afterword

What should every James know?

  • Don’t optimize what’s already working unless you clearly understand the benefit and the outcome. 
  • After applying changes, always test everything.

I hope you enjoyed reading and learned something new, but even if you knew everything, you’ve just confirmed that you are a true Docker Hero!