Every Linux container is based on an image, which serves as the blueprint for what becomes a running container. Docker or Open Container Initiative (OCI) images form the foundation for everything you deploy and run with Docker. To launch a container, you either download a public image or create your own. An image essentially represents the filesystem for the container, though it is composed of linked filesystem layers, one for each build step.
Because images are built from individual layers, they place unique demands on the Linux kernel, which must provide the drivers that Docker's storage backend requires. Docker relies heavily on this backend for image management, communicating with the underlying Linux filesystem to create and manage the layers that combine into a single usable image. The primary supported storage backends include Overlay2, B-Tree File System (Btrfs), and Device Mapper, each providing a fast copy-on-write system.
To craft a custom Docker image using default tools, familiarity with Dockerfile is essential. This file outlines the steps needed to create an image and is typically found in the root directory of your application's source code repository.
A typical Dockerfile might resemble the one below, creating a Node.js-based application container:
FROM node:18.13.0
ARG email="[email protected]"
LABEL "maintainer"=$email
LABEL "rating"="Five Stars" "class"="First Class"
USER root
ENV AP /data/app
ENV SCPATH /etc/supervisor/conf.d
RUN apt-get -y update
# Install daemons
RUN apt-get -y install supervisor
RUN mkdir -p /var/log/supervisor
# Configure Supervisor
COPY ./supervisord/conf.d/* $SCPATH/
# Application Code
COPY *.js* $AP/
WORKDIR $AP
RUN npm install
CMD ["supervisord", "-n"]
This Dockerfile illustrates several instructions for controlling image assembly. Each line in a Dockerfile creates a new image layer, containing the changes resulting from that command. Consequently, when building new images, Docker only needs to build layers that deviate from previous builds.
While you could build a Node instance from a plain Linux image, Docker Hub offers official Node images for convenience. The ARG parameter sets variables and their default values, available only during the image build process.
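A minimal sketch of how ARG and LABEL interact; the email address below is a placeholder for illustration:

```dockerfile
FROM node:18.13.0
# Build-time variable with a default; it is not available in running containers.
ARG email="someone@example.com"
# Bake the argument's current value into the image metadata.
LABEL "maintainer"=$email
```

Passing --build-arg email=other@example.com to docker image build would replace the default for that single build.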
Applying labels to images and containers allows metadata addition via key/value pairs for later identification. For instance, the "maintainer" label leverages the value of the email build argument defined earlier in the Dockerfile.
The USER instruction sets the user that subsequent build steps and the container's processes run as. This example uses root for simplicity, but running as an unprivileged user is a security best practice.
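The example above stays with root for simplicity; a hedged sketch of the usual hardening pattern follows (the user name appuser is illustrative, not part of the example application):

```dockerfile
# Create an unprivileged system account and switch to it; later build
# steps and the container's default process run as this user.
RUN useradd --system --create-home appuser
USER appuser
```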
Using the ENV instruction sets shell variables for configuration during both build and runtime, aiding in Dockerfile simplicity and avoiding repetition.
The subsequent code section utilizes RUN instructions to initiate file structure creation, install necessary dependencies, and configure daemons.
The COPY instruction copies files from the local filesystem into the image, typically the application code and support files. Here it leverages the build variables defined earlier to avoid repeating paths.
The WORKDIR instruction changes the working directory in the image, both for the remaining build instructions and for the default process launched from the resulting containers.
The order of commands in a Dockerfile has a significant impact on ongoing build times: place steps that change frequently toward the end, so the cached layers produced by earlier, more stable steps can be reused.
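Applied to a Node.js image like the one above, this rule usually means installing dependencies before copying the application code. A sketch, not the chapter's actual Dockerfile ($AP is the app directory variable defined earlier):

```dockerfile
# The dependency manifest changes rarely, so the npm install layer
# stays cached across most rebuilds...
COPY package.json $AP/
WORKDIR $AP
RUN npm install
# ...while the frequently edited application code only invalidates
# the layers from this point on.
COPY *.js $AP/
```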
Finally, the CMD instruction defines the command launching the desired process within the container, typically encouraging a single process per container for architectural simplicity.
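CMD accepts two forms; the exec (JSON array) form used in the example above starts the process directly, while the shell form wraps it in /bin/sh -c, which changes which process receives signals. A sketch:

```dockerfile
# Exec form: supervisord runs as PID 1 and receives signals directly.
CMD ["supervisord", "-n"]

# Shell form (commented out here): /bin/sh becomes PID 1 instead.
# CMD supervisord -n
```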
To initiate image building, clone a Git repo containing an example application, such as docker-node-hello. Ensure Docker server and client communication is operational before building. Use the following command to clone the repo:
$ git clone https://github.com/spkane/docker-node-hello.git \
--config core.autocrlf=input
This downloads a working Dockerfile and related source code into the docker-node-hello directory. The .dockerignore file, alongside the Dockerfile, defines files and directories excluded from the image build, enhancing efficiency by ignoring the .git directory.
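A typical .dockerignore for a Node.js project might look something like the sketch below; the exact entries depend on the repository:

```
# Exclude version-control data, local dependency installs, and logs
.git
node_modules
*.log
```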
Inspecting the repo, you'll find relevant files like Dockerfile, .dockerignore, index.js, package.json, and a supervisord directory containing configuration files.
With the Dockerfile and related source code available, build the image using Docker's build command. Each step in the build process maps to a line in the Dockerfile, creating new image layers. Subsequent builds should be quicker after the initial image download.
$ docker image build -t example/docker-node-hello:latest .
For faster builds, Docker employs a local cache, though it might lead to unexpected issues. Use --no-cache to disable caching for a build.
If you are building on a system that is busy running other processes, you can limit the resources available to the build using the same cgroup methods discussed later. Refer to the official Docker documentation for the full list of build arguments.
After successfully building the image, run it on your Docker host with the following command, creating a running container mapping port 8080:
$ docker container run --rm -d -p 8080:8080 example/docker-node-hello:latest
This command spawns a container in the background from the example/docker-node-hello:latest image, with port 8080 on the Docker host mapped to port 8080 in the container. Verify the running application by accessing port 8080 on the Docker host via a web browser.
To configure the application via environment variables, stop the existing container and start a new one with the desired variables:
$ docker container stop [container_id]
$ docker container run --rm -d -p 8080:8080 \
-e WHO="Sean and Karl" \
example/docker-node-hello:latest
Replace [container_id] with the ID of the existing container. Now, the application should greet "Sean and Karl" instead of the default "World."
Docker Hub
Docker Hub is a public registry provided by Docker, Inc. It hosts a vast collection of Docker images, including official images for popular software packages like Linux distributions and applications such as WordPress.
To push an image to Docker Hub, you need to first log in using your Docker ID:
docker login
Then, tag your image with your Docker Hub username and push it:
docker tag your-image your-dockerhub-username/your-image
docker push your-dockerhub-username/your-image
Quay.io
Quay.io is another public registry, now owned by Red Hat. It offers similar features to Docker Hub, including the ability to host public and private images.
The process to push an image to Quay.io is similar to Docker Hub:
docker login quay.io
docker tag your-image quay.io/your-quay-username/your-image
docker push quay.io/your-quay-username/your-image
Harbor
Harbor is a private registry solution that provides features like image verification and GUI interfaces. It's often used by companies that require more control over their image hosting.
To use Harbor, you typically need to set it up on your own infrastructure. Once set up, pushing an image to Harbor is similar to pushing to Docker Hub or Quay.io:
docker login harbor.example.com
docker tag your-image harbor.example.com/your-project/your-image
docker push harbor.example.com/your-project/your-image
Red Hat Quay
Red Hat Quay is another private registry solution with advanced features like image scanning and vulnerability detection.
Similar to Harbor, pushing an image to Red Hat Quay involves logging in and tagging the image correctly:
docker login quay.example.com
docker tag your-image quay.example.com/your-organization/your-image
docker push quay.example.com/your-organization/your-image
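Across Docker Hub, Quay.io, Harbor, and Red Hat Quay, the image reference follows the same registry/namespace/name:tag pattern; only the hostname and namespace differ. A small sketch, where every value is a placeholder mirroring the examples above:

```shell
registry="harbor.example.com"   # registry hostname (omitted for Docker Hub)
namespace="your-project"        # project, organization, or username
image="your-image"              # repository name
tag="latest"                    # version tag

# Assemble the fully qualified reference used by docker tag/push.
ref="${registry}/${namespace}/${image}:${tag}"
echo "${ref}"   # → harbor.example.com/your-project/your-image:latest
```

When no registry hostname is given, Docker assumes Docker Hub (docker.io); when no tag is given, it assumes latest.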
Authenticating to a Registry
Docker requires authentication to access private registries. You can authenticate using your username and password:
docker login registry.example.com
Setting up a private registry involves several steps:
Prepare SSL Certificates: Create SSL certificates for secure communication with the registry.
Set up Authentication: Configure authentication to restrict access to authorized users.
Deploy Registry: Run the Docker registry container with appropriate configurations.
Here's an example using Docker's official registry image:
# Create SSL certificates
mkdir -p certs
openssl req -newkey rsa:4096 -nodes -sha256 \
  -keyout certs/domain.key -x509 -days 365 -out certs/domain.crt
# Run the registry container
docker run -d -p 5000:5000 --restart=always --name registry \
-v "$(pwd)"/registry:/var/lib/registry \
-v "$(pwd)"/certs:/certs \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
-e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
registry:2
Pushing Images: Tag and push your images to the private registry:
docker tag your-image localhost:5000/your-image
docker push localhost:5000/your-image
Pulling Images: Pull images from the private registry:
docker pull localhost:5000/your-image
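One caveat: if you later expose this registry on a real hostname with a self-signed certificate, Docker clients will reject the connection unless the certificate is trusted. For local testing only, a client can be told to skip TLS verification for a specific registry via /etc/docker/daemon.json (restart the Docker daemon after editing; the hostname below is a placeholder):

```json
{
  "insecure-registries": ["registry.example.com:5000"]
}
```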
These steps should get you started with running your own private Docker registry. Remember to replace registry.example.com or quay.example.com with your actual registry domain.
Working efficiently with Docker involves keeping image sizes small and build times fast. This not only enhances deployment speed but also minimizes resource consumption. Let's explore some key considerations and techniques for achieving these objectives.
In contemporary environments, the size of software downloads may seem trivial. However, in large-scale deployments where software is frequently updated across numerous nodes, the impact of large image sizes becomes evident. Network congestion and slower deployment cycles can hamper production environments.
Many Linux containers inherit from a base image containing a minimal Linux distribution, but this is not mandatory. Containers only need files essential for running the application on the host kernel. Let's illustrate this with a minimal container example.
Example: Minimal Go Web Application Container
Go, being a compiled language, generates statically compiled binary files. Consider a small web application written in Go available on GitHub.
To try out the application, run the following command:
$ docker container run --rm -d -p 8080:8080 spkane/scratch-helloworld
If successful, access the application at http://127.0.0.1:8080 in your web browser.
Now, let's examine the files in this container:
$ docker container ls -l
CONTAINER ID IMAGE COMMAND CREATED ...
ddc3f61f311b spkane/scratch-helloworld "/helloworld" 4 minutes ago ...
$ docker container export ddc3f61f311b -o web-app.tar
$ tar -tvf web-app.tar
You'll notice that besides the application binary, most files in the container are either zero-length or critical system files. This emphasizes that containers should only contain what's necessary to run on the underlying kernel.
To achieve smaller images, consider using multistage builds. This approach enables building production containers with minimal resources while ensuring repeatability in the build system.
Multistage Builds Example:
# Build container
FROM docker.io/golang:alpine as builder
RUN apk update && \
apk add git && \
CGO_ENABLED=0 go install -a -ldflags '-s' \
github.com/spkane/scratch-helloworld@latest
# Production container
FROM scratch
COPY --from=builder /go/bin/scratch-helloworld /helloworld
EXPOSE 8080
CMD ["/helloworld"]
This Dockerfile demonstrates a multistage build for creating a minimal production container. The builder stage compiles the Go application, while the scratch stage produces the final lightweight image containing only the necessary binary.
In practice, multistage builds significantly reduce image size and resource overhead, making them ideal for production environments.
Understanding Docker image layers is crucial. Each layer is strictly additive, meaning once created, its contents cannot be removed. Although files can be shadowed in subsequent layers, earlier layers cannot be made smaller by deleting files in later layers.
While you can squash layers using experimental Docker features, it's essential to consider the trade-offs: squashing reclaims space wasted by files that were deleted or overwritten in later layers, but squashed layers are no longer shared between images, which can increase the total amount of data clients must download.
The additive nature of image layers is evident when examining the filesystem layers and build steps. Modifications made in subsequent layers do not reduce the size of earlier layers; they only mask or overwrite existing files.
FROM docker.io/fedora
# Install Apache web server
RUN dnf install -y httpd && \
dnf clean all
# Create directories for web content
RUN mkdir -p /var/www && \
mkdir -p /var/www/html
# Add custom index.html file
ADD index.html /var/www/html
# Start Apache server
CMD ["/usr/sbin/httpd", "-DFOREGROUND"]
In this Dockerfile, commands are ordered to maximize cache utilization. Stable and time-consuming steps, like installing Apache and creating directories, come before adding the custom index.html file. This ensures that a change to index.html invalidates as few cached layers as possible.
# syntax=docker/dockerfile:1
FROM python:3.9.15-slim-bullseye
# Create app directory
RUN mkdir /app
WORKDIR /app
# Copy application code
COPY . /app
# Install dependencies using pip with directory caching
RUN --mount=type=cache,target=/root/.cache pip install -r requirements.txt
# Set working directory for application
WORKDIR /app/mastermind
# Define command to run the application
CMD ["python", "mastermind.py"]
Here, we utilize BuildKit's directory caching feature by specifying --mount=type=cache in the RUN command that installs the Python dependencies. This mounts a cache directory into the build container, speeding up subsequent builds by reusing previously downloaded packages.
These Dockerfile examples demonstrate effective techniques for optimizing Docker builds, ensuring faster build times and smaller image sizes by intelligently leveraging layer and directory caching mechanisms.
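The same directory-caching idea applies to other package managers. As a hedged sketch (the base image and package are chosen purely for illustration), apt's download directory can be cached across builds:

```dockerfile
# syntax=docker/dockerfile:1
FROM ubuntu:22.04
# Cache apt's package downloads between builds via a BuildKit cache mount.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends curl
```

Note that some base images ship an apt configuration that cleans this directory automatically after each install, which can limit the benefit of the cache mount.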
When encountering issues with Docker image builds, it's essential to diagnose and resolve them efficiently. Let's delve into the steps for troubleshooting broken builds, along with code examples.
Suppose you encounter a failed build in a pre-BuildKit environment. In such cases, you can utilize intermediate containers to isolate and address the problem. Let's illustrate this with an example using the docker-node-hello repository.
First, create a failing build by modifying the Dockerfile:
# Dockerfile
...
RUN apt-get -y update-all
Change it to:
# Dockerfile
...
RUN apt-get -y update
Now, attempt to build the image:
$ DOCKER_BUILDKIT=0 docker image build -t example/docker-node-hello:latest --no-cache .
You'll encounter an error due to the invalid operation update-all. To troubleshoot, start an interactive container from the image produced by the last successful step (the build output lists the ID of each intermediate image; substitute the ID from your own output for the one shown here):
$ docker container run --rm -ti 2a236efc3f06 /bin/bash
Inside the container, investigate the issue:
root@b83048106b0f:/# apt-get -y update-all
E: Invalid operation update-all
root@b83048106b0f:/# apt-get --help
apt 1.4.9 (amd64)
...
Once you identify the root cause, modify the Dockerfile accordingly and rebuild the image:
$ DOCKER_BUILDKIT=0 docker image build -t example/docker-node-hello:latest .
In BuildKit environments, debugging involves a slightly different approach. Let's simulate a failed build by modifying the Dockerfile:
# Dockerfile
...
RUN npm installer
Change it to:
# Dockerfile
...
RUN npm install
Now, attempt to build the image:
$ docker image build -t example/docker-node-hello:debug --no-cache .
You'll encounter an error indicating an invalid command. To troubleshoot, utilize multistage builds and the --target argument:
# Dockerfile
FROM node:18.13.0 as deploy
...
FROM deploy
RUN npm installer
Modify the Dockerfile so that the steps before the problematic one belong to a named stage (here, deploy), and insert a new FROM line immediately before the failing command. Then build only that first stage:
$ docker image build -t example/docker-node-hello:debug --target deploy .
Create a container from the debug image and perform necessary tests:
$ docker container run --rm -ti docker.io/example/docker-node-hello:debug /bin/bash
Once the issue is identified, revert the changes and rebuild the image.
Supporting multiple architectures is crucial in modern computing environments. Docker's buildx plugin simplifies this process. Let's demonstrate building an image for multiple architectures:
$ docker buildx build --platform linux/amd64,linux/arm64 --tag wordchain:test .
This command builds the image for both AMD64 and ARM64 platforms. Depending on your setup, you may first need to create and select a buildx builder (docker buildx create --use), and pass --load or --push so the results are stored somewhere usable. Verify the built images by running:
$ docker image ls
Troubleshooting broken builds is a critical aspect of Docker image development. By understanding the tools and techniques outlined above, developers can efficiently diagnose and resolve issues, ensuring smooth and reliable image builds across different environments.