Docker Tips
Dolsy Smith edited this page Nov 3, 2023
·
8 revisions
Tips for working with Docker images and containers
Caveat: The following rules of thumb apply to running Docker in a Linux environment. On macOS and Windows, Docker behaves somewhat differently, because of differences in the underlying OS architecture on which Docker depends.
- Containers are not virtual environments. Rather, containers "contain" by isolating system resources.
  - This isolation happens through the use of dedicated namespaces, one set per container. This fact explains the mappings we use in the `docker-compose.yml` file. For example, the port mapping `8984:8080` simply tells Docker to forward requests made to port `8984` outside the container to port `8080` inside the container. Likewise, `/opt/scholarspace/scholarspace-derivatives:/opt/scholarspace/scholarspace-derivatives` instructs Docker to map one particular directory inside the container to a directory with the same path outside the container.
  - Because container namespaces are isolated by default, without these mappings, processes running outside the container cannot access resources or communicate with processes running inside the container.
- Stopping a container (`docker stop [CONTAINER-NAME]`) interrupts any processes running inside the container, but it preserves the files the container has written to its own filesystem.
- Removing a container (`docker rm [CONTAINER-NAME]`) removes the container's namespace. As a result, it also deletes the files the container has written to its own filesystem (by freeing up those resources for use by the rest of the system).
- A stopped container can be restarted (`docker start [CONTAINER-NAME]`), but a removed container cannot.
- A Docker image is similar to a disk image (as used in backing up/restoring your computer, for example).
  - As an inexact analogy, I picture the Docker image as a frozen, prepackaged dinner bought at the supermarket, and the Docker container made from the image as the result of heating up the frozen dinner in the microwave.
  - Inexact, because thanks to Docker's clever way of structuring resources (n.b., called "copy-on-write"), the same image can be used to create multiple containers without creating significant overhead on disk. (So maybe it's like a magical frozen dinner that can be eaten any number of times...)
- The Dockerfile is just a recipe for creating an image. Most images are built on top of other images, and every image is made up of layers. In fact, each separate command in a Dockerfile (`COPY`, `ADD`, `RUN`, etc.) creates a new layer. The resulting Docker image is a stack of such layers.
  - When rebuilding an image, only the layers whose contents have changed -- and those layers that come after them in the order of the Dockerfile -- will be recreated. This fact leads to the following principle: aspects of the image that are liable to change more frequently should, where possible, be included after those that change less frequently.
  - For reference, in the ScholarSpace Dockerfile, the most expensive layers (in terms of time to build) are those that install ImageMagick and its dependencies, and those that install the Ruby gems for our application.
- The last command in a Dockerfile is typically `CMD` or `ENTRYPOINT`. The actions performed here -- usually in a separate script for convenience -- technically do not affect the image; rather, they are run inside the container as it starts up. (These can be superseded by commands included in a `docker run` command or in a `docker-compose.yml` file.) The logic in such scripts can be used to customize a container's runtime environment, depending on certain conditions.
  - For instance, the same Dockerfile/image is used to create both the Hyrax `app-server` container and the `sidekiq` container, but whereas the former is started with commands that launch Nginx/Passenger, the latter is started with a command that launches Sidekiq. Otherwise, the containers are identical.
- Changes to the filesystem or system configuration made while a container is running do not outlive the container. This fact leads to the following principles:
  - Immutable and invariant files and settings -- those that do not change between releases of the application, and which do not depend on the local environment of installation -- should comprise the Docker image. This category includes application code and application dependencies, as well as some system settings.
  - Runtime settings, if implemented at container startup, can be stored using the container's ephemeral storage.
  - Data created by the application should generally be stored using one of two methods, described below.
- Docker offers the following methods of persistence (in addition to images):
  - Bind mounts: Paths (directories and files) inside the container are mapped to paths outside the container. In this arrangement, Docker will allow processes inside the container to read and write to locations in the host filesystem.
  - Docker volumes: Containers write to dedicated, persistent storage managed by the Docker daemon itself.
The following table summarizes key differences between these approaches.
| | Bind Mounts | Docker Volumes |
|---|---|---|
| Access | Managed by the host system. Container processes not running as root will need permission, granted to their users outside the container. | Managed by Docker. No direct access outside of a running container. |
| Users | Container users/groups must match those on the host system. Because Docker containers have their own namespaces, a `scholarspace` or `solr` user created inside the container will not be the same as a `scholarspace` or `solr` user created outside the container. Therefore, it is necessary to either a) assign privileges outside the container using the in-container user id (`uid`) and group id (`gid`), or b) create users inside the container matching a `uid` and `gid` known outside the container. | Permissions depend on how the resources were created, in isolation from the host. |
| Most useful for | Files that "live" outside the application, such as data that may be migrated; files that need to be modified by users outside of the application, such as application code in a development environment. | Application data coupled closely with the application, such as SQL database files. |
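In a `docker-compose.yml` file, the two approaches in the table differ mainly in what appears to the left of the colon in a `volumes` entry: a host path for a bind mount, a volume name for a Docker volume. The fragment below is a minimal sketch, not our actual configuration. The `db` service name and the `db-data` volume name are hypothetical, and attaching the derivatives directory to `app-server` is an assumption made for illustration. `/var/lib/postgresql/data` is the default data directory of the official `postgres` image.

```yaml
# Illustrative sketch only; service and volume names marked below are hypothetical.
services:
  app-server:
    image: scholarspace-app
    volumes:
      # Bind mount: HOST-PATH:CONTAINER-PATH.
      # The host directory must already exist with suitable ownership (uid/gid).
      - /opt/scholarspace/scholarspace-derivatives:/opt/scholarspace/scholarspace-derivatives

  db:  # hypothetical service name
    image: postgres:9.5.25-alpine
    volumes:
      # Docker volume: VOLUME-NAME:CONTAINER-PATH.
      # Docker manages where the data lives on the host.
      - db-data:/var/lib/postgresql/data

# Named volumes must be declared at the top level of the file.
volumes:
  db-data:  # hypothetical volume name
```

A quick way to tell the two apart when reading a Compose file: an entry whose left-hand side starts with `/` or `.` is a bind mount, and anything else is the name of a Docker volume.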
- The relationship between volumes and images is a little counterintuitive. Specifically, changes made to an image (e.g., when the image is rebuilt) will not automatically propagate to a Docker volume that already holds data. When changing an image in a way that affects files stored in a Docker volume, it's necessary to delete the Docker volume (`docker volume rm [VOLUME-NAME]`) as well as the image (`docker image rm [IMAGE-NAME]`) before rebuilding. (Any containers that use the volume or image must be removed first.)
  - As the foregoing demonstrates, it is alarmingly easy to delete a Docker volume. Care is warranted when dealing with volumes that persist important data (e.g., database files).
- As with storage, Docker offers two ways of managing network access (communication via ports, etc.):
  - In port mapping, ports within the container are mapped onto ports outside the container.
    - Example: In our application, the Fedora Jetty server runs on port `8080` inside its container. This port is mapped to `8984` outside the container, so that from the host, running `curl localhost:8984` connects you to the Fedora server.
    - Likewise, the Hyrax `app-server` container maps `443` to the same port outside the container, allowing HTTPS requests to be routed to the Nginx instance running inside that container.
  - Using a Docker network driver, we let Docker manage connections between containers.
    - Our `docker-compose.yml` file defines a couple of networks: `hyrax` and `fedora`.
    - Each container connects to one or both networks.
    - Each container that receives connections from others has its own hostname. For example, the Solr container is attached to the `hyrax` network with the hostname `solr-hyrax`. This allows the `app-server` and `sidekiq` containers to reach the Solr server at an address like `http://solr-hyrax:8983`. (Comparable to `http://localhost:8983` on a non-Dockerized setup.)
- A `docker-compose.yml` file is a set of instructions for launching one or more Docker containers.
- Containers may be built locally (useful for dev environments) or created from hosted images (best for production).
  - To use a hosted image, we use the `image` directive: `image: postgres:9.5.25-alpine`
  - To build locally, we use a `build` directive in conjunction with the `image` directive. Setting `image: scholarspace-app` together with a `build` directive whose `context` is `.` builds an image from the local context and names it `scholarspace-app`.
  - The build context is the directory relative to which the Dockerfile's `COPY` or `ADD` instructions are carried out. Usually, this context will be `.`, referring to the root directory of the repository.
- Other Docker Compose directives used in running our application include the following:
  - `volumes`: associates bind mounts and/or Docker volumes with a given container
  - `networks`: associates Docker networks with a given container
  - `ports`: assigns port mapping (used for connecting to the container from the outside, i.e., from the host machine)
  - `environment`: enumerates environment variables (from a `.env` file in the same directory as `docker-compose.yml`) that will be passed into the container. Alternatively, the `env_file` directive can be used to pass in the complete contents of a `.env` file.
  - `command`: starts a container with a particular command (if this is different from the `CMD` or `ENTRYPOINT` commands in the Dockerfile/image).
- At the bottom of the `docker-compose.yml` file, we also define Docker volumes (by simply naming them) and Docker networks. (Configuration for the latter is, for our app, fairly boilerplate.)
- At the command line, `docker compose up -d` starts all the containers as background processes. `docker compose down` gracefully shuts down all containers and deletes them (equivalent to `docker stop [CONTAINER-NAME]` and `docker rm [CONTAINER-NAME]`). Note that unlike the equivalent Docker CLI commands, `docker compose` must be run in the directory that houses the `docker-compose.yml` file (or be pointed to that file with the `-f` option).
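To make the directives above concrete, here is a stripped-down sketch of how they might fit together in one file. It is not a copy of our `docker-compose.yml`. The Sidekiq start command, the `EXAMPLE_SETTING` variable, the untagged `solr` image, and the choice of which service joins which network are all assumptions made for illustration. The network alias is one way of giving the Solr container the `solr-hyrax` hostname described earlier; the real file may do this differently.

```yaml
# Illustrative sketch only; values marked "assumed" or "hypothetical" are not from our repo.
services:
  app-server:
    image: scholarspace-app      # name given to the locally built image
    build:
      context: .                 # build from the Dockerfile in the repository root
    ports:
      - "443:443"                # HOST-PORT:CONTAINER-PORT
    networks:                    # assumed: app-server joins both networks
      - hyrax
      - fedora
    env_file:
      - .env                     # pass in the complete contents of the .env file

  sidekiq:
    image: scholarspace-app      # same image as app-server
    command: bundle exec sidekiq # assumed start command; overrides the image's CMD
    networks:
      - hyrax
    environment:
      # Hypothetical variable, interpolated from the .env file next to this file.
      - EXAMPLE_SETTING=${EXAMPLE_SETTING}

  solr:
    image: solr                  # tag omitted here; pin a version in practice
    networks:
      hyrax:
        aliases:
          - solr-hyrax           # reachable from other containers as http://solr-hyrax:8983

# Networks are declared at the bottom of the file; empty entries use Docker's defaults.
networks:
  hyrax:
  fedora:
```

Note that the `solr` service publishes no ports: containers on the `hyrax` network can reach it by hostname, while nothing on the host can connect to it directly.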
The following is a non-exhaustive list of commands useful for interacting with ScholarSpace containers, images, and volumes.
| Task | Command | Notes |
|---|---|---|
| List running containers | `docker ps` | |
| List running & stopped containers | `docker ps -a` | |
| Show the logs from a container | `docker logs [CONTAINER-NAME]` | |
| Follow the logs from a container | `docker logs -f [CONTAINER-NAME]` | |
| Show only the last 100 lines of a container's logs | `docker logs --tail 100 [CONTAINER-NAME]` | |
| Search a container's logs | `docker logs [CONTAINER-NAME] 2>&1 \| grep "[SEARCH-STRING]"` | |
| Stop a container | `docker stop [CONTAINER-NAME]` | |
| Restart a container | `docker start [CONTAINER-NAME]` | |
| Delete a stopped container | `docker rm [CONTAINER-NAME]` | |
| Recreate containers after deleting | `docker compose up -d` | Recreates any deleted containers. |
| Open a Bash shell in a container (non-Hyrax) | `docker exec -it [CONTAINER-NAME] /bin/bash` | |
| Open a Bash shell in a Hyrax container | `docker exec -it --user scholarspace [CONTAINER-NAME] bash -l` | The slightly different syntax is due to the nature of the underlying image. |
| Run a command in a Hyrax container (e.g., a Rake task) | `docker exec -it --user scholarspace [CONTAINER-NAME] bash -lc "[COMMAND]"` | |
| Show all images | `docker image ls` | |
| Delete an image | `docker image rm [IMAGE-NAME]` | Deleting a container and then its image will force the image to be rebuilt/redownloaded. |
| Show all Docker volumes | `docker volume ls` | |
| Delete a Docker volume | `docker volume rm [VOLUME-NAME]` | Use carefully! This will irrevocably delete all volume contents, leading to permanent data loss. |