- Introduction
- Requirements
- Building
- Running
- Docker Images
- Design Considerations
- Design Constraints
- Issues / FAQ
This repository provides a number of Docker images that can be used to build an Islandora 8 site. On commit, these images are automatically pushed to Docker Hub via GitHub Actions. They are consumed by isle-dc and can also be used with other Docker orchestration tools such as Swarm or Kubernetes.
It is not meant as a starting point for new users, or for those unfamiliar with Docker or basic server administration.
If you are looking to use Islandora, please read the official documentation and use either isle-dc to deploy via Docker or the islandora-playbook to deploy via Ansible.
Building the Docker images using the provided Gradle build scripts requires:
That being said, the images themselves are compatible with older versions of Docker.
The build scripts rely on Gradle and should function equally well across platforms. The only difference is the script you call to interact with Gradle (the following assumes you are executing from the root directory of the project):
Linux or OSX:
./gradlew
Windows:
gradlew.bat
The remaining examples use the Linux or OSX invocation; if you are using Windows, substitute the Gradle script accordingly.
Gradle is a project/task-based build system. To query all the available tasks, use the following command:
./gradlew tasks --all
This should return something akin to:
> Task :tasks
------------------------------------------------------------
Tasks runnable from root project
------------------------------------------------------------
...
Islandora tasks
---------------
abuild:build - Creates Docker image.
activemq:build - Creates Docker image.
alpaca:build - Creates Docker image.
base:build - Creates Docker image.
...
In Gradle, each project maps onto a folder in the file system, and the path is delimited by : instead of / (Unix) or \ (Windows). The root project : can be omitted.
So if you want to run a particular task taskname that resides in the project folder project/subproject, you would specify it like so:
./gradlew :project:subproject:taskname
To get more verbose output from Gradle, use the --info argument like so:
./gradlew :PROJECT:TASK --info
To build all the Docker images in the correct order, use the following command:
./gradlew build
To build a specific image and its dependencies, for example islandora/tomcat, you can use the following:
./gradlew tomcat:build
It is often helpful to build continuously, wherein any change you make to any of the Dockerfiles or other project files automatically triggers the building of that image and any downstream dependencies. To do this, add the --continuous flag like so:
./gradlew build --continuous
When this is combined with the use of watchtower and restart: unless-stopped in a docker-compose.yml file, images will automatically be redeployed with the latest changes while you develop.
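For example, watchtower can be run alongside your compose stack so containers are restarted when their local images change. This is only a sketch; the containrrr/watchtower image name and interval flag come from the watchtower project and may differ in your setup:
# Run watchtower so it can restart containers when their images are rebuilt.
docker run -d \
  --name watchtower \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  containrrr/watchtower --interval 30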
There is no method for running the containers in isle-buildkit; instead, please refer to isle-dc.
The following docker images are provided:
- abuild
- activemq
- alpaca
- base
- blazegraph
- cantaloupe
- crayfish
- crayfits
- demo
- drupal
- fcrepo
- fits
- handle
- homarus
- houdini
- hypercube
- imagemagick
- java
- karaf
- mariadb
- matomo
- milliner
- nginx
- postgresql
- recast
- solr
- tomcat
Many are intermediate images used to build other images in the list, for example java. Please see the README of each image to find out what settings and ports are exposed and what functionality it provides.
All of the images built by this project are derived from the Alpine Docker Image, which is a Linux distribution built around musl libc and BusyBox. The image is only 5 MB in size and has access to a package repository. It has been chosen for its small size and the ease of generating custom packages (as is done in the imagemagick image).
The base image includes two tools essential to the functioning of all the images.
- Confd - Configuration Management
- S6 Overlay - Process Manager / Initialization system.
confd is used for all configuration management; it is how images are customized on startup and during runtime. For each Docker image there will be a folder rootfs/etc/confd that has the following layout:
./rootfs/etc/confd
├── conf.d
│ └── file.ext.toml
└── templates
└── file.ext.tmpl
The file.ext.toml and file.ext.tmpl work as a pair: the toml file defines where the template will be rendered to and who owns it, and the tmpl file is the template in question. Ideally these files should share the name of the file they are generating, minus the toml or tmpl suffix, to make their discovery easier.
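As a rough sketch, a pair for a hypothetical file.ext might look like the following; the destination path, keys, variable name, and default are purely illustrative:
# conf.d/file.ext.toml
[template]
src = "file.ext.tmpl"
dest = "/opt/example/file.ext"
mode = "0644"
keys = ["/example"]
# templates/file.ext.tmpl
setting = {{ getenv "EXAMPLE_SETTING" "default-value" }}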
Additionally, in the base image there is confd.toml, which sets defaults such as the log-level:
backend = "env"
confdir = "/etc/confd"
log-level = "error"
interval = 600
noop = false
confd is also the source of truth when it comes to configuration. We have established an order of precedence in which environment variables at runtime are defined:
- Confd backend (highest)
- Secrets kept in /run/secrets (except when using Kubernetes)
- Environment variables passed into the container
- Environment variables defined in the Dockerfile(s)
- Environment variables defined in the /etc/defaults directory (lowest; only used for multiline variables, such as JWT)
If a variable is not defined at the highest level, the next level applies, and so forth down the list.
N.B. /etc/defaults and the environment variables declared in the Dockerfile(s) used to create the image are required to define all environment variables used by scripts and confd templates. If not specified in either of those locations, an environment variable will not be available even if it is defined at a higher level, i.e. the confd backend.
The logic which enforces these rules is performed in 00-container-environment-00-init.sh.
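The following is only a simplified sketch of the idea, not the actual script; it shows how a single hypothetical variable could be resolved so that a Docker secret wins over a value baked into the image, which in turn wins over a default from /etc/defaults:
#!/usr/bin/with-contenv bash
# Sketch only: resolve EXAMPLE_SETTING according to the precedence above.
name="EXAMPLE_SETTING"
if [ -f "/run/secrets/${name}" ]; then
    # A secret takes precedence over the existing environment.
    export "${name}=$(cat "/run/secrets/${name}")"
elif [ -z "${!name:-}" ] && [ -f "/etc/defaults/${name}" ]; then
    # Fall back to /etc/defaults only when nothing else defined the variable.
    export "${name}=$(cat "/etc/defaults/${name}")"
fi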
N.B. Some containers derive environment variables dynamically from other environment variables. In these cases they are expected to provide an additional startup script, prefixed with 00-container-environment-01-*.sh, so that the variables are defined before confd is used to render templates.
By either using the command with-contenv or starting a script with #!/usr/bin/with-contenv bash, the environment will follow the order of precedence above. Additionally, within confd templates it is required to use the getenv function for fetching data.
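For example, a hypothetical startup script that needs the resolved environment might begin like this; the comment at the end shows the equivalent lookup in a confd template:
#!/usr/bin/with-contenv bash
# The variable below is resolved according to the order of precedence above.
echo "Fcrepo host is ${DRUPAL_DEFAULT_FCREPO_HOST}"
# In a confd template the equivalent lookup would be:
# {{ getenv "DRUPAL_DEFAULT_FCREPO_HOST" }}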
From S6 Overlay we only really take advantage of two features:
- Initialization scripts (found in rootfs/etc/cont-init.d)
- Service scripts (found in rootfs/etc/services.d)
Initialization scripts are run when the container is started, and they execute in alphabetical order. To control the execution order they are prefixed with numbers.
One initialization script, 01-confd-render-templates.sh, is shared by all the images. It does a first-pass render of the confd templates so subsequent scripts can run. The rest of the scripts do the minimal steps required to get the container into a ready state before the service scripts start.
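As an illustration, an image's initialization scripts might be laid out like this, where the second script is hypothetical and simply runs after the templates have been rendered:
./rootfs/etc/cont-init.d
├── 01-confd-render-templates.sh
└── 50-example-prepare-data-directory.sh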
The service scripts have the following structure:
./rootfs/etc/services.d
└── SERVICE_NAME
├── finish
└── run
The run script is responsible for starting the service in the foreground. The finish script can perform any cleanup necessary before stopping the service, but in general it is used to kill the container, like so:
s6-svscanctl -t /var/run/s6/services
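As a sketch, a run script for the nginx service could look like the following, assuming nginx has already been configured by the initialization scripts:
#!/usr/bin/with-contenv bash
# Start the service in the foreground so s6 can supervise it.
exec nginx -g 'daemon off;'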
There are only a few Service scripts:
- activemq
- confd
- fpm
- karaf
- mysqld
- nginx
- solr
- tomcat
Of these, only confd can be configured to run in every container; it periodically listens for changes in its configured backend (e.g. etcd or environment variables) and will re-render the templates upon any change.
In order to save space and reduce the amount of duplication across images, they are arranged in a hierarchy that roughly follows the tree below:
├── abuild
│ └── imagemagick
└── base
├── java
│ ├── activemq
│ ├── karaf
│ │ └── alpaca
│ ├── solr
│ └── tomcat
│ ├── blazegraph
│ ├── cantaloupe
│ ├── fcrepo
│ └── fits
├── mariadb
└── nginx
├── crayfish
│ ├── homarus
│ ├── houdini (consumes "imagemagick" as well during its build stage)
│ ├── hypercube
│ ├── milliner
│ └── recast
├── crayfits
├── drupal
│ └── demo
└── matomo
abuild and imagemagick stand outside of the hierarchy, as they are used only to build packages that are consumed by other images during their build stage.
To make it easier to reason about which files go where, each image follows the same filesystem layout for copying files into the image.
A folder called rootfs maps directly onto the Linux filesystem of the final image. So, for example, rootfs/etc/islandora/configs will be /etc/islandora/configs in the generated image.
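In a Dockerfile this typically amounts to a single copy of the whole tree; a minimal sketch:
# Copy the rootfs folder onto the root of the image's filesystem.
COPY rootfs /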
Gradle is used as the build system; it is set up such that it will automatically detect which folders should be considered projects and what dependencies exist between them. The only caveat is that projects cannot be nested, though that use case does not really apply here.
The dependencies are resolved by parsing the Dockerfile and looking for:
- FROM statements
- --mount=type=bind statements
- COPY --from statements
All of these are capable of referring to other images.
This means that to add a new Docker image to the project you do not need to modify the build scripts; simply add a new folder and place your Dockerfile inside of it. It will be discovered and built in the correct order relative to the other images, assuming you refer to the other images using the repository build argument.
For example:
ARG repository=local
ARG tag=latest
FROM ${repository}/base:${tag}
To be able to support a wide variety of backends for confd, as well as orchestration tools, all getenv calls must provide a default value, with the exception of keys that do not get used unless defined, like DRUPAL_SITE_{SITE}_NAME. This means that whatever the configuration backend, whether it be etcd, consul, or environment variables, containers can successfully start without any other container present. Additionally, it ensures that the order of precedence for configuration settings is respected.
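For instance, a template line might look like the following, where both the variable name and the default are illustrative:
{{ getenv "EXAMPLE_DATABASE_HOST" "localhost" }}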
This does not completely remove dependencies between containers; for example, when the demo starts it requires a running fcrepo to be able to ingest nodes created by islandora_default features. In these cases, an initialization script can block until another container is available or a timeout has been reached. For example:
local fcrepo_url=
# Indexing fails if port 80 is given explicitly.
if [[ "${DRUPAL_DEFAULT_FCREPO_PORT}" == "80" ]]; then
fcrepo_url="http://${DRUPAL_DEFAULT_FCREPO_HOST}/fcrepo/rest/"
else
fcrepo_url="http://${DRUPAL_DEFAULT_FCREPO_HOST}:${DRUPAL_DEFAULT_FCREPO_PORT}/fcrepo/rest/"
fi
#...
# Wait for Fcrepo to become available before continuing.
if timeout 300 wait-for-open-port.sh "${DRUPAL_DEFAULT_FCREPO_HOST}" "${DRUPAL_DEFAULT_FCREPO_PORT}" ; then
echo "Fcrepo Found"
else
echo "Could not connect to Fcrepo"
exit 1
fi
This allows containers to start up in any order, and to be orchestrated by any tool.
Question: I'm getting the following error when building:
failed to solve with frontend dockerfile.v0: failed to solve with frontend gateway.v0: runc did not terminate successfully: context canceled
Answer: If possible, upgrade Docker to the latest version and switch to using the overlay2 storage driver.
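If you are unsure which storage driver the Docker daemon is currently using, you can check with:
docker info --format '{{ .Driver }}'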