Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[minor] add Dockerfile healthchecks to some core services #370

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

joecorall
Copy link
Contributor

@joecorall joecorall commented Jan 14, 2025

Inspired by seeing cantaloupe and solr having mysteriously died (though rarely) in my local instance over the past year, added some healthchecks so docker compose can report on container health.

I tested this locally by setting the CMD's in this PR in my docker compose YML's healthcheck.

$ docker ps --format "{{.Names}}\t{{.Status}}"
lehigh-d10-fcrepo-prod-1        Up 22 hours (healthy)
lehigh-d10-activemq-prod-1      Up 22 hours (healthy)
lehigh-d10-solr-prod-1  Up 22 hours (healthy)
lehigh-d10-drupal-prod-1        Up 22 hours (healthy)
lehigh-d10-cantaloupe-prod-1    Up 22 hours (healthy)
lehigh-d10-mariadb-prod-1       Up 22 hours (healthy)
lehigh-d10-memcached-prod-1     Up 22 hours (healthy)
lehigh-d10-traefik-prod-1       Up 22 hours (healthy)

If we instead define the healthcheck in the dockerfile, docker compose will automatically pick up on the command and we won't need to define a healthcheck section in the YML. Docker compose will use the default values for interval/timeout/start_period/start_interval which could be overriden in the docker compose yml.

Minor changes

I see this as a new feature on the buildkit images, so marking this PR as a minor version bump.

The only change that might effect folks is this PR does enable the API on cantaloupe, since that project requires that API to be enabled to access its health check route. Though I did add some protection since this will turn on the API to automatically set the API secret if it's a blank string (and it will get rotated every time the container restarts) since most people probably are just using the defaults.

Out of scope

Locally I also run a bash script on a systemd timer to check the health on the containers and attempt to restart automatically. I think a script like that might belong better in islandora documentation or in site template, so leaving it out of this PR. I am turning that script into a go utility that can be run on the host or within a docker compose service if the docker socket is mounted in the container: https://github.com/lehigh-university-libraries/docker-autoheal

@joecorall joecorall marked this pull request as ready for review January 14, 2025 20:53
@joecorall
Copy link
Contributor Author

joecorall commented Jan 15, 2025

@seth-shaw-asu

Testing instructions

Create a new isle-site-template or isle-dc install using the ISLANDORA_TAG=healthchecks (instead of main). Let it run for a minute or two then run

docker ps --format "{{.Names}}\t{{.Status}}"

then ensure activemq, fcrepo, drupal, mariadb, solr, tomcat containers are marked as healthy

Though the tests that run here in CI are already doing basically this. SO maybe it's already tested 🤷

@nigelgbanks
Copy link
Contributor

This is great work! I've started to dig into this for a few hours today, and I'm working on a pull request to your branch, to address some issues I've seen, it's not ready yet. I just wanted to highlight it, though, so it doesn't go through before you have a chance to see my feedback.

@joecorall joecorall requested a review from nigelgbanks January 15, 2025 20:45
@seth-shaw-asu seth-shaw-asu removed their request for review January 15, 2025 21:18
@seth-shaw-asu
Copy link
Contributor

I'll defer to @nigelgbanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants