Stop pods downloading the DB on startup #15878

Open
stevejalim opened this issue Jan 15, 2025 · 3 comments

@stevejalim
Collaborator

At the moment, when Bedrock starts, we have to run the init container which will download the sqlite DB as well as the l10n files, even though only the l10n files are needed.

If we turn off downloading the DB, the deployment fails.

It looks like we need to remove the check for the sentinel file that shows the DB has been downloaded.

@stevejalim stevejalim self-assigned this Jan 15, 2025
@stevejalim
Collaborator Author

stevejalim commented Jan 15, 2025

So, we could "just" delete this line from run-prod.sh:

"data/last-run-download_database"

That would let run-prod.sh just work without needing a fresh sqlite DB (remember that there's a recent copy of the DB baked into the Docker image).

This would, however, mean that anything running sqlite as its DB could boot up with slightly stale data, which might cause us some surprises.

Things to check that might misbehave if the sqlite DB isn't totally brand new:

  • demo servers
  • www-sitemap-generator

Alternatively, we could make the inclusion of data/last-run-download_database in STARTUP_FILES conditional on whether postgres is in use.
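
A rough sketch of that conditional, written in Python purely to illustrate the logic (run-prod.sh is a shell script, so the real change would live there). The function name `startup_files` and its parameters are hypothetical. The other entries in STARTUP_FILES are not listed in this issue, so they are passed in rather than guessed:

```python
DB_SENTINEL = "data/last-run-download_database"


def startup_files(postgres_in_use, other_files=()):
    """Return the sentinel files that must exist before the app starts.

    Hypothetical helper: `other_files` stands in for whatever else
    STARTUP_FILES contains, which this issue does not spell out.
    """
    files = list(other_files)
    # Only require the DB-download sentinel when sqlite is the live DB.
    if not postgres_in_use:
        files.append(DB_SENTINEL)
    return files
```

With this shape, sqlite-backed deployments such as the demo servers would still wait for a fresh DB download, while postgres-backed pods would not.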

@stevejalim
Collaborator Author

Also, in the k8s deployment, we call cron.py (once for the initContainer, but also regularly for the data-sync pod). This, in turn, downloads the DB again, which is just a waste, so we should likely remove that behaviour too.

One flexible option (with an eye on demo servers and www-sitemap-generator) is to skip the download only if postgres is in use (so we can do something similar to here, where we check for sqlite being in use):

bedrock/bin/cron.py

Lines 160 to 172 in 25ff14b

    if not LOCAL_DB_UPDATE:

        @scheduled_job("interval", minutes=DB_UPDATE_MINUTES)
        def download_database():
            command = "python bin/run-db-download.py"
            if DB_DOWNLOAD_IGNORE_GIT:
                command += " --ignore-git"
            try:
                check_call(command, shell=True, timeout=TIMEOUT_SECS)
            except Exception as ex:
                logging.error(ex)
                raise
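
A minimal sketch of how that guard might be extended to skip the download when postgres is in use. It assumes a hypothetical helper `is_postgres_in_use()` that inspects a `DATABASE_URL` environment variable. Neither the helper nor the variable name comes from this issue, and Bedrock may detect its DB engine differently:

```python
import os


def is_postgres_in_use(env=None):
    """Return True if the (assumed) DATABASE_URL points at Postgres.

    Hypothetical helper: the variable name and URL schemes are
    assumptions, not taken from Bedrock's settings.
    """
    env = os.environ if env is None else env
    url = env.get("DATABASE_URL", "")
    return url.startswith(("postgres://", "postgresql://"))


def should_schedule_db_download(local_db_update, env=None):
    """Decide whether cron.py should register the download_database job.

    Keeps the existing `if not LOCAL_DB_UPDATE:` condition and adds the
    proposed postgres check on top of it.
    """
    return (not local_db_update) and (not is_postgres_in_use(env))
```

In cron.py the existing `if not LOCAL_DB_UPDATE:` line could then become something like `if should_schedule_db_download(LOCAL_DB_UPDATE):`, leaving the scheduled job itself untouched.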

@stevejalim
Collaborator Author

cc @pmac for thoughts
