Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[POP-7251] Add entrypoint for using env vars to init dbt profile #2

Conversation

MasterOdin
Copy link
Contributor

PR modifies our images such that we now support the following three environment variables:

  • DBT_PROFILES -> /.dbt/profiles.yml
  • AWS_CREDENTIALS -> /.dbt/aws_credentials
  • BQ_KEYFILE -> /.dbt/bq_keyfile.json

If the environment variable is not set, then we'll fallback to seeing if we can find the files as mounted (e.g. /home/dbt/.dbt/profiles.yml). We use the custom directory (with appropriate env vars set so dbt/aws look there) to host these files to allow easy migration from path based mounting to env vars without worrying about overwriting anything if for a period of time both are used.

@linear
Copy link

linear bot commented Oct 2, 2023

POP-7251 Add entrypoint to handle env vars setup dbt container profiles.yml

What

We want to allow passing an environment variable to our dbt containers, and then the container's entrypoint will write those to the requisite files. This will allow us to more easily dynamically create/run the containers across different targets, without having to juggle profiles.yml files, ensuring that we don't overwrite a file in use by another container, cleaning up the files after the container exits, etc.

docker build \
--build-arg REQUIREMENTS_FILE="${requirements_file}" \
--tag "ghcr.io/popsql/${image_name}:${version}" \
--build-arg DBT_VERSION="${version}" \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding this was forgotten as part of f83c4f0, adding it here as I was using this script to test the entrypoint.

"${BASE_DIR}"

echo "Successfully built image ${tag}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the echo to just make copy/paste of the tag name generated from this script easier


if [ -n "${BQ_KEYFILE}" ]; then
echo "${BQ_KEYFILE}" > /.dbt/bq_keyfile.json
fi
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No fallback for BQ as the reference to it is in profiles.yml and so up to the user to make sure the path is correct to a file that exists within the container, be it via mounting or this env var.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably fine. The ENVs should be passed together (either both or neither). If the ENVs are passed, we can assume they agree about the location. Otherwise, we can assume the mounted files agree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, simple enough to only prefer one or the other, not a combination. Done in 7fd3e54.

@@ -1,5 +1,8 @@
FROM python:3.8-slim-bullseye

ENV DBT_PROFILES_DIR=/.dbt
ENV AWS_SHARED_CREDENTIALS_FILE=/.dbt/aws_credentials
Copy link
Contributor Author

@MasterOdin MasterOdin Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This env var is documented as being supported by both aws cli as well as boto3 (what underlies the athena adapter).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is supported by the Redshift adapter, too, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Believe so. Looks like the adapter relies on boto3 as well as the redshift_connector relies on it.

Copy link

@KevinKibler KevinKibler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally with postgres, bigquery+oauth, and redshift+iam

@MasterOdin MasterOdin merged commit e9d5e2d into main Oct 3, 2023
2 checks passed
@MasterOdin MasterOdin deleted the matt/pop-7251-add-entrypoint-to-handle-env-vars-setup-dbt-container branch October 3, 2023 16:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants