Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip checkout when there are parse errors #275

Conversation

triarius
Copy link
Contributor

@triarius triarius commented Mar 14, 2024

When the controller attempts to parse the Kubernetes plugin config and encounters an error, it will run a "failure job" that echos the error message as the command so that it is surfaced in the job logs. However, for private repositories that need secrets for checkout, this will fail because the container is not given access to those secrets.

Rather than give the container access to the secrets, in this PR, we skip the checkout step entirely when there are such errors: it is not needed and can only slow down the job or cause some other error.

The error message container is run with the default agent image, which is public, so it should not need any more resources like imagePullSecrets to run. I guess this means that it won't necessarily work on k8s clusters that are air-gapped, but this image is needed for other containers like the agent start container and the init container, so no other jobs would work on such clusters anyway.

In any case, I don't think transferring things like gitEnvFrom and imagePullSecret are going to work if there was an error in the specification of those fields. We want this failure job to be as bare bones as possible to reduce the chance of errors in the Pipeline YAML from preventing it from running.

Fixes: #233, #273

Note to reviews

I've implemented this as a skipCheckout flag on the Build method. I'm aware that such practices don't align with our coding standards. But, I think the entire Build method needs to refactored, and I don't want to do that yet. There are a few PRs in flight (#262, #248, #258) that modify the Build method, so a significant refactor would introduce too many merge conflicts.

After they are merged, we should refactor this method into smaller, composable pieces so that when there is a parse error, the checkout container is omitted and an error message container is substituted for the job command containers.

@triarius triarius requested review from a team March 14, 2024 23:02
@@ -13,6 +13,5 @@ export IMAGE

gotestsum --junitfile "junit-${BUILDKITE_JOB_ID}.xml" -- \
-count=1 \
-failfast \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to see all the failures at once.

@triarius triarius force-pushed the pdp-2023-underlying-parse-error-does-not-print-if-using-gitenvfrom branch from e17ed89 to 44c8310 Compare March 14, 2024 23:48
Previously, the agent image configured in the controller's config was
used for the BuildFailureJob. But customers could have configured this
to be a private image. We don't want to copy the imagePullSecrets in as
there could be an error in their formatting, and the failure job is
designed to bubble up such errors to the job logs. So we use the default
agent image which is guaranteed to be public.
Copy link
Contributor

@moskyb moskyb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤙 send it

@triarius triarius merged commit 97640fd into main Mar 20, 2024
1 check passed
@triarius triarius deleted the pdp-2023-underlying-parse-error-does-not-print-if-using-gitenvfrom branch March 20, 2024 00:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Underlying parse error does not print if using gitEnvFrom for Git credentials
2 participants