-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(ci): revert *_PKG_FILES
on windows argoexec
build
#13568
Conversation
Signed-off-by: Alan Clucas <[email protected]>
38c8c01
to
e33f811
Compare
argoexec
release build
Hmm my read of the error message was that To be fair, the symlinking part of the build is potentially one of the weirdest pieces and can be problematic in other ways (e.g. parallelized operations per #12583 (comment)). I've been meaning to sit down with it and see if it can be refactored out. |
I chose this way of doing it as expedient rather than perhaps the best choice. Potentially windows could be detected and mklink could be invoked. Docker under windows in a VM doesn't work for me (nested VMs) right now to play with this in a sane way. |
argoexec
release build*_PKG_FILES
on windows argoexec
build
CLI_PKG_FILES := $(shell go list -f '{{ join .Deps "\n" }}' ./cmd/argo/ | grep 'argoproj/argo-workflows/v3/' | xargs go list -f '{{ range $$file := .GoFiles }}{{ print $$.ImportPath "/" $$file "\n" }}{{ end }}' | cut -c 39-) | ||
CONTROLLER_PKG_FILES := $(shell go list -f '{{ join .Deps "\n" }}' ./cmd/workflow-controller/ | grep 'argoproj/argo-workflows/v3/' | xargs go list -f '{{ range $$file := .GoFiles }}{{ print $$.ImportPath "/" $$file "\n" }}{{ end }}' | cut -c 39-) | ||
HACK_PKG_FILES_AS_PKGS ?= false | ||
ifeq ($(HACK_PKG_FILES_AS_PKGS),false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make more sense to detect Windows instead of having the Windows Dockerfile use an env var?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can detect Windows and do this the long way around we could attempt a mklink there too.
If I get access to some kind of windows that let's me play I might have a go, but I'd rather right now it was awkward and people who can fix it found this and tried to fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can definitely detect Windows, in the same way you can detect a Linux OS, e.g. with uname
or $OSTYPE
. There are some variations on that (i.e. prior to WSL etc when Cygwin and MinGW were more used), but technically the main one we need to handle is the GH + Docker one, the rest a Windows contributor could figure out
We do have contributors on Windows also, including Tianchu, per #12603. That issue also involves a symlink file in the codebase (different, but also symlink related). But yes I agree that reverting to the old behavior isn't that harmful and is expedient |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving to fix releases in the interim (as it is Friday and after hours in Alan's time). Can make more changes in further PRs
Thanks for rigorously testing this in your fork Alan!
I was thinking about this too, this would require us to build the images for multiple architectures in CI, which will take more time (substantially so, the cross-compilation is quite slow). We could do that if we make the other steps not wait on it I suppose. There's also other steps of the release process that aren't tested in CI, such as the manifest generation and signing, which did cause some problems before, e.g. #13424 (comment) / #13155 |
This one is an easy parallel step, which will only slightly damage the number of server hamsters each build uses, so shouldn't impact too much. |
The E2E tests depend on the image build, so the longest build of those will cause the longest delay. I pulled the numbers in #12449 (comment) and it's an order of magnitude longer to run all the cross-compilation for the image (~2min -> ~20min in best case. also the CLI release has even more cross-compilation that takes long too, but seems to be faster than an image build). That's why I said that if we wanted to do this, it would have to be a separate, non-dependent step. But also there's a limit to the amount of parallelism GH allows per account/org, so the more we add, the more likely we are to hit that limit and cause things to become serial as a result. We do seem to already hit this occasionally if we have many simultaneous PRs or if CD is doing a release (the GH jobs will be pending for a while before starting). Also not sure if you missed it, but I did leave one comment on this PR specifically |
Fixes #13562
Motivation
The windows build in the release.yaml CI was broken by #13145, we need releases to work so this hack fixes it.
(CI should probably prove this is buildable rather than waiting for release to run.)
Modifications
In #13145 the dependencies for the binaries were fixed to be technically correct - as in the binaries depended upon the .go files that they were built from.
This had the side effect of causing
pkg/apis/workflow/v1alpha1/openapi_generated.go
to get rebuilt when buildingdist/argoexec
. This is technically correct behaviour.This fails when building on windows because windows has stupid limitations. This means that the Makefile step which trys to create a symlink fails with
ln: failed to create symbolic link 'v3' -> '.': File name too long
.Attempting to enable
LongPathsEnabled
(What year is it?) doesn't help either inside or outside the container run.So this PR runs the windows build with a hack reverting the dependencies when building the windows
dist/argoexec
only via a Makefile variableHACK_PKG_FILES_AS_PKGS
.I consider this "fine" as we don't directly support development on Windows, and the reverted behaviour there if anyone really wants to use it isn't that terrible (as in people lived with it for years before #13145 fixed it)
Verification
Verified here