Interface and static route cleanup on transient launch failure #150
Comments
@flaminidavid I wonder, you mentioned that the image is not available for some time, but the output shows the registry is being used, and not a local daemon store. Don't you have images pushed in your private registry ahead of time before launching the lab? @carlmontanari I wonder if regardless of the above comment all we need to do is call |
then the launcher should be crashing. it seems you are exec'ing onto the launcher to test stuff, which is cool, but not how this works "irl". in your case it seems maybe you run stuff manually and then re-launch after the image is ready or something (lots of assumptions here obv, shout if they are incorrect!). I think that none of this is a problem in the normal flow, as the launcher would always crash after any failure and then we'd have a fresh container to try again. Guess we can add the -c flag if it matters, but I don't think it "should". |
Sorry folks, I was out.
I agree, I should. I'm probably abusing the launcher's ability to retry here.
No, it's just a workaround for
Looks like the launcher container itself doesn't crash, only the router container inside does. Those static routes and bridges live in the launcher and stick around after the router container failure. |
can you share logs of that? if clab exits non-zero it should crash; if the container starts then goes away, it should crash. if you were on the pod (like made the entrypoint sleep or something and exec'd on) then it would just exit but not crash, since then clabernetes is not the entrypoint. tl;dr yeah, if it doesn't crash it should, so logs should help pinpoint where, but it really should be crashing properly hah. I hope :D |
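To illustrate the "non-zero exit should crash the container" expectation described above, here is a minimal sketch in Python (not the actual clabernetes launcher code). The function name `run_and_propagate` is hypothetical; the only idea shown is that an entrypoint process which returns its child's exit code unchanged lets the container runtime see the failure and restart with a fresh container.

```python
import subprocess
import sys


def run_and_propagate(cmd):
    """Run cmd and return its exit code unchanged.

    Hypothetical helper: a real launcher would do much more. The point is
    only that a failure of the child is not swallowed by the parent.
    """
    result = subprocess.run(cmd)
    return result.returncode


# Stand-in for a failing child process (NOT a real clab invocation).
failing_child = [sys.executable, "-c", "import sys; sys.exit(3)"]
exit_code = run_and_propagate(failing_child)

# An entrypoint would end with sys.exit(exit_code), so the container
# terminates with the same non-zero status as the child.
print("child exit code:", exit_code)
```

If the process is instead started from an interactive shell inside the pod, the shell (or a sleeping entrypoint) remains the container's main process, so the same non-zero exit would not bring the container down.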
unsure if it will be the "fix" but am adding the -c in #159 just in case. will prolly cut 0.1.2 today so then maybe can test that and let us know if it sorts it! |
Thanks! Upgraded today to 0.1.3, it did not fix it. I'll test passing Docker config in a bit, which should be enough for my use case. I checked that the launcher is using 0.1.3:
|
@flaminidavid what does |
|
In other news, overriding Docker config via |
damn haha ok cool thanks a bunch for updating us all. will prolly futz around with this this weekend. I think I accidentally encountered the same issue (recreating it may be another story!) so can for sure see if we can fix... will keep ya posted! |
The cleanup issue mentioned in #143 is actually impacting external connectivity. For example, if the launcher fails to bring up the router container because the container image is not (yet) available, it leaves static routes and bridges behind. As a result, the kernel resolves traffic for the container to the wrong (stale) bridge interface.
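One way to picture the failure mode is the sketch below. It assumes the leftover entry is a more specific static route pointing at a bridge from the failed attempt; the issue does not confirm that exact mechanism. The prefixes, the device names `br-old` and `br-new`, and both helper functions are hypothetical and only model longest-prefix route selection; they do not read the real kernel routing table.

```python
import ipaddress


def pick_route(routes, destination):
    """Return the (prefix, device) chosen by longest-prefix match, or None."""
    dest = ipaddress.ip_address(destination)
    candidates = []
    for prefix, device in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            candidates.append((network, device))
    if not candidates:
        return None
    network, device = max(candidates, key=lambda c: c[0].prefixlen)
    return str(network), device


def stale_routes(routes, live_devices):
    """Return routes whose device is not among the currently live devices."""
    return [(prefix, device) for prefix, device in routes if device not in live_devices]


# Hypothetical table after a failed launch followed by a successful retry.
routes = [
    ("10.0.0.0/24", "br-new"),  # route via the bridge of the working attempt
    ("10.0.0.2/32", "br-old"),  # static route left behind by the failed attempt
]

print(pick_route(routes, "10.0.0.2"))    # the more specific leftover route wins
print(stale_routes(routes, {"br-new"}))  # candidates for cleanup
```

Under these assumptions, traffic to the container follows the leftover route until it is removed, which is consistent with the broken external connectivity reported here.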