Graceful Failure #70

Open
Maxusmusti opened this issue May 22, 2023 · 3 comments

@Maxusmusti
Collaborator

When a cloud provider fails to provision a machine, InstaScale should handle the failure gracefully: either by removing the failed machine and retrying, or by scaling down the whole request if it remains incomplete (or both, with a configurable timeout).

@Bobbins228 Bobbins228 self-assigned this Jul 10, 2023
@Bobbins228 Bobbins228 moved this from In Progress to Todo in Project CodeFlare Sprint Board Jul 13, 2023
@Bobbins228 Bobbins228 moved this from Todo to In Progress in Project CodeFlare Sprint Board Jul 17, 2023
@Bobbins228
Contributor

@Maxusmusti I have been trying to get a machine into the failed phase, but I have not found a way to do it. Even trying to scale up machines that are not available to our org didn't work.

Do you have any idea how to trigger this scenario? Also, how often would a failure like this occur?

@dimakis
Contributor

dimakis commented Nov 13, 2023

@Bobbins228 did you ever figure this out?

@Bobbins228
Contributor

I thought I had already commented here. If I remember correctly, you have to create a MachineSet template with an incorrect subnet attached to it, which causes a Node to enter the failed phase when machines are scaled up. That way you can replicate this behaviour.
