Feature: Allow node downtime during the simulation #110
Labels
Robustness
Changes that improve the robustness of running the simulator
simulationerrors
Handling of errors that happen within the simulation.
This issue is similar to #32.
What do we do when an individual node that we have execution permissions on goes down?
Right now, the simplest approach is just to kill the simulation.
However, there are likely some simulation scenarios where it's useful to be able to take nodes offline for a while and bring them back online. Likewise, it can be useful to continue to run the simulation with a subset of the nodes that are working (for example, on a signet where people shut their laptops and kill their nodes).
Rather than dying, we can add a `--keep-alive` option to the simulation which would allow us to stay up for as long as any activity is successfully executing and die when we no longer have any actions that are working.
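
As a rough illustration of the proposed semantics, here is a minimal sketch in Python. Everything in it (`Activity`, `NodeDown`, `run_simulation`, the round-based loop) is a hypothetical stand-in rather than the simulator's actual code or API. It only shows the intended difference between the two modes: without keep-alive the first node failure ends the run, while with keep-alive the run continues until a whole round passes with no successful activity.

```python
from dataclasses import dataclass
from typing import Callable, List


class NodeDown(Exception):
    """Raised by a (hypothetical) activity when its node is unreachable."""


@dataclass
class Activity:
    name: str
    run: Callable[[], None]  # assumed to raise NodeDown if the node is offline


def run_simulation(activities: List[Activity], rounds: int, keep_alive: bool) -> int:
    """Run up to `rounds` rounds and return how many rounds completed."""
    completed = 0
    for _ in range(rounds):
        successes = 0
        for activity in activities:
            try:
                activity.run()
                successes += 1
            except NodeDown:
                if not keep_alive:
                    # Behaviour described as the current default:
                    # any node failure kills the simulation.
                    return completed
                # Keep-alive: skip this activity for now and retry it
                # next round, in case the node comes back online.
        if successes == 0:
            # Nothing is working any more, so shut down.
            return completed
        completed += 1
    return completed
```

In this sketch a node that drops out for one round and then returns does not interrupt a keep-alive run, which matches the "shut their laptops" signet scenario above. Whether failed activities should be retried every round or backed off is left open here.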