Fail a run due to step command failure #22

samrose · 2021-12-16T18:38:08Z

A feature we still need to account for is responding to the stdout, stderr, and exit status returned by Rambo. If one pipeline depends on another and the first has failed with status 1, the run of pipelines should stop.

Example:

    Rambo.run("mix", "compile", log: &IO.inspect/1)
    {:ok, %Rambo{err: "", out: "", status: 0}}
    iex(5)> Rambo.run("mix", "compil0", log: &IO.inspect/1)
    {:stderr, "** (Mix) The task \"compil0\" could not be found. Did you mean \"co"}
    {:stderr, "mpile\"?\n"}
    {:error,
     %Rambo{
       err: "** (Mix) The task \"compil0\" could not be found. Did you mean \"compile\"?\n",
       out: "",
       status: 1
     }}

Right now, if the second command is put into a test ei.json as a step in a pipeline and it fails with status 1, ex_integrate still proceeds with the rest of the run.

We should handle all possible OS or command errors and feedback from Rambo in one place (the running of steps), and we could set a different error if ex_integrate itself errors. But any Rambo-originated error should simply stop the build.
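The halting behaviour requested above could look roughly like the following. This is a minimal sketch, not ex_integrate's actual code: the module name, the step map shape (`name`, `command`, `args`), and the injected `run_fun` are all assumptions made for illustration. `run_fun` stands in for a two-argument wrapper around `Rambo.run/3` and is expected to return `{:ok, result}` or `{:error, result}`.

```elixir
defmodule HaltOnFailureSketch do
  # Hypothetical step shape: %{name: String.t(), command: String.t(), args: [String.t()]}.
  # `run_fun` is an assumed stand-in for a wrapper around Rambo.run/3.
  def run_steps(steps, run_fun) do
    steps
    |> Enum.reduce_while({:ok, []}, fn step, {:ok, done} ->
      case run_fun.(step.command, step.args) do
        # Success: record the result and continue with the next step.
        {:ok, result} -> {:cont, {:ok, [{step.name, result} | done]}}
        # Failure: stop immediately; later steps are never started.
        {:error, result} -> {:halt, {:error, step.name, result, done}}
      end
    end)
    |> case do
      {:ok, done} -> {:ok, Enum.reverse(done)}
      {:error, name, result, done} -> {:error, {name, result}, Enum.reverse(done)}
    end
  end
end
```

Injecting the runner keeps the sketch testable without spawning real OS processes; in a real run the function passed in would call Rambo.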


garrettmichaelgeorge · 2021-12-21T14:25:30Z

@samrose yes agreed. Currently, a full run using an ei.json will not halt on Rambo failure – we need to implement this feature.

FWIW what is possible today is:

StepRunner catches Rambo errors and returns an {:error, failed_step} tuple.

ex_integrate/lib/ex_integrate/step_runner.ex

Lines 9 to 12 in bb202a0

    case Rambo.run(command_path, step.args, log: log) do
      {:ok, results} -> {:ok, save_results(step, results)}
      {:error, results} -> {:error, save_results(step, results)}
    end

PipelineRunner matches on the error response from StepRunner and stops the running of steps immediately upon step failure.

ex_integrate/lib/ex_integrate/pipeline_runner.ex

Lines 63 to 73 in bb202a0

    @impl GenServer
    def handle_info({_ref, {:error, step}}, {state, config}) do
      Logger.info("Step errored. #{inspect(step)}")
      {:stop, :step_failure, state}
    end

    @impl GenServer
    def handle_info({:DOWN, ref, _, _, reason}, {state, config}) do
      Logger.info("Step task #{inspect(ref)} terminated unexpectedly. Reason: #{inspect(reason)}")
      {:stop, :step_failure, state}
    end

So I believe what is left is the OTP implementation to manage the full run, prevent a failed pipeline's dependent pipelines from running, and log the error or else respond to it at the highest level, e.g. for end user feedback.
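One way to work out which pipelines must not run after a failure is to walk the dependency graph from the failed pipeline. The sketch below is only an illustration under assumptions: the module name and the `depends_on` field are hypothetical, and the actual ei.json schema may express dependencies differently.

```elixir
defmodule DependentSkipSketch do
  # Returns the sorted names of every pipeline that directly or transitively
  # depends on `failed`. Pipelines are assumed to be maps with a `:name` and
  # a `:depends_on` list of names (hypothetical field names).
  def blocked_by(pipelines, failed) do
    collect(pipelines, [failed], MapSet.new())
    |> MapSet.to_list()
    |> Enum.sort()
  end

  # Nothing left to visit: the accumulated set is the answer.
  defp collect(_pipelines, [], blocked), do: blocked

  defp collect(pipelines, [current | rest], blocked) do
    # Pipelines that list `current` as a dependency and are not yet recorded.
    direct =
      for %{name: name, depends_on: deps} <- pipelines,
          current in deps,
          not MapSet.member?(blocked, name),
          do: name

    collect(pipelines, rest ++ direct, Enum.into(direct, blocked))
  end
end
```

A run manager could call `blocked_by/2` when a pipeline stops with `:step_failure` and mark the returned pipelines as skipped rather than starting them.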

samrose · 2021-12-21T14:41:53Z

@garrettmichaelgeorge good assessment, thank you. Where you wrote:

> else respond to it at the highest level, e.g. for end user feedback.

I recommend we put only that last piece on the back burner and return to it later, because I believe the rest is a complete basic system from the OTP step runner perspective.

garrettmichaelgeorge · 2021-12-30T15:37:31Z

@samrose yes that makes sense – the end user part probably belongs in the interface.

So I guess we have to define the boundary – what, exactly, should ExIntegrate do at the highest level if a run is unsuccessful?

I can think of two things: signal, somehow, that the run was unsuccessful, and provide a way to obtain the results of the run, including the full logs of each command output. A hypothetical interface, e.g. ExIntegrateWeb, could then detect the result of the run and notify GitHub or some other service, or else display the results on its own web interface.

As for signaling the completion of the run, I imagine something like exiting with a nonzero exit status (communicating outside of the BEAM), or responding with an error tuple (within the BEAM). Or would other applications (e.g. ExIntegrateWeb) subscribe to ExIntegrate and receive notifications upon completion?
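To make the first two options concrete, here is a rough sketch of how a run summary could serve both as an error tuple inside the BEAM and as an exit status outside it. Everything here is an assumption for discussion: the module name, the `{name, {:ok | :error, result}}` result shape, and the `:failed_pipelines` tag are not part of ExIntegrate today.

```elixir
defmodule RunSignalSketch do
  # Inside the BEAM: collapse per-pipeline results into one tagged tuple.
  # `results` is assumed to be a list of {pipeline_name, {:ok | :error, result}}.
  def summarize(results) do
    failed = for {name, {:error, _}} <- results, do: name

    if failed == [] do
      {:ok, results}
    else
      {:error, {:failed_pipelines, failed}, results}
    end
  end

  # Outside the BEAM: map the summary to a process exit status.
  def exit_status({:ok, _results}), do: 0
  def exit_status({:error, _reason, _results}), do: 1
end
```

A command-line entry point could then finish with `System.halt(RunSignalSketch.exit_status(summary))`, while an application such as ExIntegrateWeb could match on the tuple directly. The subscription option would need a separate notification mechanism and is not covered here.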

As for sharing the run results, I suppose this connects to our discussion about persistence. I recall we have considered at different times both a graph DB (i.e. rocksdb) that could store the Elixir data structure, as well as object storage for the logs. Would we use both? Or one or the other? How would an interface application access the results?

garrettmichaelgeorge · 2021-12-30T17:02:45Z

> As for sharing the run results, I suppose this connects to our discussion about persistence. I recall we have considered at different times both a graph DB (i.e. rocksdb) that could store the Elixir data structure, as well as object storage for the logs. Would we use both? Or one or the other? How would an interface application access the results?

Related to #30.

samrose added the enhancement New feature or request label Dec 16, 2021

samrose assigned samrose and garrettmichaelgeorge Dec 16, 2021

samrose mentioned this issue Dec 16, 2021

(Outdated) Feature: MVP requirements #11

Closed

samrose added this to the MVP features are complete milestone Dec 16, 2021

garrettmichaelgeorge mentioned this issue Dec 21, 2021

Implement concurrent runs #25

Closed

samrose added this to https://github.com/ex_integrate Dec 30, 2021

samrose moved this to Todo in https://github.com/ex_integrate Dec 30, 2021

garrettmichaelgeorge changed the title ~~Stopping run based on the error and stdout/stderr state of a step~~ Fail a run based on the error and stdout/stderr state of a step Jan 29, 2022

garrettmichaelgeorge changed the title ~~Fail a run based on the error and stdout/stderr state of a step~~ Fail a run due to step command failure Jan 29, 2022

garrettmichaelgeorge moved this from Todo to In Progress in https://github.com/ex_integrate Jan 29, 2022
