Health Report Probe Docs: Up #16413

Open
yaauie opened this issue Sep 3, 2024 · 0 comments

As a part of the #16056 Health Report API, we will be introducing a per-pipeline probe called "pipeline/up" that is capable of diagnosing a pipeline based on its run-state.

Each diagnosis will need to provide a help_url link to ${major}.${minor}-specific documentation for that diagnosis. The following is meant to capture a high-level starting point for each of the probe's possible diagnoses.

running:
  status: green
loading:
  status: yellow
  diagnosis:
    cause: the pipeline is loading
    action: wait for the pipeline to finish starting
    help_url: "https://www.elastic.co/guide/en/logstash/${major}.${minor}/health_report/pipeline-up.html#loading"
    help_text: >
      Before a pipeline can consume and process events, its definition is loaded and compiled into an executable form.
      Depending on the specific shape of the pipeline, it may also execute pre-flight checks to ensure that the events it consumes will have a safe path to their destination.
      Pipelines typically remain in the "loading" state for only a few seconds at a time, so if your pipeline is taking significantly longer to load, you may need to dig into the pipeline's logs to determine if there is something wrong.
completed:
  status: yellow
  diagnosis:
    cause: the pipeline has completed normally
    action: if you expect the pipeline to run continuously, reconfigure its inputs and reload the pipeline
    help_url: "https://www.elastic.co/guide/en/logstash/${major}.${minor}/health_report/pipeline-up.html#completed"
    help_text: >
      While many pipelines are designed to run indefinitely, some input plugins can be configured to run just once and will close themselves when their work is complete.
      When _all_ of a pipeline's inputs have been closed, its workers finish processing events and exit.
      A pipeline in this state has completed its work normally, has no more work to do, and is no longer running.
errored:
  status: red
  diagnosis:
    cause: the pipeline has terminated due to an error
    action: inspect logs to determine the cause of the pipeline crash; if reloading is enabled, change the pipeline's definition to trigger a reload.
    help_url: "https://www.elastic.co/guide/en/logstash/${major}.${minor}/health_report/pipeline-up.html#errored"
    help_text: >
      The pipeline has crashed, and is no longer processing events.
      This is an error condition that needs to be manually resolved.
      Check the logs to determine the cause of the crash.
      If pipeline reloading is enabled, change the pipeline's definition to trigger a reload.
unknown:
  status: unknown
  diagnosis:
    cause: the pipeline's run-state is unknown, likely because it was recently deleted.
    action: if you expect the pipeline to be running, ensure that it still exists in the pipeline config source (e.g., `config/pipelines.yml`, Kibana Central Management)
    help_url: "https://www.elastic.co/guide/en/logstash/${major}.${minor}/health_report/pipeline-up.html#unknown"
    help_text: >
      In Logstash deployments that rely on pipeline reloads, removing a pipeline from the config source causes it to be stopped and subsequently deleted.
      There is a short window during the deletion of a pipeline in which it can continue to show up in the health report API, even though its state can no longer be determined.
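
To make the shape of these diagnoses concrete, here is a minimal Python sketch of how a run-state could be mapped to a status and a version-specific help_url. It is illustrative only and is not the Logstash implementation: the names `HELP_URL`, `STATUS_BY_STATE`, and `diagnose` are hypothetical, and treating an unrecognized run-state as "unknown" is an assumption not stated above.

```python
from string import Template

# URL pattern copied from the help_url entries above; ${anchor} is a
# hypothetical placeholder for the per-diagnosis fragment.
HELP_URL = Template(
    "https://www.elastic.co/guide/en/logstash/${major}.${minor}"
    "/health_report/pipeline-up.html#${anchor}"
)

# Run-state -> health status, as listed in the diagnoses above.
STATUS_BY_STATE = {
    "running": "green",
    "loading": "yellow",
    "completed": "yellow",
    "errored": "red",
    "unknown": "unknown",
}


def diagnose(state: str, major: int, minor: int) -> dict:
    """Return the status for a run-state, plus a help_url when a diagnosis applies."""
    # Assumption: any run-state not listed above is reported as "unknown".
    anchor = state if state in STATUS_BY_STATE else "unknown"
    report = {"status": STATUS_BY_STATE[anchor]}
    # A running (green) pipeline carries no diagnosis, so no help_url is attached.
    if report["status"] != "green":
        report["help_url"] = HELP_URL.substitute(
            major=major, minor=minor, anchor=anchor
        )
    return report
```

For example, `diagnose("errored", 8, 16)` would yield a red status with a help_url pointing at the `#errored` anchor of the 8.16 guide, while `diagnose("running", 8, 16)` yields only the green status.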