
Race condition between leader-elected and opensearch-client-created events. #473

Open
phvalguima opened this issue Oct 11, 2024 · 1 comment · May be fixed by #474
Labels
bug Something isn't working

Comments

phvalguima commented Oct 11, 2024

The OpenSearch deployment gets stuck with the following error message:

opensearch                         blocked      3  opensearch                2/edge         171  no       failed to create datahub_index index - deferring index-requested event...

That blocked status is only set when the index already exists, as the code shows.

While investigating data-hub's CI failure (https://github.com/canonical/datahub-k8s-operator/actions/runs/11261394134/job/31314738969?pr=5#step:13:1786), I noticed that the following sequence of events may be happening:

  1. unit 1 is the leader, executes the index-requested event and creates the index
  2. leadership changes -> unit 2
  3. unit 2 executes the index-requested event and notices the index already exists

Now, the cluster will be stuck on "failed to create {index-name} index".

The flow above cannot be confirmed as the real cause of this CI failure. Nonetheless, it is possible, so we should guard against it.
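
The suspected sequence can be reproduced in isolation with a small Python sketch. Everything in it is hypothetical and for illustration only: `FakeCluster`, `IndexAlreadyExistsError`, `strict_handler` and `idempotent_handler` are made-up names, not the charm's actual code or the change proposed in #474. It only shows why a second leader gets blocked when "index already exists" is treated as a failure, and how treating it as already satisfied avoids that.

```python
class IndexAlreadyExistsError(Exception):
    """Hypothetical stand-in for the 'index already exists' failure."""


class FakeCluster:
    """In-memory stand-in for the OpenSearch cluster shared by all units."""

    def __init__(self):
        self.indices = set()

    def create_index(self, name):
        if name in self.indices:
            raise IndexAlreadyExistsError(name)
        self.indices.add(name)


def strict_handler(cluster, index):
    """Mimics the reported behaviour: an existing index blocks the unit."""
    try:
        cluster.create_index(index)
    except IndexAlreadyExistsError:
        return f"blocked: failed to create {index} index"
    return "active"


def idempotent_handler(cluster, index):
    """Treats an index that already exists as a request already satisfied."""
    try:
        cluster.create_index(index)
    except IndexAlreadyExistsError:
        pass  # a previous leader created it; nothing left to do
    return "active"


cluster = FakeCluster()
unit1 = strict_handler(cluster, "datahub_index")  # step 1: unit 1 is leader
unit2 = strict_handler(cluster, "datahub_index")  # steps 2-3: unit 2 took over
unit2_fixed = idempotent_handler(cluster, "datahub_index")
print(unit1, "|", unit2, "|", unit2_fixed)
```

With the strict handler, unit 2 ends up blocked even though the index it was asked for is present; the idempotent variant reports success in the same situation. Whether the real handler can safely assume an existing index was created for the same request is a separate question this sketch does not answer.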

phvalguima added the bug label Oct 11, 2024

Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/DPE-5657.

This message was autogenerated
